Helsingin yliopisto Tietojenkäsittelytieteen laitos
 

Tietojenkäsittelytieteen laitos

Tietoa laitoksesta:

 

Guest Lecture Series: Calendar of Purely Scientific Events at the Department

December 19th 12.00, Unioninkatu 34, University Main Building, Auditorium XII
Public defence of a doctoral thesis: "Provision of Quality of Service in IP-based Mobile Access Networks"
Ph.Lic Jukka Manner

As the Internet is becoming increasingly popular people expect to get wireless broadband Internet connectivity from their mobile terminals, laptop computers, Personal Digital Assistants, and mobile phones. New services and applications are introduced constantly, for example, multimedia applications and electronic commerce. These new applications often require a service that is better than the default best-effort service offered by IP-based networks. The packet handling must be predictable and prompt, which implies a requirement for Quality of Service (QoS). Moreover, the movement of the mobile terminal must not disrupt the allocation of QoS.

Currently, there are a huge number of different QoS and mobility mechanisms for IP networks. Most of the QoS technologies are designed for fixed networks and work inefficiently in mobile environments. The various mobility management mechanisms have been designed to solely handle the mobility of nodes. They often cause serious problems to QoS signaling. Moreover, the security mechanisms currently available are not optimized for mobile environments, where mobile nodes may frequently change their point of attachment to the network.

This thesis starts by analyzing existing technologies for providing mobility management and QoS in IP-based mobile access networks. Based on the analysis a new IP-based network architecture and the associated protocols is presented. The design is based on existing protocols defined by the Internet Engineering Task Force, and on protocols studied within research projects. Mechanisms to support mobility and QoS have been enhanced and coupled to offer smoother and more efficient mobility. As one of the enhancements, the thesis presents a novel extension that enables the use of RSVP for network internal signaling when end-to-end QoS is not available. The design has been analyzed on a Linux-based access network with wireless LAN links. The conclusions highlight additional issues for future research.

Welcome!
December 8th 12.15, Teollisuuskatu 23, room A217
Guest lecture: The Current truth about multiway heaps
Dosentti Jyrki Katajainen

This talk is about the heaps we all love. I will explain how the heap functions are implemented in the CPH STL program library. The main contribution of the work done by collaborators and myself is an experimental evaluation of various heap variants proposed in the computing literature. We have also done micro-benchmarking which gives some directions for future research.

(Joint work with Claus Jensen and Fabio Vitale)

Welcome!
Marraskuun 27. päivä klo 15-17, Teollisuuskatu 23, luokka A217
Vierailuluento: Äänenkorkeuksien mittaaminen polyfonisesta musiikista
Tutkija, DI Anssi Klapuri

Luento käsittelee äänenkorkeuksien mittaamista moniäänisessä musiikissa. Tämä on yksi musiikin automaattisen nuotintamisen ydinongelmia. Tarkoituksena on esitellä eri lähestymistapoja ongelman ratkaisemiseksi ja kertoa Tampereen teknillisellä yliopistolla aiheesta tehdystä tutkimuksesta. Luento sisältää nuotinnosesimerkkejä.

Tervetuloa!
November 15th, Unioninkatu 34, University Main Building, Auditorium XIV
Public defence of a doctoral thesis: "A process algebraic reduction strategy for automata theoretic verification of untimed and timed concurrent systems"
Ph.Lic Matti Luukkainen

Automatic verification based on state space exploration has proven to be a useful method in analysis of concurrent systems. The limit to the practical usability of the method is set by the extensive size of the state space of realistic systems.

In this thesis we study the application of reductions based on process algebraic equivalences as a method to alleviate the "state explosion" phenomenon in automata theoretic verification. Both the system to be verified and the correctness criteria are modeled as Büchi-automata, the theory of which needs some extension for this purpose. We extend this approach to verification of real time systems described with timed automata which in addition to control states and integer variables also contain variables to record the progress of time. Usage of real valued time variables implies that the state space of a timed automaton is inherently infinite.

We show how the process algebraic compositional verification methods can be employed in the analysis of systems composed of timed automata by discretizing state spaces. This enables us to verify most of the properties when timed automata conform to a syntactic criterion, called weak restriction, meaning that its clock variables are compared with rational time constants by using only the comparison operators "smaller or equal than", "equal" ar "greater or than". Also some automata not falling into this category can be verified with the method. The usability of the method is evaluated with three case studies.

Welcome!
November 13th 4.15pm, Teollisuuskatu 23, room A414
Guest lecture: Peer advertising, discovery and inquiry
Dr. Titos Saridakis, Peer-to-Peer program manager at the NOKIA Research Center

Searching for the right peer to contact in order to perform a designated communication is, in principle, similar to searching for content in databases and searching for services on the internet. This talk is composed of three parts:

  • a review of the constituents of the search process, i.e. advertising, discovering and inquiring peers,
  • an overview of how different P2P protocols have approached this issue, and
  • a note on the P2P network parameters (scale, impromptu presence, security concerns) that influence the efficiency of different peer-search schemes

Welcome!
November 6th 3pm, Teollisuuskatu 23, room A217
Guest lecture: Research of Musical Instrument Models at Helsinki University of Technology
Professor Vesa Välimäki, Helsinki University of Technology

The lecture will focus on sound synthesis of musical instruments using modeling techniques. The Laboratory of Acoustics and Audio Signal Processing at Helsinki University of Technology has investigated this topic for over 10 years. The presentation will outline the basic ideas of sound synthesis methods for musical applications. The synthesis of a few instruments, such as the acoustic guitar, the piano, and the clavichord, will be covered quickly. The lectures will contain audio examples of synthetic musical instrument sounds and computer music.

Welcome!
November 3rd 2.15 pm, Teollisuuskatu 23, room A216
Guest lecture: Answering Queries using Views
Professor Foto Afrati, National Technical University of Athens

In various applications we need to access information that is stored in different schemas and use then it to answer queries. For this problem, we either store the data that we collect in a data warehouse or use mediators as an interface between the user and the various sources. In both cases, we may think of the various sources as views (either materialized or not). In the talk, we shall address the problem of how to answer queries using such views.

Welcome!
October 16th 12.00, Teollisuuskatu 23, room A414
Guest lecture: Sream and graph data mining
Professor Christos Faloutsos from Carnegie Mellon University

What patterns can we find in a bursty web traffic? On the web or internet graph itself? How about the distributions of galaxies in the sky, or the distribution of a company's customers in geographical space? How long should we expect a nearest-neighbor search to take, when there are 100 attributes per patient or customer record? The traditional assumptions (uniformity, independence, Poisson arrivals, Gaussian distributions), often fail miserably. Should we give up trying to find patterns in such settings?

Self-similarity, fractals and power laws are extremely successful in describing real datasets (coast-lines, rivers basins, stock-prices, brain-surfaces, communication-line noise, to name a few). We show some old and new successes, involving modeling of graph topologies (internet, web and social networks); modeling galaxy and video data; dimensionality reduction; and more.

Welcome!
October 16th 10.15, Teollisuuskatu 23, room A216
Guest lecture: The Post-PC Era: It's About the Services
Professor Randy Katz from University of California Berkeley

The Post-PC Era is often viewed as driven by the proliferation of new kinds of information appliances. We take a different viewpoint: the Post-PC Era will be shaped by the ability to manage computation and storage deep inside the network, connected by application-specific overlay networks, all on behalf of end user applications. This is what we call "services." Examples include web caches, content delivery redistribution, and transformational proxies. The result is a dramatic shift from traditional network research, on topics such as Quality of Service routing, to new distributed computing opportunities, such as network performance-aware service placement.

Welcome!
September 10th 10.15, Teollisuuskatu 23, room A217
Guest lecture: Enumerating Maximal Frequent Sets using Irredundant Dualization
Professor Ken Satoh from National Institute of Informatics, Japan

In this paper, we give a new algorithm for enumerating all maximal frequent sets using dualization. Frequent sets in transaction data has been used for computing association rules. Maximal frequent sets are important in representing frequent sets in a compact form, thus many researchers have proposed algorithms for enumerating maximal frequent sets. Among these algorithms, some researchers proposed algorithms for enumerating both maximal frequent sets and minimal infrequent sets in a primal-dual way by using a computation of the minimal transversal for a hypergraph, or in other words, hypergraph dualization. We give an improvement for this kind of algorithms in terms of the number of queries of frequency and the space complexity. Our algorithm checks each minimal infrequent set just once, while the existing algorithms check more than once, possibly so many times. Our algorithm does not store the minimal infrequent sets in memory, while the existing algorithms have to store them. The main idea of the improvement is that minimal infrequent sets computed from maximal frequent sets by dualization is still minimal infrequent even if we add a set to the current maximal frequent sets. We analyze the query complexity and the space complexity of our algorithm theoretically, and show that the query complexity is |Bd-|+|Bd+|*|P| and experimentally evaluate the algorithm to show that the computation time on average is in the order of |Bd-|*|Bd+| where |Bd+| is the number of maximal frequent sets, |Bd-| the number of minimal infrequent sets, and |P| is the number of items.

Welcome!
September 4th at 14.15, Teollisuuskatu 23, room A216
Guest lecture: Desinging Learning Objects: A Semiotic Model
Dr. Pithamber R. Polsani, Faculty Fellow at the Learning Technologies Center, University of Arizona at Tucson

In last eight years the terms learning object and reusable learning object have become the Holy Grail of content creation and aggregation in the computer-mediated learning field. Both the terms are frequently employed in uncritical ways, thereby reducing them to mere slogans. The serious lack of conceptual clarity and reflection is evident in the multitude of definitions and uses of learning objects. The objective of this talk is to fill this critical gap by offering a conceptual definition that takes learning and reusability as the foundational principles. Furthermore, a model for creating and using meaningful learning objects will be developed. In this regard American philosopher Charles Sanders Peirce's semiotics will provide the theoretical basis.

Welcome!
August 21st at 14.15, Teollisuuskatu 23, room A216
Guest lecture: Alignment of Tandemly Repeated Sequences and Applications
Dr. Eric Rivals (Montpellier)

In the class of repeated sequences that occur in DNA, minisatellites have been found polymorphic and became useful tools in genetic mapping and forensic studies. They consist of a heterogeneous tandem array of a short repeat unit. The slightly different units along the array are called variants. Minisatellites evolve mainly through tandem duplications and tandem deletions of variants. Jeffreys et al. devised a method to obtain the sequence of variants along the array in a digital code, and called such sequences maps. Minisatellite maps give access to the detail of mutation processes at work on such loci. In this paper, we design an algorithm to compare two maps under an evolutionary model that includes deletion, insertion, mutation, tandem duplication and tandem deletion of a variant. Our method computes an optimal alignment in reasonable time; and the alignment score, i.e., the weighted sum of its elementary operations, is a distance metric between maps. The main difficulty is that the optimal sequence of operations depends on the order in which they are applied to the map. Taking the maps of the minisatellite MSY1 of 609 men, we computed all pairwise distances and reconstruct an evolutionary trees for groups of these individuals. MSY1 (DYF155S1) is a hypervariable locus on the Y chromosome. We recover known relationships between independently defined haplogroups, showing that one can decipher a micro-evolutionary signal using minisatellite maps comparison. Work done in collaboration with S. Bérard (LIRMM)

Welcome!
August 20th at 14.15, Teollisuuskatu 23, room A217
Guest lecture: A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices
Prof. Gad M. Landau (Haifa)

The classical algorithm for computing the similarity between two sequences uses a dynamic programming matrix, and compares two strings of size $n$ in $O(n2)$ time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both {\it local} and {\it global} similarity computations.

The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an $O(n2 / \log n)$ algorithm for an input of constant alphabet size. For most texts, the time complexity is actually $O(h n2 / \log n)$ where $h \le 1$ is the entropy of the text.

We also present an algorithm for comparing two {\it run-length} encoded strings of length $m$ and $n$, compressed into $m'$ and $n'$ runs respectively, in $O(m'n + n'm)$ complexity. This result extends to all distance or similarity scoring schemes which use an additive gap penalty.

Welcome!
August 19th 12.00, Unioninkatu 34, University Main Building, Auditorium XIII
Public defence of a doctoral thesis: Parameterized Approximate String Matching and Local-Similarity-Based Point-Pattern Matching
MSc Veli Mäkinen

This thesis studies matching problems for different one- and two-dimensional objects like numerical sequences representing music, one-dimensional point-patterns representing tree-ring series, and two-dimensional point-patterns representing spots in protein electrophoresis experiments.

The one-dimensional problems studied here are generalizations of the classical string edit distance. The goal has been to find as general and efficient algorithms as possible for different variations of the classical problem. While most of the variants have motivations in applications, the study is more general; as an example, a wide spectrum of gradually more difficult matching problems under the label "restricted gaps" is covered. The basic scheme behind the efficient solutions is the use of geometric data structures to speed up dynamic programming. Especially semi-dynamic range minimum queries are shown to be useful in this context.

One variation of the classical edit distance measure studied here is the transposition-invariant edit distance for numerical strings. A music piece can be modeled as a sequence of pitch levels, thus giving a numerical string. If the same music piece is recorded in different tones, a straightforward comparison between these two pitch sequences does not reveal the obvious similarity. A distance that measures the dissimilarity between two music pieces should therefore be transposition-invariant, i.e. it should allow an arbitrary shift in the tones without any cost. We show the perhaps surprising result that including transposition-invariance into edit distance only adds some poly-logarithmic multiplicative factors into the known upper bounds for the original problem.

Another variation related to music information retrieval is matching of run-length encoded strings. The problems in this class take both the pitch and the duration information into account when comparing two music pieces. We give efficient algorithms for matching problems in this class.

The concept of local-similarity-based matching is introduced in the context of point-pattern matching. The one-dimensional case has an application in matching tree-ring sequences in dendrochronology. A number of algorithms are developed for this problem. The two-dimensional case is more difficult. Consider two point sets A and B of equal cardinality in two-dimensional Euclidean space. The task is to find a one-to-one matching between these point sets such that local similarity is preserved in the matching, i.e., the matching of a point is consistent with the matchings of its neighboring points. A distance measure framework for measuring the consistency of a matching is introduced and the problem of finding the best matching under this framework is studied. It is shown that several natural instances of this framework lead to problems that are computationally infeasible; they are NP-hard to approximate within any constant factor. A relaxed model for the problem is also studied leading to a heuristic solution based on minimum weight matching. Experiments with point-patterns extracted from two-dimensional protein electrophoresis gels show that the heuristic works reasonably well.

Welcome!
June 12th, 2003 at 10.15-12.00 in room A216
Guest Lecture: Discovering Best Variable-Length-Don't-Care Patterns
Dr. Shunsuke Inenaga from Kyushu University, Japan

A variable length don't care symbol * is a wildcard that matches any string, and a pattern that contains *'s is called a variable-length- don't-care pattern (VLDC pattern). An example of a VLDC pattern could be ab*bb*a. A VLDC pattern p is said to match a string w if w is obtained by replacing the *'s in p with some strings. For instance, VLDC pattern ab*bb*a matches string ababbaba with the first and second *'s replaced by a and ab, respectively.

The problem of finding a rule that separates two given datasets, often regarded as positive examples and negative examples, is a central task in knowledge discovery. In the context of string, rules are patterns. Hereby we consider the problem of discovering the VLDC pattern that is the most common to one of two given sets of strings, and the least common to the other. Although this problem is known to be NP-hard, our algorithm finds such best VLDC patterns exactly, powerfully sped up by pruning heuristics. Another feature of our algorithm is fast pattern matching techniques, for which we have two options: pattern matching machines (PPMs) and an index structure called the wildcard directed acyclic word graphs (WDAWGs).

Then, we also consider a more generalized problem of finding the best pair , where p is a VLDC pattern and k is the window size that bounds the length of an occurrence of p in a string w. We present three algorithms solving this problem with pruning heuristics, using the dynamic programming (DP), PMMs and WDAWGs, respectively. Our experimentation has shown that all of our algorithms run in reasonable amount of time, and that VLDC patterns are suitable for finding a good rule to separate two given sets of strings.

More information

June 12-13th, 2003
Course at HeCSE Summer School: Basic concepts of object-oriented and component-based software engineering
Professor Joost Kok (Leiden University)

More information

June 2nd 12.00, Unioninkatu 34, University Main Building, Auditorium XII
Public defence of a doctoral thesis: Considering Individual Differences in Computer-Supported Special and Elementary Education
Ph.Lic Jaakko Kurhila

Special education, particularly the education of disabled children, suffers from the lack of computer-science-oriented research and moderate computer expertise in educational software used in classrooms. Special education provides a particularly challenging research area to computers and education, since every learner is unique. Successful learning systems for special education can benefit from three distinctive properties: adaptation to the individual learning process, domain independence in the learning content with ease-of-authoring, and support for special needs. The thesis presents a model, called a learning space model, as a basis for a learning system that tries to address these three issues. The model is based on structuring the learning material in a $n$-dimensional vector space. The author of the material can specify the dimensions used. The primary target group for the learning space model is children with deficiencies in mental programming. When simplified, mental programming means the ability to compose a problem solving strategy, fluency in solving various tasks, and the ability to uphold attention and motivation. Although deficiencies in mental programming are most severe with brain damage or occur often with developmental disabilities, it is clear that these deficiencies are present to some extent in every one of us. Therefore, the learning space model is taken out into two classrooms and tested empirically. The first test is in a special education setting, but the transfer to non-disabled education is tested in elementary education. The findings from these two case studies imply that the model operates as expected if the learning material is authored carefully. Lastly, the properties of the model are inspected formally to understand the limitations, challenges and potential of the model better.

Welcome!
May 24th 10.00, Unioninkatu 40, Metsätalo, hall 2
Public defence of a doctoral thesis: Design and Analysis of a Distributed Database Architecture for IN/GSM Data
Ph.Lic Juha Taina

Databases both in the Intelligent Networks (IN) and in the Global System for Mobile communications (GSM) have gained growing interest in the last few years. When the data volume increases, more efficient database solutions are needed to support temporal and logical consistency. In this thesis we give necessary background information to the IN and GSM architectures, do a detailed analysis of IN and GSM database data and architecture requirements, create a reference database management system architecture for IN/GSM data, define a toolbox of queueing model formulas for analyzing transaction-based systems, and do a detailed analysis of the reference database management system using our toolbox.

The data requirements for an IN/GSM database are based on the ITU-T recommendations in the Q.12xx series. The GSM analysis is based on the GSM standard. Together we use this information to define our multi-level distributed database management system for current and future Intelligent Networks.

Our database architecture is based on distribution and parallelism. It defines entry points for both X.500 based requests and for internal database language requests. It is scalable, and allows specialized node-level architectures for special telecommunications services.

Using the architecture as a test system, we give a detailed queueing model analysis of it. The result of the analysis is twofold. First, the most important result is that even a complex transaction-based system may be analyzed accurately enough to get relevant and reasonable results with our toolbox. Second, in IN and GSM environments, parallelism within a database is a more important aspect than distribution. Distribution has its use in a telecommunications database, for instance to minimize physical distance from a client to the database. The special nature of IN and GSM data allows parallel database architectures that are extremely scalable.

Welcome!
Thursday May 8th at 16.15, Teollisuuskatu 23, room A320
Some combinatorial algorithms for molecular biology
Dr. Marie-France Sagot (INRIA Rhone-Alpes, Lyon)

Abstract:
This talk will present an overview of the work that has been done by the (sometimes official, sometimes unofficial) group of persons that has been working with the speaker over the years and across continents. This covers various algorithms, most often combinatorial, for addressing some of the currently important problems in molecular biology that are amenable to a treatment by mathematics and algorithmics.

The subjects that have been and/or are currently being treated concern the inference of motifs potentially involved in the regulation of gene expression, repeats and gene detection, RNA structure comparison and inference, rearrangement detection and localisation, computation of rearrangement distance between evolutionary trees and genomes. Since this talk is happening in the context of a starting collaboration with the University of Helsinki computer science group, various open problems or problems that have just started been looked at will be discussed. The talk will be very informal.

References may be found on the speaker's web page

More information: Academy Professor Esko Ukkonen

Welcome!
2. toukokuuta 12-14, Teollisuuskatu 23, sali A217
Tietoturvahallinto
Aaro Hallikainen, Keskusrikospoliisin tietoturvapäällikkö

Tervetuloa!

Lisätietoja: Timo Karvi

April 12th 10.00, Unioninkatu 34, auditorium XII
Public defence of a doctoral thesis: Middleware Infrastructure for Distributed Mobile Applications
MSc Stefano Campadello

One of the most exciting new fields in computer science at the beginning of this millennium is represented by Nomadic Computing. This new technology empowers the user to access typically fixed network services from any place. Mobile users access information services regardless of their physical location or movement behavior and independently of temporal factors. A boost to the Nomadic Computing comes from the improvement of wireless data technology, which began with GSM data communication and now is leading to the deployment of UMTS networks.

Thus, wireless technology and networked applications are starting to find a common path to give answers to the needs of Nomadic users. Unfortunately this merge has been mostly a collision rather than a smooth marriage. Applications were downgraded to let them fit in small devices with poor connectivity, and the results have often been disappointing.

This dissertation focuses on these topics. Firstly, a background overview introduces the challenges that mobile distributed applications face and presents an overview of the main protocols and tools existing in distributed computing. The improvement proposed in literature to address the described challenges are also discussed. Then Java RMI is taken as an example and its problems in a wireless context are analyzed. We propose several improvements to it, we show a prototype implementation giving protocol and messaging details and we give a complete performance evaluation. Secondly, we propose a new approach to Nomadic Computing. A new paradigm shift is suggested, where it is no longer the user who has to adapt to the different scenarios that he may find during his ``nomadism'', but it is the service that modifies itself to adapt to the current situation. This new way to design services needs a totally new infrastructure, and brings many new research challenges. We describe them, and following the results of the first part of the dissertation, we suggest an architecture to address them.

Welcome!
Tuesday March 25th 14.15, Teollisuuskatu 23, room A318
Advance reservations in computer networks
Researcher Lars Burchard, Technical University of Berlin

Abstract:
Currently, computer networks only support reservations made in a just-in-time manner. This means, reservations can only be made directly before the corresponding transmission is about to start. However, in many practical applications the parameters of the transmission such as start, stop time, and bandwidth requirement are known a long time in advance. Examples are grid computing environments, where a number of computer systems are connected and can be used together as a virtual single computer. The current grid toolkits support the allocation of computing resources, i.e. processors, in advance. However, the also required support for communication quality-of-service, e.g. to transfer the usually large amounts of data between the different machines involved, cannot be given in the same way. In this talk, a framework based on current network technology such as MPLS and DiffServ will be presented, that fills this gap, i.e. allows the allocation of network bandwidth in advance. Furthermore, a number of problems and opportunities arising in such environments are pointed out together with approaches to solve them and use the properties of the environment to improve the networks performance.

Welcome!
March 21st 12.15, Teollisuuskatu 23, room A319
CCC: A Cluster Management System
Researcher Lars Burchard, Technical University of Berlin

Abstract:
The different requirements and expectations of users on a cluster computer require to provide more than a simple ssh/rsh access to the single nodes, in particular when large numbers of users compete for scarce resources. User and time management functionality is inevitable to ensure an efficiently utilized system and provide the required support for users and applications. In this talk, the management system CCS (Computing Center Software) will be presented, which provides the basic functionality for managing users and jobs (processes) on a linux cluster. Beyond the basic functionality of providing access to the different nodes of the cluster, the CCS allows both interactive and batch access, provides monitoring and accounting functionality and deals with node failures. The architecture of CCS as well as the functionality of the system for both users and operators of a cluster will be described.

Welcome!
6. maaliskuuta 16-19, Teollisuuskatu 23, room A217
Esitelmiä suomalaisen ohjelmistoliiketoiminnan kansainvälistymisestä
Helsingin yliopiston tietojenkäsittelytieteen laitoksen ohjelmistoliiketoimintakurssin kutsuesitelmäsarjan päätöstilaisuuden aiheena on kansainvälistyminen. Tilaisuudessa kuullaan suomalaisten ohjelmistoyritysten menestyksistä ja karvaista kokemuksista USA:ssa, Euroopassa ja Aasiassa. Esiityjinä on suomalaisen ohjelmistoliiketoiminnan pioneereja ja huippunimiä.

klo 16-17 Toimitusjohtaja, PhD Matti Hämäläinen Case: Codetoys Oy

klo 17-18 Hallituksen jäsen Seppo Ruotsalainen Case: Tekla Oyj

klo 18-19 Toimitusjohtaja Mikael Roos Case: Scandinavian Softline Oy ja Malibu Telecom Oy

Tervetuloa!

Lisätietoja: Eero Hyvönen

26. helmikuuta 8.15-9.00, Teollisuuskatu 23, room A414
Verkostoitumisteknologioita ja niiden sovelluksia
FT Juha Puustjärvi, TKK

Tiivistelmä:
Organisaatiot ja niiden toimintatavat kehittyvät uusien teknologioiden myötä. Tälläinen kehitysprosessi on kaksisuuntainen: uusi teknologia mahdollistaa uusia toimintatapoja ja toisaalta uudet toimintatavat tarvitsevat uusia teknologioita. Hyvä esimerkki tällaisesta teknologioiden vuorovaikutuksesta on Internet. Sen nopea yleistyminen on mahdollistanut uusia toimintamalleja useille alueille, kuten esim. kaupankäyntiin, hallintoon, opiskeluun ja julkaisutoimintaan. On syntynyt toimialakohtaisia verkostoja, joihin usein viitataan käsitteillä e-business, e-governement, e-learning ja e-publishing. Tällaisten sovellusten kannalta web voidaan nähdä eräänlaisena verkkotietokoneena, jossa useisiin perinteisiin tietoteknisiin käsitteisiin, kuten esim. kyselyihin, hakemistoihin, mallintamiseen ja transaktioihin, on kehitetty uusia teknologioita. Esitelmässä tarkastellaan tällaisia uusia teknologioita ja niihin perustuvia sovelluksia.

Tervetuloa!

Lisätietoja: Seppo Sippu

February 18th 14.15-15.00, Teollisuuskatu 23, room A319
Linear Diophantine Equations and Array Data Dependencies: Analysis of Loops in Programs
Dmitri G. Korzoun, University of Petrozavodsk

Abstract:
Use of linear Diophantine equations has aroused increasing interest in computer science both from theoretical points of view (computational aspects) and from practical points of view. An interesting area for practical applications of Diophantine equations is program analysis.

It is well known that some main obstacles to program performance are buried in loops. To improve the performance of loops various program transformations can be attempted; these include loop parallelization, vectorization, optimization of memory access, and efficient cache utilization.

However, such transformations have to preserve all array-data dependencies which exist in the loops; if not then the semantics of the resulting program can differ from the intended. For example, the order in which the array elements are computed may be decisive for the final result: some array elements must be computed before the others.

Linear Diophantine equations can be used for implicit description and for exact discovering of array-data dependencies within loops. This knowledge then allows to perform the program transformation correctly. The key advantage of this method is that it allows an automatization of the dependence analysis and an implementation as a part of a compiler.

The talk contains an introduction to the area.We shall first present a short overview of linear Diophantine equations. Then we consider some examples of their use for the dependence analysis. We shall also discuss some complexity problems of this approach and potential ways to avoid them.

Welcome!
February 17th 12-14, Teollisuuskatu 23, huone A414
Are intelligent agents too complex to engineer for business deployment?
Dr. Stefan Poslad Department of Electronic Engineering, Queen Mary, University of London

Abstract:
An intelligent agent is software capable of flexible, autonomous action situated in some environment. Flexible action can be interpreted as exhibiting reactive, pro-active and social abilities. Social ability is often regarded as the core property that forms the intelligence of such agents. This type of social intelligent agent can share knowledge, negotiate on shared plans and tasks and cooperate or compete with other agents. They have been demonstrated as being applicable in a broad range of industrial domains including, agriculture, transport systems, telecommunications, banking and shopping. However they are not yet widely used by industry. Some of the main barriers to industrial deployment include: the complexity of the agent paradigm with respect to software engineering development, installation and maintenance of agent services and the integration of agent and non-agent services. In this presentation, the fundamentals of engineering agent services are reviewed and the engineering challenges for intelligent agents are discussed.

Biography:
Stefan Poslad is a lecturer in the Electronic Engineering Department at Queen Mary, University of London. He was co-ordinator for the EU CRUMPET e-tourism agent project and currently leads the technical development for the EU EDEN-IW agent project on integrating inland water-monitoring data. He is a core contributor to the Agentcities global agent test-bed project and to the iTrust thematic network on trust management. He is currently chair of the FIPA agent standards forum security work-group and is on the FIPA Board of directors. His research interests include security, trust and privacy for open distributed services; rich communication services involving the Semantic Web and software agents; Ubiquitous Computing and Human Computer Interaction.

Welcome!
February 15th 10.00, Unioninkatu 34, auditorium XIV
Public defence of a doctoral thesis: Supporting Nomadic Agent-based Applications in the FIPA Agent Architecture
Ph.Lic Heikki Helin

The progress in wireless network technologies and mobile devices changes the way in which people access services. A user may access the same services as she would using her desktop computer, but in the nomadic environment she is able to do so anywhere, at any time and even using a variety of different kinds of devices. Such an environment places new challenges on the architecture implementing the services. The dissertation deals with exploitation of software agent technology in nomadic computing.

Software agent technology has raised much enthusiasm in both research communities and commercial markets, and it is assumed to be an efficient design and implementation method for complex distributed systems. Combining nomadic computing and software agents makes it possible to create a solid basis for future nomadic applications. The main attention in the dissertation is in software agent communication in an environment where at least some part of the communication path is implemented using wireless network technology. In such environments, the characteristics of the communication path should be considered in each communication layer.

We present an agent-based architecture providing means for building adaptive applications for nomadic users. The architecture introduces an agent for monitoring and controlling a wireless link, Quality of Service (QoS) ontology, and support for an efficient way for agent communication over a wireless link. A high variety of QoS in data transmission over wireless networks creates challenges that today's Internet-based services addresses inadequately. Whereas today's applications may result in treating rapid and extreme changes in QoS as failures, in the nomadic environment these should be considered as a usual case. Therefore, the complexity of data transmission should be hidden from the nomadic user and applications, and managed by an intelligent middleware.

The essential results of the dissertation have been published in international forums and have affected substantially to international agent standards specified by Foundation for Intelligent Physical Agents (FIPA) standardisation organisation.

Welcome!
February 4th 15-17, Teollisuuskatu 23, huone A217
Finding frequent Substructures in 3D-Protein Databases
Alexander Hinneburg from the Martin-Luther-University Halle/Wittenberg

Due to the different genome projects and the improved analysis techniques the number of known protein sequences grows very fast. Despite some progress in protein X-ray analysis and other protein analysis techniques, the number of known protein 3D-structures is lower in the order of magnitudes as the number of known protein sequences. As X-ray protein analysis is very expensive, there is a strong desire to derive the 3D-structure directly from the protein sequence. The standard technique, protein homology modeling, is limited to proteins, for which exist proteins with known 3D-structures and similar sequences. For most proteins with known sequences and unknown 3D-structures homology modeling is not possible. An alternative way is to derive some knowledge about substructures. As the number of available high resolution X-ray protein structures has grown significantly over the last years, we want to perform a more comprehensive analysis of the conformational behavior of substructures, than which is done in known rotamer libraries describing small substructures of fixed lengths. In this paper we describe work in progress, which aims to find frequent substructures of different lengths and gaps between the amino acids. Additionally we derive association rules from the frequent substructures, which express the found knowledge in a similar way to logic rules.

Welcome!
January 25th, 2002, 10.00, Unioninkatu 34, auditorium XIII
Public defense of a doctoral thesis: Optimistic Concurrency Control Methods for Real-Time Database Systems
MSc Jan Lindström

Many real-world applications contain time-constrained access to data as well as access to data that has temporal validity. For example, consider a telephone switching system, network management, navigation systems, stock trading, and command and control systems. These applications require gathering data from the environment, processing information in the context of information obtained in the past, and contributing {\em timely} response. Hence, these applications need a real-time database system, i.e. database system where transactions are associated with deadlines on their completion times.

Concurrency control is one of the main issues in the studies of real-time database systems. Many real-time concurrency control methods considered in the literature are based on pessimistic two-phase locking (2PL), where transaction acquires a lock before database operation and waits for the lock if it cannot be granted. However, 2PL has some inherent problems such as the possibility of deadlocks and unpredictable blocking time. These problems appear to be serious in real-time systems since real-time transactions need to meet their timing constraints, in addition to consistency requirements.

Optimistic concurrency control methods have the attractive properties of being non-blocking and deadlock-free. These properties make them especially attractive for real-time database systems. Because conflict resolution between the transactions is delayed until a transaction is near to its completion, there will be more information available on making the conflict resolution. Optimistic methods have the problem of unnecessary restarts and heavy restart overhead because some near-to-complete transactions have to be restarted. Therefore, the major concern in designing optimistic concurrency control methods is to design methods that minimize the number of transactions to be restarted.

This thesis shows that some of the well-known previous methods include unnecessary restart problems. A method to reduce these unnecessary restarts is proposed. This method is based on selecting a commit timestamp as near to the validation time as possible and a new method to resolve conflicts by adjusting the serialization order dynamically amongst the conflicting transactions after the validation is successful. This method maintains serializability or, more precisely, strict serializability. We show that many unnecessary restarts can be avoided efficiently and avoiding unnecessary restarts is an efficient approach for improving the performance and predictability of concurrency control methods for main-memory database systems beyond the current state-of-the-art.

Additionally, methods to incorporate information about the timing constraints of transactions in the conflict resolution is proposed. We show that priority cognizance is not a viable approach for improving the performance and predictability of real-time concurrency control methods for main-memory real-time database systems. The results show that the proposed methods offer better chances for critical transactions to complete before their deadlines.

Finally, the work identifies a need for adaptive and integrated concurrency control methods in real-time database systems. Therefore, a new optimistic concurrency control method is presented where conflict resolution is based on adaptation to the current workload. This method is shown to produce correct results and was experimentally tested. The performance of the proposed method is shown to be superior to previous approaches.

The feasibility of the proposed methods have been experimentally tested using a prototype of a main-memory real-time database system for telecommunications with a telecommunication service workload. The results show that optimistic methods can be used in this kind of environment.

Additional Information

Welcome!
January 21st 14.15, Teollisuuskatu 23, huone A217
Research Directions in Music Information Retrieval Theory and System Development
Dr. Pierre-Yves ROLLAND, Associate Professor in Computer Science Universite d'Aix-Marseille III - FEA & Universite de Paris 6 (Pierre & Marie Curie)

The research area considered here is the exponentially-growing one that pertains to music information retrieval (MIR). Paradigms, techniques and systems are being developed for allowing users to retrieve music information from collections of music content, using some query-based or navigation-based paradigm. With the ongoing expansion of information highways and the associated increase in music contents available online, there are diverse contexts where MIR is, or could be, used. These range from searches for pieces in music libraries to consumer-oriented e-commerce of music. Since computational approaches began to be investigated some 15 years ago, several different directions have emerged in MIR research worldwide. This talk will concentrate on research directions that are seen as key ones. Both current and future (proposed) research themes will be covered. As will appear, MIR research is strongly interdisciplinary, encompassing such diverse scientific fields as: - computational/artificial intelligence - human-computer interaction - acoustic/digital signal processing - psychoacoustics and music psychology - musicology

Welcome!
January 20th, 2003, 14.15, Teollisuuskatu 23, huone A320
New results on computing in games
Tuomas Sandholm, Associate Professor, Carnegie Mellon University

While traditionally not the key focus in economics, computational complexity plays a key role in games. This is an exciting research area in game theory - in the intersection of computer science and economics. Beginning in 1990, I have been studying the complexity of 1) executing a mechanism (e.g., clearing an auction/exchange where combinatorial or other expressive bidding is allowed), 2) executing a strategy (and how computationally bounded agents should play), 3) determining an optimal strategy (i.e., finding the equilibrium of the game), 4) finding an appropriate payoff division (e.g., according to the core), 5) designing a mechanism, 6) manipulating a mechanism (by determining a beneficial insincere revelation), and 7) determining whose preferences should be elicited. In this talk I will overview some of our new results on topics 3-7 (papers on topics 1-7 are available at www.cs.cmu.edu/~sandholm). Specifically, I will cover parts of the following papers (joint work with Vincent Conitzer):

Complexity Results about Nash Equilibria. Technical report CMU-CS-02-135, 2002.

Complexity of Determining Nonemptiness of the Core. AAMAS-02 workshop on Distributed Constraint Reasoning. Also: technical report CMU-CS-02-137, 2002.

Complexity of Mechanism Design. 18th Conference on Uncertainty in Artificial Intelligence (UAI), August 1-4, Edmonton, Canada, 2002.

Complexity of Manipulating Elections with Few Candidates. National Conference on Artificial Intelligence (AAAI), July 28-August 1, Edmonton, Canada, 2002.

How Many Candidates are Needed to Make Elections Hard to Manipulate? Draft. (with Vincent Conitzer and Jerome Lang), 2002.

Vote Elicitation: Complexity and Strategy-Proofness. National Conference on Artificial Intelligence (AAAI), July 28-August 1, Edmonton, Canada, 2002.

Tuomas Sandholm is Associate Professor in the Computer Science Department at Carnegie Mellon University. He received the Ph.D. and M.S. degrees in computer science from the University of Massachusetts at Amherst in 1996 and 1994. He earned a Dipl. Eng. with distinction in Industrial Engineering and Management Science from the Helsinki University of Technology, Finland, in 1991. His research interests include artificial intelligence, electronic commerce, game theory, multiagent systems, auctions, exchanges, automated negotiation and contracting, coalition formation, normative models of bounded rationality, intelligent real-time systems, resource-bounded reasoning, constraint satisfaction, machine learning, networks, and combinatorial optimization. He has twelve years of experience designing and building electronic marketplaces. Several of his systems have been commercially fielded. He is also Founder, Chairman of the Board, and Chief Technology Officer of CombineNet, Inc. He has published over 130 technical papers, and received numerous academic awards including the NSF Career award in 1997 and the inaugural ACM SIGART Autonomous Agents Research Award in 2001.

Tervetuloa!
14. tammikuuta, 2003, 14.15, Teollisuuskatu 23, huone A217
Survo-ohjelmiston rakenteesta ja käyttöliittymästä
Prof.(emer.) Seppo Mustonen

Survo-ohjelmisto on lajissaan poikkeuksellinen yhden henkilön ideoima ja toteuttama ohjelmisto, jonka kehityshistoria kattaa ajallisesti merkittävän osan tietojenkäsittelyn kehitystä "nopeista laskukoneista" tutkijoiden työvälineeksi. Ohjelmisto heijastelee myös Seppo Mustosen itsenäisiä ja ohjelmistokehityksen valtavirroista poikkeaviakin käsityksiä ja arvostuksia.

Vierailuluento ei ole ainoastaan luento vaan myös jopa poleemiseksi tarkoitettu kannanotto, johon kuuntelijoilla on mahdollisuus ottaa kantaa ja esittää omia käsityksiään. Esitys on tarkoitettu kaikille ohjelmoinnista ja ohjelmistojen kehityksestä kiinnostuneille.

Tervetuloa!

Survon kotisivut

December 18th, 2002, 14.15, Teollisuuskatu 23, huone A217
DNA ASCII CODE: writing information in DNA
Dr. Masanori Arita (Computational Biology Research Center, Japan)

The time when we write information into DNA will soon come. This paper surveys the methods to design code words using DNA, ans proposes a simple code that can avoid unwanted hybridization in the presence of shift and concatenation of DNA words and their complements.

Welcome!
December 12th, 2002, 14.15, Teollisuuskatu 23, room A217
Text Mining for the Semantic Annotation of Unstructured Text Archives
Dr. Myra Spiliopoulou from Humboldt University, Germany

Information acquisition from unstructured texts is performed through textual search, while semistructured texts allow for more elaborated meta-data querying, based on semantic annotations. Thus it is of interest to semantically annotate unstructured texts and extract application-specific entities from them, so that the extracted semantics can be directly queried.

In the project DIAsDEM (funded by the German Research Foundation), we use data mining methods to construct appropriate semantic tags for the flat texts of a document archive and then annotate them in XML. We use the concepts of a semi-automatically derived mini-thesaurus as vector space, upon which we map all individual sentences of the archive. Sentences/Vectors are iteratively clustered on similarity, selecting clusters of high cardinality and homogeneity as the basis of semantic annotation. The representative concepts of the selected clusters constitute the semantic tag, with which the sentences in the cluster are annotated. These semantic tags form the basis for a (currently) flat DTD, which should be expanded into a quasi-schema through the identification of functional dependencies among the derived tags.

Preliminary experimental results show that the precision of the method is high, while the recall is affected by linguistical peculiarities of the texts. Attempts to overcome these particularities with more robust text partitioning methods are briefly discussed.

Welcome!
December 9th, 2002, 12.15, Teollisuuskatu 23, room A217
Teaching Program Design Using Jackson Workbench
Nicholas Ourusoff, Lecturer in Computer & Information Science, University of Maine at Augusta & Petrozavodsk State University

Teaching how to program independently of teaching a programming language has been recognized as a worthwhile goal in computer science pedagogy, but many have abandoned the goal as being impossible to achieve in practice. Jackson Structured Programming (JSP) is a well documented and proven program design method that is language independent. Jackson Workbench is a CASE tool that contains a "smart" Structure Editor that implements the JSP method and generates executable code in a number of programming languages, most notably, Java. In this lecture, I will demonstrate the use of JSP and the Jackson Workbench in teaching program design and argue that it is an effective and needed pedagogy to teach students from the start that programming is a software engineering discipline, and that design must be based on a model of problem structure.

Welcome!
November 29th, 2002, 12.15, Teollisuuskatu 23, auditorium
Public defense of a doctoral thesis: Pattern Discovery from Biosequences
MSc Jaak Vilo

In this thesis we have developed novel methods for analyzing biological data, the primary sequences of the DNA and proteins, the microarray based gene expression data, and other functional genomics data. The main contribution is the development of the pattern discovery algorithm SPEXS, accompanied by several practical applications for analyzing real biological problems. For performing these biological studies that integrate different types of biological data we have developed a comprehensive web-based biological data analysis environment Expression Profiler.

Biosequences, i.e., the primary sequences of DNA, RNA, and protein molecules, represent the most basic type of biological information. Features of these sequences that are reused by nature help us to understand better the basic mechanisms of gene structure, function, and regulation. The SPEXS algorithm has been developed for the discovery of the biologically relevant features that can be represented in the form of sequence patterns. SPEXS is a fast exhaustive search algorithm for the class of generalized regular patterns. This class is essentially the same as used in the PROSITE pattern database, i.e. it allows patterns to consist of fixed character positions, group character positions (ambiguities), and wildcards of variable lengths. The biological relevance of the patterns can be estimated according to several different mathematical criteria, which have to be chosen according to the application.

We have used SPEXS for the analysis of real biological problems, where we have been able to find biologically meaningful patterns in a variety of different applications. For example, we have studied gene regulation mechanisms by a systematic prediction of transcription factor binding sites or other signals in the DNA. In order to find genes that potentially share common regulatory mechanisms, we have used microarray based gene expression data for extracting sets of coexpressed genes.

We have also demonstrated that it is possible to predict the type of interaction between the G-protein coupled receptors (GPCR) and its respective G-protein, the mechanism widely used by cells for signaling pathways. That prediction, although the GPCR's have been studied for decades, primarily for their immense value for the pharmaceutical industry, had been thought to be unlikely from the primary sequence of GPCR alone.

The tools developed for various practical analysis tasks have been integrated into a web-based data mining environment Expression Profiler hosted at the European Bioinformatics Institute EBI. With the tools in Expression Profiler it is possible to analyze a range of different types of data like sequences, numerical gene expression data, functional annotations, or protein-protein interaction data, as well as to combine these analyses.

Additional Information

Welcome!
November 14th, 2002, 14.15, Teollisuuskatu 23, room A217
State Machines: Theory, Implementation and Applications
Sr. Researcher Yuri Gurevich from Microsoft Research (Redmond, WA)

The ASM computation model comes with the following thesis. For every computer system A and every level of abstraction, there is an abstract state machine B that behaves exactly like A; in particular B simulates A step-for-step. The ASM thesis may seem way too ambitious but there is ample experimental evidence to support it; see the academic ASM website http://www.eecs.umich.edu/gasm and the website http://research.microsoft.com/foundations of the group on Foundations of Software Engineering at Microsoft Research (the FSE website). Further, good parts of the thesis have been proven; see articles 141 and 157 at speaker's website http://research.microsoft.com/~gurevich. To take practical advantage of the thesis, one needs good tools. To this end, a powerful modern specification language AsmL has been and is being developed at Microsoft Research; you can download AsmL from the FSE website and play with it. AsmL is accompanied with useful testing tools. This tool suite has a growning clientele at Microsoft.

Welcome!
November 7th, 2002, 14.15, Teollisuuskatu 23, room A217
Designing Extensible Modelling Software
Dr. Andrew Thomas from Imperial College, London

The combination of modular and object orientated programming - component software - enables extensible systems to be easily built. WinBUGS uses the factory object pattern and run time linking to leave the set of statistical distributions and logical functions it "knows" about open. New statistical model components can be added to WinBUGS at any time by just implementing a new module and editing the "grammar" file.

Welcome!
November 6th, 2002, 14.15, Teollisuuskatu 23, room A217
Designing a marker set for whole genome scans
Professor Magnus Halldorssons (University of Iceland)

As part of the Icelandic Cancer Project, we have undertaken a whole genome scan for cancer predisposition genes, taking advantage of the increased linkage disequilibrium in the Icelandic population. All living cancer patients, their relatives, and controls are invited to participate, and so far over 94% have accepted the invitation. The study has been facilitated by the good genealogical records, standardized healthcare and reliable medical information available in Iceland.

Here I shall outline the goals and achievements of the project from the view of a computer scientist, focussing in particular on the bioinformatic challenge of reducing the cost of genotyping by optimizing the design of the marker set used. I shall also discuss some of the practical issues involved, including data protection, laboratory information management, and the acquisition of clinical data.

Welcome!

October 7, 2002
Guest lecture "Controlled Query Evaluation for Known Policies by Combining Lying and Refusal" at 1.00-2.30 pm at HTC Pinta Building - HIIT facilities
Prof. Dr. Joachim Biskup from University of Dortmund

Welcome!

Guest lecture "UML 2.0 Should Facilitate a Family of Languages" at 10.15 am in room A320
Keith Buddy, DSTC, Brisbane, Australia

Welcome!

October 3, 2002
Guest lecture "The Model Driven Architecture Guide" at 4.00 pm in room A414
Joaquin Miller, Financial Systems Architects, USA

Welcome!

October 1, 2002
Guest lecture "End-to-End QoS Management in Distributed Systems" at 12.00-13.00 in room A414.
Andre Barros de Sales, a PhD candicate from Paul Sabatier University in Toulouse, working with the IRIT research laboratory

Welcome!

August 29th, 2002
Guest lecture on maintaining stream statistics
Aristides Gionis from Stanford University is visiting HIIT/Basic Research Unit and the Department of Computer Science until September 30. He will give a talk on Thursday August 29 at 14-16 in lecture room A414.

The lecture considers the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. The authors refer to this model as the `sliding window' model.

Welcome!

August 19-23, 2002
ECML/PKDD-2002
13th European Conference on Machine Learning (ECML'02)
6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD'02)

The 13th European Conference on Machine Learning (ECML) and the 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) will be co-located in Helsinki, Finland, in August 2002. Co-ordination of the two conferences provides ample opportunities for cross-fertilization between the two areas, and follows the success of the jointly organized ECML/PKDD-2001 in Freiburg.

More information
August 19, 2002
Prof. Hiroki Arimura (Kyushu University) klo 10:15 salissa A216 vierailuluennon
Text Mining and Pattern Discovery

Abstract: The talk is a survey of the recent activities of the text mining group at the Kyushu University.

Tervetuloa!
August 16, 2002
Vesa Ollikaisen tietojenkäsittelytieteen väitös
Matemaattis-luonnontieteellisessä tiedekunnassa esitetään perjantaina 16.8.2002 klo 12.00 päärakennuksen (Fabianinkatu 33) auditoriossa XII tarkastettavaksi FM Vesa Ollikaisen väitöskirja "Simulation techniques for disease gene localization in isolated populations". Tutkimus kuuluu tietojenkäsittelytieteen alaan. Vastaväittäjänä on dosentti Samuel Kaski (Teknillinen korkeakoulu) ja kustoksena professori Esko Ukkonen. Teos julkaistaan sarjassa "Department of Computer Science, Series of Publications A".

Vesa Ollikaisen kotisivut
August 1, 2002
Guest lecture, Dr. Bernd Fischer: "Towards Automatic Synthesis of Statistical Data Analysis Programs" 2 pm at the department auditorium
Automatic program synthesis is a formal approach to software development, in which efficiently executable programs are automatically derived from high-level specifications. It has successfully been applied to a number of domains, for example, celestial mechanics, transportation scheduling, or option pricing. In this talk I will discuss its application to machine learning, or more precisely, to statistical data analysis, and I will present the AutoBayes system currently under development at NASA Ames.

AutoBayes takes a specification in form of a statistical model, extracts a graphical model (i.e., Bayesian network) from it, and then derives code by a process called schema-based synthesis. Schemas represent generic algorithms together with their applicability conditions. Some schemas are directly derived from decomposition theorems for graphical models, while others implement generic machine-learning algorithms like EM. The schemas are applied recusively until irreducible subproblems occur which are then solved by the application of symbolic or numeric solvers.

AutoBayes has been applied to a number of textbook and application problems, including clustering (using EM), changepoint detection, and software reliability estimation. In the talk, I will discuss some examples and their derivations and demonstrate the system ''live''. I will also present recent work to assure the safety of the generated code.

AutoBayes is joint work with W. Buntine (HIIT), A. Gray (CMU), J. Schumann (RIACS/NASA Ames), M. Whalen (U Minnesota) and J. Whittle (QSS/NASA Ames).
July 1-5, 2002
Kesäkoulu/Summer School
Laitoksella järjestetään 1-5. heinäkuuta kesäkoulu aiheesta 'Wireless Internet'. Lisätietoja kesäkoulun www-sivulta.
June 26-27, 2002
Vierailuesitelmiä/Guest Lectures
Dr. Masanori Arita (Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japani) pitää laitoksellamme kaksi vierailuesitelmää.
Torstai 27.6.2002, klo 13.15 - 15.00, sali A217:

Seminar I: "Atomic Representation of Metabolism: underlying algorithms"

In this talk, the data representation of molecules and enzymatic reactions will be introduced. Specifically, the following algorithms are shown.
1. Finding cycles within a molecule and its relationship to the unique naming (or normalization) of molecular structures. 2. Subgraph matching procedure for finding an atomic mapping in enzymatic reactions. 3. Graph drawing algorithm for molecules.
Perjantai 28.6.2002, klo 9.15- 11.00, sali A217:

Seminar II: "Atomic Representation of Metabolism: E.coli analysis"

In this talk, the data preparation for E.coli and B.subtilis genome is introduced, and its preliminary analyses using some graph algorithms are shown.
Tervetuloa!
June 18th, 2002
Dr Dasin vierailuluento klo 14
Dr. Sandip Das (Indian Statistical Institut, Kolkata, India, e-mail: sandipdas@isical.ac.in) pitää tiistaina 18.6. klo 14.15 Tietojenkäsittelytieteen laitoksen salissa A217 vierailuluennon:

Simplex range searching and k nearest neighbors of a line segment in 2D

Abstract:
We propose a technique for finding k nearest and farthest neighbours of a query line segment among a set of points distributed arbitrarily on a two dimensional plane. For solving the problem, we improve simplex range searching technique in 2D. Given a set of n points, we preprocess them to create a data structure using O(n^2) time and space, which reports the number of points inside a triangular query region in O(log n) time. The points inside the triangular region can be reported in O(log^2 n + \kappa) time, where \kappa is the size of the output. Finally, this technique is used to find k nearest neighbours of a query line segment in O(log^2 n + k) time.
Tervetuloa!
June 7, 2002
Vierailuesitelmä
Prof. Joost N. Kok (Leiden Institute of Advanced Computer Science, Leidenin yliopisto, Alankomaat) pitää perjantaina 7.6. klo 15.15 tietojenkäsittelytieteen laitoksen salissa A217 vierailuesitelmän aiheesta

"Natural Computing: from Computer Science to Molecular Informatics"

Abstract:
Natural Computing is a general term referring to computing going on in nature and computing inspired by nature. In this presentation we give three concrete examples of natural computing. In the first example we introduce evolutionary algorithms in order to show how concepts form nature can be introduced into computing. As an application we look at drug design. Second we look at molecular computing in which computions are carried out on DNA molecules ("liberate computer science from computers") and apply it to two algorithmic problems. Third we give an example of the computational perspective on phenomena in nature. We look at how sciliates (single cell organisms) compute by performing a kind of string processing on DNA-molecules.
The presentation will be in a tutorial style, i.e. no previous knowledge is assumed.
March 2, 2002
VÄITÖSTILAISUUS
Antti-Pekka Tuovinen puolustaa väitöskirjaansa "Object-Oriented Engineering of Visual Languages" väitöstilaisuudessa 2.3.2002 klo 10 salissa Porthania3. Vastaväittäjänä toimii Tibor Gyimothy, University of Szeged, Hungary.

Tiivistelmä
February 19, 2002
NordU/USENIX- konferenssin osallistujia vieraana
Tiistaina 19.2. ryhmä NordU/USENIX- konferenssin osallistujia tulee vierailemaan laitoksellamme Auditoriossa klo 18.30-19.30. Muutkin ovat tervetulleita tulemaan paikalle.
January 15, 2002
FROM DATA TO KNOWLEDGE (FDK) -HUIPPUYKSIKÖN AVAUSSEMINAARI
FROM DATA TO KNOWLEDGE (FDK) -HUIPPUYKSIKÖN AVAUSSEMINAARI järjestetään

Tiistaina 15.1.2002 klo 14.00-17

Helsingin Yliopiston tietojenkäsittelytieteen laitoksen (Teollisuuskatu 23, Helsinki) salissa A414 (4.krs).

Ohjelma

14.00 Kahvitarjoilu

n. 14.15 FDK-huippuyksikön esittely, akatemiaprof. Esko Ukkonen

n. 14.45 Data-analyysin haasteet, prof. Heikki Mannila

n. 15.00 Lukutaitoa täyttä ymmärrystä vailla, prof. Helena Ahonen-Myka

n. 15.15 Koneet oppiin, prof. Tapio Elomaa

n. 15.30 Mitä yhteistä on kännykkärosvoilla ja syöpätaudeilla?, TkT Jaakko Hollmén

n. 15.45 Tiedon louhinta geenikartoituksessa, prof. Hannu Toivonen

n. 16.00- Vapaamuotoista keskustelua

Kaikki ovat lämpimästi tervetulleita!

Esko Ukkonen
November 30, 2001
Väitötilaisuus, Kimmo Fredriksson
FM Kimmo Fredriksson puolustaa väitöskirjaansa 'Rotation Invariant Template Matching' väitöstilaisuudessa perjantaina 30. 11. 2001 klo 12 salissa 6, yliopiston prakennus (uusi puoli). Vastaväittäjänä toimii professori Olli Yli-Harja Tampereen teknillisestä korkeakoulusta.
  • Tiivistelmä väitöksestä
  • November 24, 2001
    Väitöstilaisuus, Marko Salmenkivi
    FM Marko Salmenkivi puolustaa väitöskirjaansa 'Computational Methods for Intensity Models' väitöstilaisuudessa lauantaina 24. 11. 2001 klo 10 salissa Porthania III. Vastaväittäjän toimii professori Martti Juhola Tampereen yliopistosta.
  • Tiivistelmä väitöksestä
  • November 2, 2001
    Semantic Web Kick-Off in Finland -seminaari, pe 2.11., klo 9.00-16.30, Porthania, sali P3:
    Semantic Web Kick-Off in Finland -seminaarin tavoitteena on luoda yleiskatsaus viime aikoina nopeasti kehittyneeseen Semantic Web -alueeseen, sen visioihin, teknologoihin ja standardeihin, kansainväliseen ja kotimaiseen tutkimukseen sekä käytännön sovelluksiin. Toivon mukaan tilaisuus voisi toimia lähtölaukauksena alan laajemmalle tutkimukselle ja soveltamiselle maassamme.

    Tilaisuus kaikille avoin ja maksuton. Tilaisuuteen voi osallistua sekä kuuntelijana että esitelmöitsijänä. Järjestelyjä ja kahvitusta varten pyytäisimme kaikkia osallistujia ilmoittautumaan sähköpostilla etukäteen Eero Hyvöselle viimeistään tiistaina 30.10.. Mikäli haluat ehdottaa tilaisuuden ohjelmaan omaa esitystä, kerro sen aihe ja anna lyhyt kuvaus esityksesi aiotusta sisällöstä perjantaihin 19.10. mennessä. Esitykset pidetään suomenkielellä.


    [Lisätietoja tapahtumasta]
    September 10, 2001
    Guest lecture on Monday, September 10, 2001 at 10-12 in Lecture Room A414:
    Roope Kaivola
    Intel

    "Proof Engineering in the Large: Formal Verification of Pentium 4 Floating-Point Divider"

    [See more information]
    August 27-31, 2001
    Guest lectures on August 27-31, 2001 at 10-12 in Lecture Room A217:
    Jorma Rissanen
    University of London, Technical University of Tampere

    Lectures on Statistical Modeling Theory

    June 4, 2001
    Guest lecture on Monday, June 4, 2001 at 14.15-15.00 in Lecture Room A414:
    Dr. Jens Lagergren
    KTH, Stockholm

    "Algorithms for Gene Trees and Species Trees"

    [See more information]
    May 4, 2001
    Guest lecture on Friday, May 4, 2001 at 9.00-16.00 in Lecture Room A414:
    Prof. Michael Wooldridge
    University of Liverpool

    The lecture discusses various topics in software agent technology. Detailed agenda will be informed later.

    [See more information]
    April 27, 2001
    Vierailuluento pe 27.4.2001 klo 12.15 salissa A414:
    FT Juha Nurmonen
    Helsingin yliopisto

    Laskennallisten ominaisuuksien deskriptiivisestä vaativuudesta.

    [Lisätietoja]
    March 23, 2001
    Vierailuluento pe 23.3. klo 10.15 - 12.00, A320:
    Prof. Roope Raisamo
    Tampereen yliopisto

    "Ohjelmistoagenttiteknologia käyttöliittymissä"

    March 9, 2001
    Guest lectures on March 9, 2001 at 10.15 in Room A 320:
    Dr. Michael Berger
    Siemens AG, Munich, Germany

    Mr. John Shepherdson
    British Telecommunications, Ipswich, UK

    Software Agent Technology at British Telecommunications and Siemens AG

    [See more information]
    March 8, 2001
    Symposium on Thursday, March 8, 2001 at 14.15-18.00 in Lecture Room A516:
    "ODP Symposium" - Open distributed processing talks
    [See more information]
    February 28, 2001
    Guest lecture on February 28, 2001 at 10.15 in Room A 319:
    Dmitri G. Korzoun
    University of Petrozavodsk

    Decomposition of Heterogenous Data Traffic onto Logical Sources: A Discrete Linear Approach

    [See more information]
    January 27, 2001
    Thesis defence on Saturday, January 27, 2001 at 10 (Auditorium III, Porthania):
    M.Sc. Juho Rousu
    University of Helsinki, Department of Computer Science

    "Efficient Range Partitioning in Classification Learning"

    [See more information]

    tiedottaja@cs.helsinki.fi