Research Activities

The Department has three subprogrammes and five specialisation areas, which are responsible for planning the curricula and for administering teaching in their specialities. The division is not strict, and several research projects span two sections. The subprogrammes cover roughly the following subject areas:

Computer Science

Applied Computer Science

Teacher Education in Computer Science

In the following, the research activities of each section of the Department are reviewed.

Algorithms

The main research areas are algorithms and data structures, machine learning, and computational biology.

Algorithms and data structures is the area with the longest tradition. The work on string matching algorithms (Ukkonen, Kärkkäinen) has been particularly successful. Theoretical work has often been conducted within the framework of systems research, which provides practical motivation for the problems studied. Currently, special emphasis is placed on algorithmic problems in computational biology and bioinformatics.

Work on machine learning (Elomaa, Kivinen, Ukkonen) is closely related to the research of the Intelligent Systems section. On the algorithms side, the emphasis is on discrete methods and provable performance guarantees.

Projects

Algorithms on strings

The research on string matching algorithms started at the Department in the early 1980s. The initial impulse came from computer applications in molecular genetics. The group is now one of the leaders in its special field. It has obtained several basic results, for example on edit distance, approximate string matching, DNA sequence assembly, suffix trees, and two-dimensional string matching, many of which have been included in recent international textbooks.
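The dynamic-programming formulation behind the classic edit distance results can be sketched in a few lines; the following is a minimal illustration of the standard Wagner-Fischer computation, not the group's own optimised algorithms:

```python
def edit_distance(a, b):
    # Classic dynamic programming over two rows:
    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution/match
        prev = cur
    return prev[-1]
```

For instance, `edit_distance("ACGT", "AGT")` is 1: a single deletion turns one string into the other. Approximate string matching builds on the same recurrence.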

The current research topics include text indexing for fast retrieval by content, as well as the generalisation of one-dimensional string matching techniques to two- and three-dimensional cases. The group has developed new index structures for full-text indexing that are smaller than suffix arrays. Several fast filtering methods allowing translations and rotations have also been developed for pattern matching in high-dimensional strings. The novel feature of these methods is that they exploit the pixel or voxel structure in a very precise way.
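As a point of reference, the plain (uncompressed) suffix array that the group's smaller index structures improve on can be sketched as follows; the construction below is deliberately naive, for illustration only:

```python
def suffix_array(text):
    # Sort suffix start positions lexicographically.
    # O(n^2 log n) construction for illustration; practical indexes
    # are built in linear time and use far less space.
    return sorted(range(len(text)), key=lambda i: text[i:])

def find(text, sa, pattern):
    # Binary search for the leftmost suffix that starts with the pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sa) and text[sa[lo]:].startswith(pattern):
        return sa[lo]
    return -1
```

For `text = "banana"` the sorted suffix order is `[5, 3, 1, 0, 4, 2]`, and `find(text, sa, "ana")` locates an occurrence at position 3.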

The applications and motivations of this algorithmic research come from computational biology as well as from multimedia information retrieval. For example, specialised algorithms, and a prototype implementation, for search by content in music databases have been developed.

The members of the group are Prof Esko Ukkonen (group leader), Jorma Tarhio, docent, Kimmo Fredriksson, Juha Kärkkäinen, PhD, Kjell Lemström, and Sami Perttu. The group was supported in 1999 by the Academy of Finland and the Nokia Company.



K. Fredriksson and E. Ukkonen: Combinatorial methods for approximate image matching under translations and rotations. Pattern Recognition Letters 20, 11/13 (1999), 1249-1258.



J. Kärkkäinen: Repetition-Based Text Indexes. PhD Thesis, Report A-1999-4, Department of Computer Science, University of Helsinki, 1999.



K. Lemström and J. Tarhio: Searching monophonic patterns within polyphonic sources. In Proc. Content-Based Multimedia Information Access, RIAO '2000, vol. 2, 2000, 1261-1279.

Computational Biology

There are several subprojects in this area. The general goal is to develop efficient algorithmic solutions to different computational problems arising in molecular biology. The special emphasis is on applications of pattern matching, machine learning, and data mining methods. The work is done in close co-operation with application partners. The group also hosts a national graduate school in this area, the Graduate School on Computational Biology, Bioinformatics, and Biometry (ComBi).

With Kirsi Tappura, PhD (State Technical Research Centre of Finland), the group studies the modelling of loops in protein structures. With Prof Dennis Bamford (Institute of Biotechnology, University of Helsinki), the group develops computer tomography methods for constructing three-dimensional structures of macromolecular complexes, such as viruses, from electron microscopy data. As a first step, an efficient software package has been developed for isolating virus particles from electron micrographs. Another software system, for handling and analysing such reconstructed virus models, has also been completed. Joint work with Alvis Brazma, PhD (European Bioinformatics Institute, Hinxton, UK), on the in silico analysis of whole genomes continues, with the aim of correlating the putative regulatory patterns in the DNA sequence with the expression patterns of the genes. A new project on the integrated analysis of DNA expression, proteomics, and metabolic data is about to start (with Prof Hans Söderlund and Aristos Aristidou, docent; State Technical Research Centre). Moreover, in a separate project the group studies hydrological modelling in co-operation with the Finnish Environmental Centre.

The members of the group are Prof Esko Ukkonen (group leader), Markus Huttunen, Teemu Kivioja, Veli Mäkinen, Taneli Mielikäinen, Janne Ravantti, Satu Sihvo, Anatoly Verkhovsky, and Jaak Vilo (EBI, Hinxton). The group was supported in 1999 by the Academy of Finland.



A. Brazma, I. Jonassen, J. Vilo and E. Ukkonen: Predicting gene regulatory elements in silico on a genomic scale. Genome Research 8, 11 (1998), 1202-1215.



T. Kivioja, J. Ravantti, A. Verkhovsky, E. Ukkonen and D. Bamford: Local average intensity-based method for identifying spherical particles in electron micrographs. Journal of Structural Biology (in press).



J. Ravantti and D. Bamford: A data mining approach for analyzing density maps representing macromolecular structures. Journal of Structural Biology 125 (1999), 216-222.

Logic-Based Query Languages for Molecular Biology Databases

New application domains for database management systems demand that systems adapt to domain-specific data needs and not vice versa. In particular, databases of molecular biology data such as DNA strands encoded as strings must offer flexible tools for their manipulation. Thus we have extended the well-known relational database model to encompass strings as an independent data type, which permits user-defined string manipulation predicates within the query language. The extension draws its motivation from the concept of multiple sequence alignment employed in molecular biology, while its mathematical and computational aspects stem from temporal logic and automata theory, respectively.
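The flavour of such an extension can be suggested with a toy relational selection whose condition is a user-defined string predicate; the relation, predicate, and data below are hypothetical, and the real query language is declarative rather than embedded in Python:

```python
def select(relation, predicate):
    # Relational selection sigma_p(R): keep the tuples satisfying the predicate.
    return [row for row in relation if predicate(row)]

# Hypothetical relation of named DNA strands (strings as a first-class type).
strands = [("s1", "ACGTACGT"), ("s2", "TTGGCCAA"), ("s3", "GGACGTGG")]

# A user-defined string predicate inside the query condition:
# "the sequence contains the motif ACGT".
hits = select(strands, lambda row: "ACGT" in row[1])
```

In the actual extended relational model the predicate would be expressed inside the logic-based query language itself, with its semantics grounded in automata theory.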

The members of the group are Prof Esko Ukkonen, Matti Nykänen, PhD, Raul Hakli and Hellis Tamm. The project is supported by the Academy of Finland.



G. Grahne, M. Nykänen, E. Ukkonen: Reasoning about strings in databases. Journal of Computer and System Sciences 59, 1 (1999), 116-162.



R. Hakli, M. Nykänen, H. Tamm, E. Ukkonen: Implementing a declarative string query language with string restructuring. Proc. 1st International Workshop on Practical Aspects of Declarative Languages, PADL '99, Springer-Verlag 1999, 179-195.

Algorithmic Machine Learning

Machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. The research of the group aims at a mathematically rigorous analysis and development of learning programs, and at an understanding of the underlying theory of the learning process. The group approaches these topics from both the empirical and the theoretical viewpoint, by analysing successful existing learning programs and by further developing the computational theory of learning. The group has studied extensively the properties of the evaluation functions used to partition numerical attribute domains in decision tree learning algorithms. In joint work with the University of California, Santa Cruz, the group has carried out theoretical analyses of various simple on-line learning algorithms.
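One family of such on-line algorithms, analysed in the Kivinen-Warmuth line of work, is the exponentiated gradient (EG) update. The sketch below shows a single EG step for on-line linear regression under the assumption that the weights are kept on the probability simplex; the learning rate and the squared-loss gradient are illustrative choices, not the exact published parameterisation:

```python
import math

def eg_update(weights, x, y, lr=0.1):
    # One exponentiated gradient step: multiply each weight by
    # exp(-lr * gradient of the squared loss) and renormalise,
    # so the weights remain positive and sum to one.
    y_hat = sum(w * xi for w, xi in zip(weights, x))
    new = [w * math.exp(-2 * lr * (y_hat - y) * xi)
           for w, xi in zip(weights, x)]
    z = sum(new)
    return [w / z for w in new]
```

The multiplicative form of the update is what yields the "relative loss bounds" studied by the group: the algorithm's loss is bounded in terms of the loss of the best fixed weight vector in hindsight.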

The main researchers in the group are Tapio Elomaa, docent, and Jyrki Kivinen, docent. The group is a member of the Esprit Network of Excellence in Machine Learning and of the Esprit Working Group on Neural and Computational Learning Theory.



T. Elomaa and J. Rousu. General and efficient multisplitting of numerical attributes. Machine Learning 36, 3 (1999) 201-244.



D. P. Helmbold, J. Kivinen, and M. K. Warmuth. Relative loss bounds for single neurons. IEEE Transactions on Neural Networks 10, 6 (1999) 1291-1304.

Intelligent Systems

Work in this research area focuses on the design and analysis of computational methods for adaptive and intelligent systems. Such methods include, among many others, probabilistic modelling, Bayesian networks, information-theoretic approaches to modelling (e.g., MDL and MML), (Bayesian) neural networks, case-based reasoning, evolutionary computation (e.g., genetic algorithms), search methods for high-dimensional spaces (e.g., simulated annealing), intelligent interfaces, and information retrieval. The work is both theoretical and empirical in nature, aiming at practical algorithms for solving the complex and large-scale Deep Computing type of modelling problems with computer systems that combine ultrafast processing and sophisticated analysis methods.

Most of the research in this area is associated with the Complex Systems Computation research group (CoSCo, http://www.cs.helsinki.FI/research/cosco), led by Professor Henry Tirri and docent Petri Myllymäki. Basic research by the group is supported by the Academy of Finland, the European Community, the University of Helsinki, and various foundations. More applied work has been performed with support from the National Technology Agency TEKES and from domestic and foreign industrial partners, which include Alma Media, Nokia, TietoEnator, StoraEnso, KONE Corporation, ABB, WapIT, Space Systems Finland, ESA, NASA, and AT&T. Some of the resulting software has been adopted by industry. The current members of the group are Jussi Lahtinen, Petri Kontkanen, Petri Nokelainen, Tomi Silander, Teemu Tonteri, Antti Tuominen, Pekka Uronen, Kimmo Valtonen, and Hannes Wettig.



P. Myllymäki and H. Tirri: Bayes-verkkojen mahdollisuudet (Prospects of Bayesian Networks). Technology Report 58/98. National Technology Agency (Tekes), 1998.



P. Kontkanen, P. Myllymäki, T. Silander, H. Tirri, and P. Grünwald: On Predictive Distributions and Bayesian Networks. Statistics and Computing 10 (2000), 39-54.



P. Kontkanen, J. Lahtinen, P. Myllymäki, and H. Tirri: Unsupervised Bayesian Visualization of High-Dimensional Data. In Proc. 6th International Conference on Knowledge Discovery and Data Mining, KDD' 2000 (eds. R. Ramakrishnan, S. Stolfo, R. Bayardo and I. Parsa.), ACM, 2000.



P. Kontkanen, P. Myllymäki, T. Silander, and H. Tirri: On Supervised Selection of Bayesian Networks. Proc. 15th International Conference on Uncertainty in Artificial Intelligence, UAI '99 (eds. K. Laskey, H. Prade), Morgan Kaufmann Publishers, 1999, 334-342.



P. Ruohotie, H. Tirri, P. Nokelainen and T. Silander: Modern Modeling of Professional Growth. Research Center for Vocational Education, Saarijärven Offset, 1999.

Projects

Computationally Efficient Methods for Deep Computing (DeepC)

Deep Computing is a term for methods solving complex and large-scale modelling and analysis problems with emerging computer systems that combine ultrafast processing with sophisticated analytical software. Deep Computing can be seen to consist of three intertwined research areas:

1. Deep modelling: prediction and data mining with very large data sets.

2. Deep optimisation: computationally efficient optimisation of complex multivariate cost functions.

3. Deep view: interfaces for understanding high-dimensional data.

The methodological research objective of the project is to develop the theory and methods required for obtaining very large-scale computational, data, and communications capabilities that can be used to solve grand-challenge-level Deep Computing problems in business and science. The research focuses on stochastic approaches, is methodological and theoretical in nature, and aims at topics that can have a great impact on this area in the future.

The applied research objective of the project is to demonstrate solutions to previously intractable business and scientific problems by exploiting the advances in Deep Computing research in areas such as data modelling and analysis, high-end computing, search and optimisation algorithms, and high-dimensional visualisation. Such demonstrations will often be results of joint multi-disciplinary work together with scientists (scientific problems) as well as industrial partners (business problems).

The volume of the project is FIM 1.8 million for a three-year period starting in 2000. Funding is provided by the Academy of Finland.

Personalized Adaptive Interfaces (PAI)

The main objective of the PAI project is to develop methods for applying probabilistic modelling techniques, such as Bayesian network models, in building and using personalised, adaptive user interfaces. Specific research problems include user data segmentation, user profiling and user identification, and location-aware computing. The associated pilot projects focus on problems related to intelligent educational technologies, adaptive WWW services, and adaptive mobile services.

In user data segmentation, the goal of the project is to develop computationally efficient methods for partitioning the available data into meaningful clusters by using adaptive Bayesian network modelling techniques. In user profiling, these clusters are used for producing a semantic interpretation of the domain as a set of probabilistic user profiles. The profiles can be studied with various data mining and visualisation techniques, and the results of such an analysis can be used off-line in designing personalised interfaces for WWW services or educational media. In user identification, the produced probabilistic model is used for on-line identification of the user profile from partial and uncertain observations; such methods can be used for developing adaptive interfaces. In the area of location-aware computing, the project studies how adaptive probabilistic modelling techniques can be used for estimating the location of a mobile user from the measurements provided by the mobile terminal.
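The on-line identification step can be illustrated with a naive-Bayes-style posterior computation over a set of profiles; the data structures and the smoothing constant below are hypothetical, chosen only to show the idea of Bayesian inference from partial observations:

```python
def profile_posterior(priors, cond_probs, observations):
    # Bayesian identification of a user profile from partial observations.
    # priors:      {profile: P(profile)}
    # cond_probs:  {profile: {feature: {value: P(value | profile)}}}
    # observations:{feature: value} -- may cover only some features.
    scores = {}
    for profile, prior in priors.items():
        p = prior
        for feature, value in observations.items():
            # 1e-9 is an arbitrary smoothing floor for unseen values.
            p *= cond_probs[profile][feature].get(value, 1e-9)
        scores[profile] = p
    z = sum(scores.values())
    return {profile: s / z for profile, s in scores.items()}
```

Features not yet observed simply contribute no factor, which is what makes the scheme usable with partial and incrementally arriving evidence.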

The volume of the project is FIM 1.3 million for a one-year period starting in 2000. Funding is provided by the National Technology Agency and industrial partners.

Applications of Probabilistic modeling and Search Methods (PROMISE)

The PROMISE project focused on two research areas: probabilistic modelling and stochastic optimisation. In probabilistic modelling, the main goal of the project was to develop computationally efficient methods for building and applying probabilistic models, such as Bayesian networks and finite mixture models. In stochastic optimisation, the goal was to empirically study and compare different stochastic search methods, such as simulated annealing and genetic algorithms, in complex, highly constrained problem domains.

In probabilistic modelling, the research concentrated on theoretical and practical issues concerning model selection with respect to the predictive performance of the chosen models. The methods developed in the project were validated using proprietary real-world problems provided by the industrial partners, as well as publicly available benchmark problems from the Internet. In the empirical tests performed, the group was able to show that even relatively simple Bayesian models in many cases yield better results than alternative techniques, if implemented in a theoretically correct manner. Moreover, the group was able to show, theoretically and empirically, that there are several "urban legends" concerning the Bayesian methodology for model selection, and that well-known procedures commonly used in machine learning are in many cases based on misunderstandings or theoretically invalid arguments that lead to sub-optimal behaviour of the models. For some of these cases, the group was able to develop alternative model selection procedures that gave good results in the empirical tests performed.

In stochastic optimisation, the group concentrated on empirical comparisons between different stochastic optimisation algorithms, such as simulated annealing and genetic algorithms. The empirical results demonstrated that, although it is possible to obtain consistently good results with genetic algorithms, similar performance could in many cases be achieved with much simpler and more efficient methods, such as various stochastic greedy algorithms. The group also developed a novel version of the celebrated simulated annealing algorithm, in which the difficult problem of parameter selection is solved by adjusting the so-called cooling schedule automatically during the optimisation process.
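For reference, a conventional simulated annealing loop with a fixed geometric cooling schedule is sketched below; the group's novel contribution, the automatic adjustment of the schedule, is precisely what this plain version lacks (the decay rate `alpha` must be hand-tuned):

```python
import math, random

def simulated_annealing(cost, neighbor, x0, t0=1.0, alpha=0.95,
                        steps=1000, seed=0):
    # Plain simulated annealing: accept worsening moves with
    # probability exp(-delta / t), cool geometrically by alpha.
    rng = random.Random(seed)
    x, t = x0, t0
    best, best_cost = x0, cost(x0)
    for _ in range(steps):
        y = neighbor(x, rng)
        delta = cost(y) - cost(x)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x = y
            if cost(x) < best_cost:
                best, best_cost = x, cost(x)
        t = max(t * alpha, 1e-12)   # floor keeps the division well-defined
    return best, best_cost
```

A usage example: minimising `x**2` with a uniform-step neighbour from `x0=10.0` drives the best cost close to zero. An adaptive variant would monitor, for example, the acceptance rate and adjust `t` accordingly instead of decaying it blindly.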

For the empirical part of the work, the group developed software that allows the researchers to use several dozen Linux workstations as a single "virtual supercomputer", which has made it possible to study interesting exponential-time problems. Some of the Bayesian modelling methods developed in the project were implemented in BAYDA, a Java software package for flexible data analysis in classification domains. BAYDA is available free of charge for research and teaching purposes from the group's homepage. The scientific results are reported in the more than 20 international scientific publications produced during the project; copies of the articles can be downloaded from the group's home site.

The results of the project are commercially exploited in several fielded applications. StoraEnso is already making wide use of the intelligent container packing software Coptimi, implemented by TietoEnator and based on the optimisation algorithms developed in the project. Some of the optimisation methods developed by the group have also been integrated into fielded telecommunications software packages developed and used by Nokia. A commercial product development project, aiming at a data analysis software suite that exploits the probabilistic modelling methods developed in the project, is also currently under way at one of the industrial partners.

The volume of the project is FIM 2.65 million for a three-year period starting in 1998. Funding is provided by the National Technology Agency and industrial partners.

Computational Intelligence Techniques for Nonlinear modelling in Social Sciences (NONE)

The general objective of this research is to develop theoretically sound computational intelligence techniques for nonlinear modelling of data, and methodologies for applying them in the domain of educational data. The research has both a strong basic research component, i.e., the development of theoretically sound probabilistic nonlinear methods, and an applied methodological component where the nonlinear techniques are applied to modelling and analysis in the educational domain.

In traditional quantitative approaches in (vocational) education, modelling is implemented either by exploratory multivariate analysis (typically factor analysis) or, if a model structure can be derived from the theory, by a confirmatory analysis using for example LISREL models. Missing data is handled by omitting the corresponding data vectors or by averaging. Prediction is typically performed with linear regression models or, in the case of comparative analysis, by discriminant analysis. Theoretical analysis of the decision-making phase is typically ignored.

In this research, nonlinear alternatives to the above linear methods are studied: probabilistic models for exploratory analysis (with a Bayesian approach for estimating missing data), Bayesian networks for confirmatory analysis, and probabilistic predictive models for regression and discriminant analysis. In addition, the little-known relationship between the theoretical foundations of computational intelligence techniques and Bayesian model selection in the social sciences is investigated.

The volume of the project is FIM 0.6 million for a three-year period starting in 1996. Funding was provided by the Academy of Finland.

Software Engineering

Software is the key element of modern systems in the contemporary information society. The flexibility and services offered by our technical facilities are mostly implemented as software. Hence software quality largely determines the quality of the services and systems in use.

Software quality is the general research topic of the software engineering group at the Department. The emphasis is on the technical aspects of software quality (rather than on process-centric aspects), and more precisely on the quality of the software design. The design phase of a software system, documented by such technical artifacts as software architectures, design patterns, and reference frameworks, is the most natural quality assurance phase in the system life cycle, since it acts as a bridge between the informal and imprecise user requirements and the formal and precise implementation (code) of those requirements. When caught at the software design phase, potential quality problems can be solved before they are introduced into the actual system.

The research carried out by the group has close contacts with the Finnish software industry. A visible example of the close cooperation is that the leader of the group works half-time in the software technology laboratory of the Nokia Research Center.

Projects

Techniques for designing and using OO frameworks (FRED)

Design quality is studied by the research group in several forms. Proactive quality assurance methods are the focus in the FRED project which develops a CASE tool for application engineering based on design patterns and application frameworks. The tool helps a software (framework) designer make correct and mature decisions by assisting her in the design process and by checking that the quality restrictions embedded in the applied design patterns are not violated.

FRED is a joint research project with the Tampere University of Technology and the University of Tampere. The researchers involved are Jukka Paakki, Antti Viljamaa, and Jukka Viljamaa. A prototype of the FRED tool has been released. The volume of the project is 6 manpower years (1997-99), and FIM 1.5 million, funded by the National Technology Agency and companies.



M. Hakala, J. Hautamäki, J. Tuomi, A. Viljamaa, J. Viljamaa, K. Koskimies, and J. Paakki: Managing Object-Oriented Frameworks with Specialization Templates. Proc. International Workshop on Object Technology for Product-Line Architectures, ECOOP '99 , European Software Institute, 1999, 87-97.



M. Hakala, J. Hautamäki, J. Tuomi, A. Viljamaa, J. Viljamaa, K. Koskimies, and J. Paakki: Task-Driven Framework Specialization. Proc. Fenno-Ugric Symposium on Software Technology, FUSST '99 (ed. J. Penjam), Technical Report 104/99, Institute of Cybernetics, Tallinn Technical University, 1999, 65-74.

Metrics for Analysis and Improvement of Software Architectures (MAISA)

The MAISA project develops both reactive quality assurance methods for software design and associated proactive methods for software implementation. The central idea is to measure the quality of the software architecture, and use the results for predicting quality aspects, such as size, complexity and performance, of the actual system based on the architecture.
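As an illustration of architecture-level measurement of the kind described, the hypothetical sketch below counts fan-in and fan-out coupling from a design-level class dependency graph; the metric choice and the representation are ours, not necessarily those of the MAISA tool:

```python
def coupling_metrics(dependencies):
    # Hypothetical architecture-level metric: fan-in and fan-out per class,
    # counted from a design-level dependency graph {class: {used classes}}.
    fan_out = {c: len(uses - {c}) for c, uses in dependencies.items()}
    fan_in = {c: 0 for c in dependencies}
    for c, uses in dependencies.items():
        for u in uses:
            if u != c and u in fan_in:
                fan_in[u] += 1
    return fan_in, fan_out
```

A class with high fan-in is a hotspot whose changes ripple widely; such counts, taken from the design diagrams alone, are the kind of early signal that can feed size, complexity, and performance predictions before any code exists.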

The researchers involved are Juha Gustafsson, Lilli Nenonen, Jukka Paakki, and Inkeri Verkamo. A prototype of the MAISA metrics tool has been released. The volume of the project is 3 manpower years (1999-2000) and FIM 1 million, funded by the National Technology Agency and companies.



J. Paakki, A. Karhinen, J. Gustafsson, L. Nenonen, and A. I. Verkamo: Software Metrics by Architectural Pattern Mining. To appear in Proc. IFIP World Computer Congress, WCC 2000, Beijing, China, 2000.



L. Nenonen, J. Gustafsson, J. Paakki, and A. I. Verkamo: Measuring Object-Oriented Software Architectures from UML Diagrams. To appear in Proc. 4th International Workshop on Quantitative Approaches in Object-Oriented Software Engineering, QAOOSE '2000 (in association with ECOOP '2000).



A. I. Verkamo, J. Gustafsson, L. Nenonen, and J. Paakki: Design Patterns in Performance Prediction. Proc. 2nd International Workshop on Software and Performance, WOSP 2000, ACM (SIGMETRICS and SIGSOFT), ACM Press, 2000, 143-144.

Object-oriented software architectures (SAARA)

The SAARA project studies after-the-fact quality assurance by reverse-engineering software code into higher-level representations. For instance, when recovering the software architecture from its code, one can check whether the design decisions have been followed in the implementation phase and verify that the architecture has not decayed.

The researcher involved is Jukka Paakki. The volume of the project is 3 manpower years (1999-2001), and FIM 0.6 million, funded by the Academy of Finland.

Distributed Systems and Data Communication

The specialisation area of software systems was recently divided into two, separating software engineering from distributed systems and data communication. However, the distributed systems and data communication group is still rather large and covers four areas of research.

The goals of the group are twofold: on the one hand, theory-based tools are developed for analysing and modelling systems; on the other hand, more powerful services are developed for the application platform, based on operating systems, data communication, distribution algorithms, and effective information management solutions.

The projects in this group combine modelling and constructive approaches, and have good relationships with the industrial sector. Companies such as Nokia and Sonera are frequent participants in the projects of this group. The National Technology Agency and the European Commission are the major sources of funding for this research.

The group has good international relationships through EC projects, active involvement in standardisation work within ISO, IETF and OMG, and exchange of researchers. The group also organised the Second International Working Conference on Distributed Applications and Interoperable Systems, DAIS '99, in 1999.

Projects

Modelling of Concurrency (MOCO)

Recently, research on the theoretical aspects of concurrency has concentrated on three areas. First, the theory of partially defined specifications and their refinement relations has been transferred from bisimulation semantics to decorated trace semantics. Secondly, the modelling and verification of timed systems has been studied. Thirdly, liveness verification with software tools has been under investigation. Besides these theoretical studies, experimental work on specification-related algorithms has been going on; one example is the determinisation of automata.

The research group consists of Prof Emer Martti Tienari, Roope Kaivola, docent, and postgraduate students Timo Karvi, Päivi Kuuppelomäki and Matti Luukkainen.

Mobile Office Workstations using GSM Links (MOWGLI)

The goal of the MOWGLI project is to study, design, and test a data communication architecture for a wireless WAN, such as the pan-European GSM-based mobile data service, and to develop a prototype based on that architecture. The application environment consists of mobile PCs that can be connected over a wireless WAN connection, for example through a mobile phone, to any part of a fixed data communication network. The work in the project concentrates on the architectural aspects that support the mobility of the client, allow client applications to operate in a disconnected or weakly connected mode, and hide the problems of the wireless connection. The work has recently been enlarged to implement performance measurement tools for wireless networking and to contribute to international network standardisation.

The volume of the project in 1999 was 20 manpower months and FIM 1.13 million, funded by the National Technology Agency and companies. The researchers involved are Prof Timo Alanko, Prof Kimmo Raatikainen, and Markku Kojo.



T. Alanko, M. Kojo, M. Liljeberg, and K. Raatikainen: Mobile Access to the Internet: A Mediator-based Solution. Internet Research 9, 1 (1999), 58-65.

Adaptation Agents for Nomadic Users (MONADS)

http://www.cs.helsinki.fi/research/monads/

The MONADS project studies the use of adaptive agents for serving mobile users in a wireless environment. The application functionality is improved by learning-agent technology and by mobility-adaptive protocols between the agents. The mobility-aware protocols are adopted from the Mowgli data communication architecture.

The MONADS project has developed an agent-based software architecture and a prototype implementation. The prototype uses learning agents for predicting the quality of wireless links and adapts the behaviour of a WWW browser according to the prediction. The WWW agent controlling the adaptation can choose between various compression methods, or even decide against transmitting large or unimportant pictures. In addition, the project has created software for a more efficient Java RMI in order to optimise agent communication.

The volume of the project in 1999 was 32 manpower months and the researchers involved are Prof Kimmo Raatikainen, Oskari Koskimies, Stefano Campadello, Heikki Helin, and Pauli Misikangas.



S. Campadello, H. Helin, O. Koskimies, P. Misikangas, M. Mäkelä and K. Raatikainen: Using Mobile and Intelligent Agents to Support Nomadic Users. Proc. 6th International Conference on Intelligence in Networks, ICIN '2000, 2000.



S. Campadello, H. Helin, O. Koskimies and K. Raatikainen: Performance Enhancing Proxies for Java2 RMI over Slow Wireless Links. Proc. 2nd International Conference and Exhibition on the Practical Application of Java, PAJAVA '2000, 2000.



K. Raatikainen, L. Hippeläinen, H. Laamanen and M. Turunen: Monads - Adaptation Agents for Nomadic Users. Proc. World Telecom '99, 1999.

MObile INTelligent AGEnts in Accounting, Charging and Personal Mobility Support (MONTAGE)

The MONTAGE project studies, evaluates, and assesses the impact of agent technology on the telecommunications world. Agent technology is used to support efficient (in terms of both cost and performance) service provision to fixed and mobile users in competitive telecommunications environments.

This EC funded project contributes a TINA-compliant architecture and implementation of support services for personal mobility, accounting and charging services. The project has a strong design and implementation flavour, combined with theoretical guidance and feedback from all involved stakeholders by means of trials. The volume of the project is 22 manpower months and 0.12 million euro, and the researchers involved are Prof Kimmo Raatikainen and Stefano Campadello.

Performance and usability of the CORBA architecture in telecommunications technology (CORBA-FORTE)

The CORBA-FORTE project produced a set of tools and techniques for improving the performance and usability of CORBA-based distributed systems. The results of the project were validated against a realistic information system in a case study.

Of particular importance for the project were performance analysis and modelling, software performance engineering (SPE), active participation in OMG work, and the implementation of a performance engineering framework. The volume of the project in 1999 was 2 manpower years and FIM 0.6 million. The main researchers involved were Prof Emer Martti Tienari, Prof Kimmo Raatikainen, and Pekka Kähkipuro, PhLic.



P. Kähkipuro: The Method of Decomposition for Analyzing Queuing Networks with Simultaneous Resource Possessions. Proc. Communications Networks and Distributed Systems modelling and Simulation Conference, CNDS '99, (eds. R. Simon, T. Znati), The Society for Computer Simulations International, 1999, 165-172.



P. Kähkipuro: UML Based Performance Modelling Framework for Object-Oriented Distributed Systems. Proc. «UML»'99 -- The Unified Modelling Language, Beyond the Standard (eds. R. France, B. Rumpe), Lecture Notes in Computer Science No. 1723, Springer-Verlag, 1999, 356-371.

Open Distributed Computing Environments (ODCE)

The ODCE group studies interoperability and federation of autonomous components in global networked environments. The group actively participates in standardisation work on ODP (Open Distributed Processing) in ISO, with liaisons to OMG. It also organised the Second International IFIP Working Conference on Distributed Applications and Interoperable Systems, DAIS '99, in June. The main researcher involved is Prof Lea Kutvonen.



L. Kutvonen: Sovereign systems and dynamic federations. Proc. 2nd International IFIP Working Conference on Distributed Applications and Interoperable Systems, DAIS '99, Kluwer Academic Publishers, 1999, 77-90.



L. Kutvonen, H. Koenig, M. Tienari (eds.): Proceedings of the Second International IFIP Working Conference on Distributed Applications and Interoperable Systems, DAIS '99, Kluwer Academic Publishers, June 1999.

Real-Time Object-Based Database Architecture for Intelligent Networks (RODAIN)

The RODAIN project series develops a real-time database based on object database standards. The database requirements are derived from telecommunication networks and virtual private networks. The group has produced a prototype of a database characterised by the terms real-time, object-oriented, main-memory based, and distributed. The volume of the project in 1999 was 36 manpower months and FIM 0.58 million. The researchers involved were Prof Kimmo Raatikainen, Tiina Niklander, Pasi Porkka, Jan Lindström, and Juha Taina.



J. Lindström, T. Niklander, P. Porkka, K. Raatikainen: A Distributed Real-Time Main-Memory Database for Telecommunication. Proc. Workshop on Databases in Telecommunication, co-located with VLDB, 1999, 11.



J. Lindström, K. Raatikainen: Dynamic Adjustment of Serialization Order using Timestamp Intervals in Real-Time Databases. Proc. 6th International Conference on Real-Time Computing Systems and Applications, RTCSA '99, 1999, 13-20.

High Performance Gigabit I2O Networking Software (HPGIN)

I2O (Intelligent I/O) is a hardware and software vendor specification for data transfer using a dedicated I/O processor (IOP). The HPGIN project implements I2O software for fast LAN equipment. This joint EC-funded project with SysKonnect (Germany) and XPlab (Italy) produces a new Gigabit Ethernet card and supporting software based on I2O, including I2O message passing, resource control, and an I2O LAN driver for the Linux and X-Polypus environments.

The Linux drivers and services are developed at the University of Helsinki and are included in standard Linux kernel distributions. The kernel is available at http://www.kernel.org/. An HTTP-based configuration utility for managing software and parameters is available at http://www.cs.helsinki.fi/group/hpgin/.

The volume of the project is 30 manpower months and 0.17 million euro, and the researchers involved are Prof Kimmo Raatikainen, Prof Emer Martti Tienari, Auvo Häkkinen, and Juha Sievänen.



A. Häkkinen, J. Sievänen: Specification of the Software Package D (HPGIN-Linux). Report UHEL.03.99.01-DR-D1, Department of Computer Science, University of Helsinki, March 1999.



A. Häkkinen, J. Sievänen: HPGIN-Linux Test Specification Plan. Report UHEL.10.99.01-DR-E1, Department of Computer Science, University of Helsinki, October 1999.

Promoting Interoperability for Multimedia services in Europe (PRIME)

The PRIME project supports the development and implementation of interoperable multimedia services in Europe. Specifically, the project has been looking at the requirements for achieving interoperability for multimedia services over alternative delivery platforms.

The key objective and contribution of PRIME to the ACTS programme is to improve the cross-fertilisation between ACTS projects involved in interactive multimedia communications on the one hand, and interoperability initiatives in standardisation bodies and fora in the same area on the other. Among other things, this will result in increased awareness of interoperability issues and opportunities in ACTS.

The volume of the project is 40,000 euro and the main researcher involved is Prof Kimmo Raatikainen.

Optimizing TCP for Wireless Links (IWTCP)

The standardisation body for the Internet protocols, the Internet Engineering Task Force (IETF), is specifying various performance enhancements to TCP and is documenting the impact of problematic link-layer characteristics on the Internet protocols. In addition, it is documenting mitigations for these performance implications and approaches to improving the performance that the Internet protocols, particularly TCP, attain over such links.

The objective of the IWTCP project is to measure the TCP performance implications of those link characteristics that are typical for wireless links. In addition, some experimental TCP performance enhancements for particular link characteristics are implemented and the impact of the enhancements is analysed. The volume of the project is 8.4 manpower months and FIM 0.18 million, and the researchers involved are Prof Kimmo Raatikainen and Markku Kojo.
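
The kind of bulk-transfer throughput measurement that underlies such experiments can be sketched in a few lines. The following is a minimal, self-contained illustration over a loopback TCP connection; the project itself measured real and emulated wireless links, and all names here are illustrative rather than taken from the project's tools.

```python
import socket
import threading
import time

def sink(conn, nbytes):
    """Receive and discard nbytes from an accepted connection."""
    received = 0
    while received < nbytes:
        chunk = conn.recv(65536)
        if not chunk:
            break
        received += len(chunk)
    conn.close()

def measure_throughput(nbytes=4 * 1024 * 1024):
    """Send nbytes over a loopback TCP connection and return MB/s."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))           # let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]
    cli = socket.create_connection(("127.0.0.1", port))
    conn, _ = srv.accept()
    t = threading.Thread(target=sink, args=(conn, nbytes))
    t.start()
    start = time.time()
    cli.sendall(b"x" * nbytes)           # bulk transfer under measurement
    cli.close()
    t.join()
    srv.close()
    elapsed = time.time() - start
    return nbytes / elapsed / 1e6        # MB/s

if __name__ == "__main__":
    print("loopback throughput: %.1f MB/s" % measure_throughput())
```

On a wireless link the same transfer would be dominated by link-layer effects (losses, delay spikes, link resets), which is precisely what the project's measurements set out to quantify.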

Information Systems

The specialisation area of information systems provides the basic education in databases and information management for all computer science students. The research in the area, however, is not focused on the most traditional information management problems but on areas that have emerged only in the past few years. These subfields include data mining, document management, workflow management, and user interface design patterns.

Data mining (or knowledge discovery in databases) develops methods and systems for extracting interesting and useful information from large sets of data. Data mining methods developed in our group can be used in a variety of application areas, such as commercial databases, telecommunication alarm sequences, and epidemiological data. The area combines techniques from databases, statistics, and machine learning. The recent focus areas are inductive databases in the knowledge discovery process and applying data mining to textual data.
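
As an illustration of the discrete techniques involved, the classical level-wise (Apriori-style) search for frequent item sets can be sketched as follows. This is a textbook simplification, not the group's implementation.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise frequent item set discovery: a k-item set can be
    frequent only if all of its (k-1)-item subsets are frequent."""
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # count support of each candidate over the transactions
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # join frequent k-item sets into (k+1)-candidates
        keys = list(level)
        candidates = {a | b for a, b in combinations(keys, 2)
                      if len(a | b) == k + 1}
        # prune candidates that have an infrequent k-subset
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        k += 1
    return frequent

txns = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}]
print(apriori(txns, 2))   # {a}, {b} and {a, b} are frequent
```

Association rules are then derived from the frequent sets; the group's research adds, among other things, efficient handling of large result sets and episode rules over event sequences.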

The accelerating development of the World Wide Web has made numerous digital document collections widely available to the public. There is a clear need for new document management tools that assist the user in gathering, combining, and reusing information from existing document collections. Moreover, the amount of finely structured documents will increase enormously in the near future, since the Extensible Markup Language (XML) is rapidly gaining popularity in various communities. Compared to HTML, XML makes more versatile processing and customisation of documents possible. However, explicit structuring using XML leads to heterogeneously structured document collections, which causes problems when combining and reusing fragments of documents. This is the central starting point of our document management research.

Workflows model business processes that consist of related tasks. The tasks can be automatic, semi-automatic, or manual. A workflow system is software that can be used to define workflows and to automate the coordination of their tasks. The research in this group concentrates on methods of coordinating the tasks by transactional workflows.

User interface design is an often neglected area in software engineering. Moreover, it is often seen as specific to the software at hand, which leads to developing solutions almost separately for each case. There are, however, user interface problems, such as query interfaces or the specification of time intervals, that recur very frequently. An emerging area of user interface design, and also the focus area of our user interface group, is using pattern languages for interaction design.

Projects

Data mining

The major areas of the data mining research have been the development of fast algorithms for association rule and episode rule discovery, and the application of the algorithms for finding regularities in sequential data, particularly in telecommunications alarm data. Moreover, methods for finding interesting knowledge from potentially large result sets returned by the discovery have been developed. Recently, the research has focused on four areas: inductive databases in the knowledge discovery process, measuring similarities of events and event types in sequences, biological and epidemiological applications, and knowledge discovery in text. The researchers involved are Prof Helena Ahonen-Myka, Oskari Heinonen, Mika Klemettinen, PhD, Pirjo Moen, PhD, Marko Salmenkivi, and Prof A. Inkeri Verkamo.
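
The notion of episode frequency in event sequences can be sketched as follows: an episode (here, a set of event types, i.e. a parallel episode in the spirit of the group's episode discovery work) is frequent if it occurs in sufficiently many windows of a given width sliding over a timestamped sequence. The function and the toy sequence below are illustrative simplifications.

```python
def window_frequency(sequence, episode, width):
    """Fraction of width-sized windows over a timestamped event
    sequence in which every event type of `episode` occurs.
    `sequence` is a list of (time, event_type) pairs, time-ordered."""
    if not sequence:
        return 0.0
    times = [t for t, _ in sequence]
    # slide windows [w, w + width) so that every event lies in some window
    start, end = times[0] - width + 1, times[-1]
    total = end - start + 1
    hits = 0
    for w in range(start, end + 1):
        window = {e for t, e in sequence if w <= t < w + width}
        if episode <= window:
            hits += 1
    return hits / total

seq = [(1, "A"), (2, "B"), (5, "A"), (6, "C")]
print(window_frequency(seq, {"A", "B"}, 3))
```

A frequent-episode miner enumerates candidate episodes level-wise, much like frequent set discovery, and keeps those whose window frequency exceeds a threshold; on real alarm data an incremental window update replaces the quadratic recount used here for clarity.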



J.-F. Boulicaut, M. Klemettinen, and H. Mannila: Modelling KDD processes within the inductive database framework. Proc. 1st International Conference on Data Warehousing and Knowledge Discovery, DaWaK '99 (eds. M. K. Mohania and A. M. Tjoa), Lecture Notes in Computer Science, Vol. 1676, Springer 1999, 293-302.



M. Klemettinen, H. Mannila, and H. Toivonen: Rule Discovery in Telecommunication Alarm Data. Journal of Network and Systems Management 7, 4 (December 1999), Plenum Press.



H. Mannila, and P. Moen: Similarity between event types in sequences. Proc. 1st International Conference on Data Warehousing and Knowledge Discovery, DaWaK '99 (eds. M. K. Mohania and A. M. Tjoa), Lecture Notes in Computer Science, Vol. 1676, Springer 1999, 271-280.



M. Klemettinen: A Knowledge Discovery Methodology for Telecommunication Network Alarm Databases. PhD Thesis, Report A-1999-1, Department of Computer Science, University of Helsinki, January 1999.



H. Ahonen-Myka, O. Heinonen, M. Klemettinen, and A. I. Verkamo: Finding co-occurring text phrases by combining sequence and frequent set discovery. Proc. 16th International Joint Conference on Artificial Intelligence, IJCAI '99: Workshop on Text Mining: Foundations, Techniques and Applications, (ed. R. Feldman), 1999, 1-9.

Document management (DocMan)

The main focus of the document management group has been document assembly of structured (e.g. XML) documents. Document assembly is computer-aided construction of new documents from existing document collections. Such reuse includes finding relevant document fragments, modifying them as needed, and combining the fragments. If the assembled documents are to be further processed, the heterogeneous structures of the original documents also have to be unified.

The PhD thesis work of Barbara Heikkinen presents an element-type classification method, which contains a decision procedure for mapping an arbitrary structure element to a predefined generic class that describes some typical logical structure of electronic documents. The semantics of the generic classes can be utilised in unified processing (e.g. printing) of arbitrary structures.
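
The idea of mapping arbitrary element types to generic classes can be illustrated with a small name-based classifier. The classes and rules below are hypothetical and vastly simplified; the actual decision procedure in the thesis also considers document structure, not just element names.

```python
# Hypothetical generic classes with name-based recognition rules.
GENERIC_CLASSES = {
    "title":     ("title", "heading", "h1", "h2", "caption"),
    "paragraph": ("p", "para", "paragraph", "text"),
    "list":      ("list", "ul", "ol", "itemizedlist"),
    "reference": ("ref", "link", "xref", "citation"),
}

def classify_element(tag_name):
    """Map an arbitrary element-type name to a generic class,
    falling back to 'container' when no rule matches.  Exact
    matches are tried alongside substring matches, so e.g.
    'chapter-title' is recognised as a title."""
    name = tag_name.lower()
    for generic, synonyms in GENERIC_CLASSES.items():
        if name in synonyms or any(s in name for s in synonyms):
            return generic
    return "container"

print(classify_element("Heading"))        # title
print(classify_element("itemizedlist"))   # list
print(classify_element("section"))        # container (no rule matches)
```

Once every element carries a generic class, a single stylesheet or print routine written against the generic classes can process documents from heterogeneous DTDs uniformly, which is the point of the classification.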

In his PhD thesis work, Oskari Heinonen studies the problem of intelligent document fragmentation: how to find in a text self-contained multi-paragraph fragments that can be used as components in the assembly process. Fragmentation is basically a problem of choosing the paragraph boundaries that make the best fragment boundaries. To get convenient-sized fragments, paragraph similarity information (based on lexical cohesion) alone is not enough; the lengths of the created fragments also have to be considered. Our fragmentation method is based on dynamic programming and is guaranteed to give an optimal solution with respect to the input and the parameters.
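
The dynamic-programming formulation can be sketched as follows. The cost model here (deviation of fragment length from a target, plus a penalty proportional to the lexical cohesion across each cut) is an assumed simplification, not the one in the thesis, but the optimality argument is the same: the best segmentation of the first i paragraphs extends the best segmentation of some shorter prefix.

```python
def fragment(similarity, lengths, target_len, alpha=1.0):
    """Optimal fragmentation of n paragraphs.
    similarity[i] = cohesion between paragraphs i and i+1 (higher means
    more related, so cutting there is more expensive); lengths[i] = length
    of paragraph i.  Returns a list of (start, end) paragraph ranges."""
    n = len(lengths)
    INF = float("inf")
    best = [INF] * (n + 1)   # best[i] = min cost of segmenting paragraphs 0..i-1
    cut = [0] * (n + 1)      # cut[i]  = start of the last fragment in best[i]
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(i):                       # last fragment = paragraphs j..i-1
            cost = abs(sum(lengths[j:i]) - target_len)
            if i < n:                            # penalise cutting after par. i-1
                cost += alpha * similarity[i - 1]
            if best[j] + cost < best[i]:
                best[i] = best[j] + cost
                cut[i] = j
    bounds, i = [], n                            # recover the fragment boundaries
    while i > 0:
        bounds.append((cut[i], i))
        i = cut[i]
    return list(reversed(bounds))

# four equal paragraphs, weak cohesion everywhere, target length 10
print(fragment([0.1, 0.1, 0.1], [5, 5, 5, 5], 10))   # [(0, 2), (2, 4)]
```

The double loop gives O(n^2) time, which is acceptable for paragraph counts in ordinary documents; the guarantee of optimality with respect to the cost model is what distinguishes this approach from greedy boundary selection.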

Besides document assembly, the group has also developed several tools for structured document management, including the search and transformation languages Sgrep and TranSID. The researchers involved are Prof Helena Ahonen-Myka, Barbara Heikkinen, PhD, Oskari Heinonen, Mika Klemettinen, PhD, Pekka Kilpeläinen, PhD, Greger Linden, PhD, and Jani Jaakkola.



H. Ahonen-Myka, B. Heikkinen, O. Heinonen, M. Klemettinen: New tools for a knowledge worker. Proc. XML Finland '99, SGML Users Group Finland, 1999, 25-32.



B. Heikkinen: Generalization of Document Structures and Document Assembly. PhD Thesis, Report A-2000-2, Department of Computer Science, University of Helsinki, April 2000.



P. Kilpeläinen, and J. Jaakkola: Nested text-region algebra. Report No. C-1999-2, Department of Computer Science, University of Helsinki 1999.



J. Jaakkola, P. Kilpeläinen, G. Lindén, J. Niemi, and K. Paasiala: TranSID: an SGML document manipulation language -- reference manual. Report No. C-1999-35, Department of Computer Science, University of Helsinki 1999.

Workflow management (WorkMan)

Workflow systems attempt to automate recurring units of tasks in the workplace; the processing of insurance claims or loan applications, for example, are workflows that could be automated. Workflow systems have two purposes: they enable the specification of workflows, and they see to it that each workflow task progresses. Workflows are usually specified with the help of graphical user interfaces. The specification presents the tasks to be carried out and their mutual dependencies, as well as possible transactional demands. The tasks included in a workflow may be carried out in different locations. Therefore, the development of workflow systems touches on several disciplines, such as the integration of distributed heterogeneous systems and transaction research. Research in the field of workflow systems is brisk at the moment, and in the near future we can expect commercial versions of several systems that are still at an experimental level.

The WorkMan project implements a prototype of a workflow system. With the prototype, we hope to ensure reliable execution of workflows through transactional workflows. In implementing the system, we strive to use the services offered by SQL-based database systems as much as possible.

The WorkMan system has been implemented with the help of students as an exercise for the course Software Engineering Project. The research connected with the project has been carried out by Harri Laine, PhLic, and Juha Puustjärvi, PhD.



H. Laine, J. Puustjärvi: Modelling Business Processes as Transactional Workflows. Proc. Workshop on Practical Business Modelling (in conjunction with CAiSE '00), 2000.



J. Puustjärvi, H. Laine: WorkMan -- a Transactional Workflow Prototype. Proc. DEXA '00, 2000.



J. Puustjärvi: Transactional workflows. PhD Thesis, Report A-1999-2, Department of Computer Science, University of Helsinki, 1999.

Graphical interface solutions and techniques (GIST)

Many current software user interfaces force the user to do unnecessary work and waste their time. Good user interface solutions should be available in a form that can be implemented quickly and cheaply, with minimal user interface design skills. We have discovered that the same user interface problems, e.g. query formulation, visualisation and management of hierarchies, and the management of complex time intervals, tend to recur in various systems and contexts. These findings have led us to develop a collection of user interface design patterns to be used as a tool and learning aid for user interface designers. Our collection currently includes 25 patterns and pattern candidates. The researchers involved are Sari A. Laakso, Karri-Pekka Laakso, and Asko Saura.



K.-P. Laakso, A. Saura, S. A. Laakso: Position paper at the CHI 2000 Design Pattern Workshop.

Applied Computer Science

The subprogramme Applied Computer Science is aimed at students who want to specialise in some application area and study it in more depth than the other subprogrammes allow. Every student has an individual study programme. The research activities are pursued in several of the other research divisions, e.g. within the algorithms, machine learning, biocomputing, and data mining groups.

Teacher in Computer Science

Researchers associated with the Teacher in Computer Science line of specialisation have pursued research on topics in the borderland between computer science and education. Computer uses in education and visualisation have been the major areas of interest. In 1998 - 1999 the scope was extended to the use of information technology in social and human services. In general, the volume of Teacher in Computer Science related research is rather modest when compared to the main research areas of the Department.

Projects

Animation Aided Problem Solving (AAPS)

Animation is a standard technique in computer-aided instruction. A project called Animation Aided Problem Solving (AAPS) aimed at applying the methods of algorithm animation in problem solving, and developed systems for fast generation of algorithm animations. Members of the AAPS group are Jorma Tarhio, docent (group leader), Prof Veijo Meisalo (Department of Teacher Education), Erkki Sutinen, PhD, Jaakko Kurhila, PhLic, Matti Lattu, MSc, Erkki Rautama, and Tommi Teräsvirta. The funding of AAPS from the Ministry of Education and from the University of Helsinki finished by the end of 1998, but the project still continued in 1999.



K. Järvinen, T. Pienimäki, J. Kyaruzi, E. Sutinen, T. Teräsvirta: Between Tanzania and Finland: learning Java over the Web. Proc. 13th SIGCSE Technical Symposium on Computer Science Education, SIGCSE '99, ACM, 1999, 217-221.



J. Kurhila, E. Sutinen, J. Tarhio: Towards meaningful computer uses in education. Proc. Information Technology Shaping European Universities, EUNIS '99 (ed. K. Sarlin), Espoo, Finland, 1999, 261-264.

Survey of Information Technology in Social and Human Services in Finland (SosKart)

A survey of information technology in social and human services was conducted during 1998 and 1999 by Jaakko Kurhila, PhLic, Perttu Iso-Markku (student of computer science) and research assistant Saara Maalismaa (student of Social Sciences). Prof Matti Mäkelä was in charge of the project.

The aim of the first stage of the survey was to draw an overall picture of the use of IT in public and private social administration and services, and to see how it supports the policy of the Finnish information society. In the second stage, some representative examples of IT solutions were investigated more closely. The aim was to survey and evaluate the systems and software used, analyse the usability aspects and gather user experiences as well as to find out possible future directions.

The survey took 16 manpower months, and it is expected to lead to the MSc thesis of Mr Iso-Markku. The funding of FIM 315,000 came from OSVE (Network of Excellence Centres for Social Welfare and Health Care) through STAKES (National Research and Development Centre for Welfare and Health).



P. Iso-Markku, J. Kurhila: Sosiaalialan tietotekniikkakartoitus (Survey of information technology in social and human services in Finland, in Finnish). Publications of the National Network of Excellence Centres 1, Helsinki 1999.



The University of Helsinki also provided funding of FIM 66,000 for the six-month research of Timo Muhonen (student of CS) on the topic Internet luentosalina (Internet as a lecture room). The leader in charge of the project was Erkki Sutinen, PhD. The project considered and developed tools for easy recording of in-class lectures and for their easy delivery on the Internet.

