Discovery group: Data Mining for Pattern and Link Discovery



Department of Computer Science

Finnish Centre of Excellence for Algorithmic Data Analysis Research

Helsinki Institute for Information Technology HIIT


Research Projects

Computational Linguistic Creativity -- CLiC
In this project we investigate computational linguistic creativity, i.e., the ability of computers to act in verbally creative ways. Such creative skills will give computers more flexibility in their verbal communication with users. Software with creative skills can also be used to build tools that help people use language in novel and creative ways. We develop novel text-mining-inspired methods for computational linguistic creativity, especially for supporting human creativity, and we also investigate the use of these methods as pedagogical tools in primary and secondary schools. The project combines computer science, data mining and computational creativity with pedagogy and the use of digital technology in education. (Funding: Academy of Finland, 2014-2018)

Digital Language Typology -- DLT
Digital language typology (DLT) is a multi-disciplinary project that aims to produce a computer-based platform able to assess the structurally manifested family relationships within any set of languages for which suitably large digital text and speech material is available. To this end, we have brought together a group of specialists from phonetics, linguistics, and computer science. DLT is part of the Academy of Finland's Digital Humanities programme, which covers novel methods and techniques in which digital technologies and state-of-the-art computational science methods are used for collecting, managing, and analysing data in humanities and social sciences research. The principal investigators are Martti Vainio (University of Helsinki, coordinator), Hannu Toivonen (University of Helsinki), and Markku Turunen (University of Tampere). (Funding: Academy of Finland, 2016-2019)

Immersive Automation -- IA
The demise of the old strategies of newspaper publishers has created an urgent need for a radical transformation of operations. The aim of this project is to develop new strategies based on emerging technical solutions. We propose a holistic strategic approach: a new ecosystem for news that will open the way for an economically viable and technically sophisticated approach to news production and consumption. The research consortium is formed by the Swedish School of Social Science, University of Helsinki, together with the Department of Computer Science, University of Helsinki, and VTT Technical Research Centre of Finland Ltd; the project involves collaboration with several Finnish media houses. (Funding: Tekes and companies, 2017-2018)

Past projects

Concept Creation Technology (ConCreTe), Promoting the Scientific Exploration of Computational Creativity (PROSECCO)
Computational creativity is a new area of computer science, the goal of which is to model, simulate and enhance creativity. We capitalise on our data mining background by investigating the discovery and use of patterns in creative systems. Our current research topics include automatic production of creative texts, especially computational poetry and machine humor, and also music. We participate in ConCreTe: Concept Creation Technology project, and we are partners in the co-ordination action PROSECCO: Promoting the Scientific Exploration of Computational Creativity. (Funding: EU FP7, 2013-2016)

Biomine: A biological search engine
We view biological databases of sequences, proteins, genes etc. as weighted graphs and develop methods for link discovery and analysis in such graphs. Try out the prototype search engine! We are also affiliated with InterPregGen: Genetic studies of pre-eclampsia in Central Asian & European populations (EU FP7, participation with Hannele Laivuori). (Funding: National Technology Agency (Tekes) and companies)
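As a rough illustration of what link discovery in a weighted graph can look like, the sketch below finds the most probable path between two nodes. It is not the Biomine method itself: it assumes edge weights are probabilities in (0, 1] and that a path's strength is the product of its edge weights. The function name, the graph representation, and the node labels are all hypothetical.

```python
import heapq
import math


def most_probable_path(graph, source, target):
    """Return (probability, path) for the strongest path, or (0.0, []) if none.

    graph: dict mapping node -> dict of neighbor -> edge weight in (0, 1].
    Edges are directed; list both directions for an undirected graph.
    """
    # Assumption: path strength is the product of edge weights. Maximizing
    # that product equals minimizing the sum of -log(weight), so Dijkstra's
    # algorithm applies directly.
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            nd = d - math.log(weight)
            if nd < dist.get(neighbor, math.inf):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return 0.0, []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-dist[target]), path


# Toy graph with made-up nodes and weights, for illustration only.
toy_graph = {
    "gene:A": {"protein:P": 0.9, "article:X": 0.5},
    "protein:P": {"disease:D": 0.8},
    "article:X": {"disease:D": 0.9},
}
probability, path = most_probable_path(toy_graph, "gene:A", "disease:D")
```

On the toy graph, the route through "protein:P" (0.9 × 0.8) is preferred over the route through "article:X" (0.5 × 0.9).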

Data and text mining
We participate in two programmes of Tivit, The Strategic Centre for Science, Technology and Innovation in the Field of ICT. In the Next Media research programme, we develop methods for on-line analysis and surveillance of social media for local news in the Software Newsroom project. In the Security Ecosystem of the Data to Intelligence research programme, in turn, the goal of the consortium is to invent products, solutions and services that use security and related data sources to provide added value to the customers and good business for the providers. (Funding: Tekes)

Bison: Bisociation Networks for Creative Information Discovery
The aim is to develop and validate a novel computational methodology, which facilitates bisociative information discovery in large-scale heterogeneous information environments. (Funding: European Commission under the Framework 7 programme.)

Context: Context Recognition by User Situation Data Analysis
The Context project studies the characterization and analysis of information about the user's context and its use in proactive adaptivity. We have developed data analysis algorithms as well as ContextPhone, a mobile context-aware prototyping platform, available as free software. (Funding: Academy of Finland, PROACT Programme.)

Data mining in genetics
We develop models, methods and tools for analyzing genetic data, in particular for gene mapping and haplotype analysis. (Funding: National Technology Agency (Tekes) and companies in several projects, HIIT.)

22.12.2016 - 10:32 Hannu Toivonen
29.10.2010 - 17:33 Webmaster