University of Helsinki Department of Computer Science



Biomine search engine prototype

Try out the search engine at

More information + citation:

Biomine: Predicting links between biological entities using network models of heterogeneous databases. Lauri MA Eronen and Hannu TT Toivonen. BMC Bioinformatics 13:119, 2012.

Biomine project: Knowledge discovery in biological databases

Public biological databases contain huge amounts of rich data, such as annotated sequences, proteins, domains, and orthology groups, genes and gene expressions, gene and protein interactions, scientific articles, and ontologies. The Biomine project develops methods for the analysis of such collections of data.

Example problem: candidate gene analysis

As a motivating problem, consider gene mapping. Mapping of a disease can result in tens or hundreds of candidate genes. The next problem is then to identify the most promising genes for further research. The current state of the art consists largely of manual exploration of public databases, for instance to find connections between genes and phenotypes. The Biomine project develops methods for automated discovery and prediction of previously unknown and potentially biologically relevant connections. Our focus is on candidate gene analysis, and the methods we develop help geneticists assess the potential relationship of their candidate genes to the disease under study.

Research approach: graph mining

Example graph

In the Biomine approach, all information is handled as graphs: nodes correspond to different concepts (such as gene, protein, domain, phenotype, biological process, tissue), and semantically labelled edges connect related concepts (e.g., gene BCHE codes protein CHLE, which in turn has the molecular function 'beta-amyloid binding'; see a simple example graph). One central goal is to develop methods for establishing new, previously unknown connections between nodes, in other words, creation of biological hypotheses. We develop and use data mining algorithms for this. Predicted connections could be based, for instance, on discovered analogies between two concepts or their contexts, or on finding (strong) paths between concepts.
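The graph representation and the idea of a strong path can be sketched in a few lines of Python. This is a minimal illustration, not the Biomine implementation. The two edges come from the BCHE example above. The numeric edge weights, the node naming scheme, and the definition of path strength as the product of edge weights are assumptions made for this sketch.

```python
import heapq
import math

# Labelled edges from the example in the text.
# The weights (in (0, 1]) are made-up values for illustration only.
EDGES = [
    ("gene:BCHE", "codes", "protein:CHLE", 0.9),
    ("protein:CHLE", "has_molecular_function", "function:beta-amyloid binding", 0.8),
]


def build_graph(edges):
    """Build an adjacency map; edges are traversable in both directions."""
    graph = {}
    for source, label, target, weight in edges:
        graph.setdefault(source, []).append((target, label, weight))
        graph.setdefault(target, []).append((source, label, weight))
    return graph


def strongest_path(graph, start, goal):
    """Return (strength, path) for the path maximising the product of weights.

    Maximising a product of weights in (0, 1] is the same as minimising the
    sum of -log(weight), so Dijkstra's algorithm applies.
    Returns (0.0, []) if the goal cannot be reached.
    """
    best_cost = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best_cost.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, _label, weight in graph.get(node, []):
            new_cost = cost - math.log(weight)
            if new_cost < best_cost.get(neighbour, math.inf):
                best_cost[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best_cost:
        return 0.0, []
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return math.exp(-best_cost[goal]), path


GRAPH = build_graph(EDGES)
strength, path = strongest_path(GRAPH, "gene:BCHE", "function:beta-amyloid binding")
print(path, round(strength, 2))
```

With real data, a high-strength path between a candidate gene and a phenotype node would be one possible basis for ranking that gene; how weights are actually assigned is outside the scope of this sketch.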

Applications for graph mining

Discovery of patterns in graphs has numerous potential applications in biology; obvious candidates include the analysis of metabolic networks, regulatory relationships, protein structures, and chemical compounds. Virtually any data can be described as graphs, so the developed methods can potentially be applied in other areas, too.

People and partners

Researchers in the project:

The project was carried out by the Discovery Group in the HIIT Basic Research Unit at the Department of Computer Science, University of Helsinki. The project has been funded by

and it co-operated with


Research Group

The project is carried out by the Discovery Research Group.


Contact: Prof. Hannu Toivonen, email
