Algodan Software Library

The Algodan software library is a collection of software developed by Algodan researchers. See the individual software pages for contact information and licensing details.

If you are an Algodan researcher and would like to see your software included here, please contact Janne Korhonen.

Software

BernoulliMix

The BernoulliMix program package provides tools for working with finite mixture models of multivariate Bernoulli distributions, also known as Bernoulli mixtures. The package can be used for probabilistic modeling of 0-1 data. The target audience includes researchers, teachers, and students in machine learning and data mining.
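As an illustration of what a Bernoulli mixture computes, the sketch below evaluates the log-likelihood of a single 0-1 vector under a mixture with given parameters. It is not taken from BernoulliMix; the function name and the argument layout (`weights`, `thetas`) are assumptions made for this example.

```python
import math


def bernoulli_mixture_log_likelihood(x, weights, thetas):
    """Log-likelihood of one 0-1 vector x under a finite Bernoulli mixture.

    weights[k] is the mixing proportion of component k (the weights sum to 1).
    thetas[k][d] is the probability that dimension d equals 1 in component k.
    All names here are hypothetical and chosen for illustration only.
    """
    component_logs = []
    for pi_k, theta_k in zip(weights, thetas):
        # log of: mixing weight times the product of per-dimension probabilities
        log_p = math.log(pi_k)
        for x_d, t in zip(x, theta_k):
            log_p += math.log(t if x_d == 1 else 1.0 - t)
        component_logs.append(log_p)
    # Combine the components with log-sum-exp for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(v - m) for v in component_logs))
```

For example, with two equally weighted components `[[0.9, 0.1], [0.2, 0.8]]`, the vector `[1, 0]` has probability 0.5 * 0.9 * 0.9 + 0.5 * 0.2 * 0.2 = 0.425.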

Biomine - A biological search engine

We view biological databases of sequences, proteins, genes etc. as a weighted graph and develop methods for information search and discovery in such graphs.

EEL - Enhancer Element Locator

Enhancer Element Locator, or EEL, is a tool for locating distal gene enhancer elements in mammalian genomes by comparative genomics.

FiD - Fragment iDentificator

Fragment iDentificator (FiD) is a Windows application for identifying molecular fragments from tandem mass spectrometry data. FiD is aimed at mass spectrometrists and chemists, assisting them in interpreting and analysing MS/MS spectra. FiD exhaustively lists suitable fragment structures for each measured mass-to-charge ratio peak, and also uses mixed integer linear programming techniques to suggest the whole fragmentation pattern, i.e. the set of fragments that explains the whole spectrum with a minimal number of bond changes.

InvCoal - a coalescent simulator

InvCoal is a coalescent simulator for generating synthetic SNP data sets with a simulated inversion. It also uses a multiple crossover model with a chiasma interference model for the modelling of gene flow between inverted and noninverted haplotypes.

Maplab

This Java application performs intelligent placement of line numbers on a public transit map. It loads Google Transit data, adds route numbers, and produces an overlay on Google Maps.

The details appear in the paper

MOODS - Motif Occurrence Detection Suite

MOODS is a suite of algorithms for matching position weight matrices (PWMs) against DNA sequences. It features advanced matrix matching algorithms implemented in C++ that can be used to scan hundreds of matrices against chromosome-sized sequences in a few seconds.
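To make the matching problem concrete, here is a minimal sketch of the naive baseline: slide the matrix along the sequence and report every window whose summed score reaches a threshold. This is not the MOODS API and not one of its advanced algorithms; the function name `scan_pwm` and the list-of-dicts matrix representation are assumptions for this example.

```python
def scan_pwm(sequence, pwm, threshold):
    """Return (position, score) for every window scoring at least threshold.

    pwm is a list with one dict per motif column, mapping a base to its score.
    This is the straightforward O(n * m) scan, shown only as a reference point.
    """
    m = len(pwm)
    hits = []
    for i in range(len(sequence) - m + 1):
        # Score of the window starting at i: sum of the per-column scores.
        score = sum(col[sequence[i + j]] for j, col in enumerate(pwm))
        if score >= threshold:
            hits.append((i, score))
    return hits


# Toy two-column matrix that favours the motif "AC".
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
]
hits = scan_pwm("TACGAC", toy_pwm, threshold=4)
```

Faster methods avoid rescoring every window from scratch, which is what makes scanning many matrices against long sequences practical.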

readaligner

A tool for mapping (short) DNA reads to reference sequences. It consists of algorithms based on the Burrows-Wheeler transform and backward backtracking. It also includes a novel data structure called the rotation index, which finds alignments with a higher number of mismatches in feasible time (at the cost of a larger index and a fixed pattern length).
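The core idea behind Burrows-Wheeler based read mapping is backward search, which can be sketched as follows. The example handles exact matching only and builds the transform by sorting rotations, which is far too slow for real genomes; it is not readaligner's code, and the function names are assumptions. Backtracking for mismatches would extend the same interval-narrowing loop by also trying substituted characters.

```python
def bwt(text):
    """Burrows-Wheeler transform of text with '$' appended as a terminator.

    Assumes '$' does not occur in text and sorts before every other character.
    """
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact occurrences of pattern in the original text."""
    # first_row_start[c] = number of characters in the text smaller than c.
    counts = {}
    for ch in bwt_string:
        counts[ch] = counts.get(ch, 0) + 1
    first_row_start, total = {}, 0
    for ch in sorted(counts):
        first_row_start[ch] = total
        total += counts[ch]

    def rank(ch, i):
        # Occurrences of ch in bwt_string[:i]; a linear scan for clarity.
        return bwt_string.count(ch, 0, i)

    lo, hi = 0, len(bwt_string)  # half-open interval of matching rows
    for ch in reversed(pattern):
        if ch not in first_row_start:
            return 0
        lo = first_row_start[ch] + rank(ch, lo)
        hi = first_row_start[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

A practical index replaces the linear `rank` scan with a precomputed or compressed rank structure so that each pattern character costs constant or near-constant time.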

ReMatch

ReMatch is a web-based tool for integrating user-given stoichiometric metabolic models into a database collected from public data sources such as KEGG, MetaCyc, ChEBI and ARM. ReMatch is geared particularly towards 13C metabolic flux analysis: the model can be augmented with carbon mappings and exported for analysis in 13C flux analysis software.

ReTrace

ReTrace is a computational method for inferring branching pathways in genome-scale metabolic networks.

RLCSA - Run-Length Compressed Suffix Array

The RLCSA is a compressed suffix array implementation that has been optimized for highly repetitive text collections. Examples of such collections include version control data and individual genomes. This implementation can also be used to construct the Burrows-Wheeler transform of a collection of texts space-efficiently.
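The reason repetitive collections compress well in this setting can be seen in a small sketch: the Burrows-Wheeler transform of a repetitive text consists of a few long runs of equal characters, which run-length encoding stores compactly. The code below is only an illustration under that assumption; it is not the RLCSA implementation, and the function names are hypothetical.

```python
from itertools import groupby


def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (toy-sized inputs only).

    Assumes '$' does not occur in text and sorts before every other character.
    """
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s):
    """Collapse s into a list of (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


# A highly repetitive text yields very few runs in its transform.
runs = run_length_encode(bwt("abababab"))
```

Here the nine-character transform `bbbb$aaaa` collapses into three runs, whereas a text without repetition would typically produce close to one run per character.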

Sinuhe - Statistical Machine Translation tool

Sinuhe is a Statistical Machine Translation tool developed by Dr. Matti Kääriäinen. Its main characteristics are a conditional exponential family translation model that utilizes parallel machine learning and a very fast decoder, which makes it well suited for online information retrieval. Sinuhe is the default SMT engine in the SMART Search Engine. Sinuhe is freely available for download under the GPL.

Internals of Sinuhe are described in the following paper:

SMART Search Engine

SMART Search Engine is a web-based demonstrator for searching Wikipedia in one language using queries in another, and translating relevant pages on the fly back into the query language. The search engine was developed by the Algodan Machine Learning team and the HIIT/CosCo group as part of the EU FP6 STREP Statistical Multilingual Analysis for Retrieval and Translation. It integrates several cross-lingual information retrieval (CLIR) engines with statistical machine translation (SMT) tools.

SuDS project cst - compressed suffix tree implementation

Our implementation of compressed suffix trees (Sadakane, 2007) supports all typical suffix tree operations, including suffix links and lowest common ancestor queries, and requires less memory than a plain suffix array.