University of Helsinki Department of Computer Science
 


Software

Compressed Data Structures

NEW: A compressed suffix tree implementation can be found at the project page.

Below are my compressed data structure implementations. Some of them are now plugged into the Pizza & Chili Corpus, which contains library implementations of several compressed text indexes and testbed data collections.

  • Implementation of the Compact Suffix Array index structure (CPM 2000 & ALENEX 2001): Download csa.zip and see README inside the package for instructions. The package contains C++ source code for constructing and querying compact suffix arrays.
  • Implementation of the Compressed Compact Suffix Array index structure (CPM 2004): Download ccsa.zip and see README inside the package for instructions. The package contains C++ source code for constructing and querying compressed compact suffix arrays.
  • Implementation of a Huffman-FM index structure (SPIRE 2004): Download hufffm.zip and see README inside the package for instructions. The package contains C++ source code for constructing and querying a version of the FM-index in which Huffman compression is first applied to the text.
  • Implementation of an RLFM index structure (CPM 2005): Download rlfm.zip and see README inside the package for instructions. The package contains C++ source code for constructing and querying versions of the FM-index structure. Updated 5.11.2004 with new functionality and several speedups. The FM-index was introduced by Ferragina and Manzini (FOCS 2000). The above implementation uses exactly the same search mechanism as they proposed, but the internal structures are quite different. You might also be interested in their implementation of the FM-index.
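
For readers unfamiliar with the search mechanism mentioned in the last item, here is a minimal sketch of FM-index backward search. It is not the code from rlfm.zip or any other package above: ToyFMIndex and its members are hypothetical names chosen for illustration, the suffixes are sorted naively, and rank queries scan the Burrows-Wheeler transform directly, where a real index would use compressed rank structures. It also assumes the text contains no '\0' character.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Toy FM-index (illustrative only): stores the BWT uncompressed and
// answers rank queries by linear scan.
struct ToyFMIndex {
    std::string bwt;                   // BWT of text + '\0' terminator
    std::array<std::size_t, 256> C{};  // C[c] = number of symbols smaller than c

    explicit ToyFMIndex(const std::string& text) {
        std::string t = text;
        t.push_back('\0');  // terminator, assumed absent from the text itself
        const std::size_t n = t.size();

        // Naive suffix array construction by comparing suffixes directly.
        std::vector<std::size_t> sa(n);
        std::iota(sa.begin(), sa.end(), std::size_t{0});
        std::sort(sa.begin(), sa.end(), [&t](std::size_t a, std::size_t b) {
            return t.compare(a, std::string::npos, t, b, std::string::npos) < 0;
        });

        // BWT: the symbol preceding each suffix in sorted order.
        bwt.resize(n);
        for (std::size_t i = 0; i < n; ++i) {
            bwt[i] = t[(sa[i] + n - 1) % n];
        }

        // Cumulative symbol counts.
        std::array<std::size_t, 256> freq{};
        for (unsigned char c : t) ++freq[c];
        std::size_t sum = 0;
        for (std::size_t c = 0; c < 256; ++c) {
            C[c] = sum;
            sum += freq[c];
        }
    }

    // Number of occurrences of c in bwt[0, i).
    std::size_t rank(unsigned char c, std::size_t i) const {
        return static_cast<std::size_t>(
            std::count(bwt.begin(),
                       bwt.begin() + static_cast<std::ptrdiff_t>(i),
                       static_cast<char>(c)));
    }

    // Backward search: process the pattern from its last symbol to its first,
    // narrowing the half-open suffix array range [lo, hi) at each step.
    std::size_t count(const std::string& pattern) const {
        std::size_t lo = 0, hi = bwt.size();
        for (auto it = pattern.rbegin(); it != pattern.rend() && lo < hi; ++it) {
            const unsigned char c = static_cast<unsigned char>(*it);
            lo = C[c] + rank(c, lo);
            hi = C[c] + rank(c, hi);
        }
        return hi > lo ? hi - lo : 0;
    }
};
```

Constructing ToyFMIndex from a text and calling count with a pattern returns the number of occurrences of the pattern in the text. The indexes listed above differ mainly in how they represent the transformed text and answer the rank queries in compressed space.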

Music Information Retrieval

Bioinformatics

  • 2D Electrophoresis Gel Matching software corresponding to the CPM 2002 article is implemented using Borland C++ Builder for Windows. It matches gel images automatically, without the user-defined landmarks that all the commercial software packages require. Some form of spot detection is also provided. However, the software lacks the other handy features of the commercial packages, so it is best regarded as a prototype. If you are interested in trying it out, or willing to continue its development, ask me for the source.
  • Peak Alignment prototype software corresponding to the Biomolecular Engineering article (2007) is also implemented using Borland C++ Builder. It contains the basic algorithm implementations and some visualizations. Ask me for the source.
  • Mass Spectra Calibration algorithm corresponding to the ACM/IEEE TCBB article (2007) is used in the mass spectra routines developed in Jena. Ask them for the software.
  • Implementation of the algorithm for the missing patterns problem corresponding to the WABI 2004 article.
  • Implementation of the compressed suffix tree corresponding to the Bioinformatics & WEA 2007 articles can be found at the project page.
