University homepage Suomenkielinen versio puuttuu Inte på svenska In english
University of Helsinki Department of Computer Science
 

Department of Computer Science

Succinct Data Structures (SuDS) -research group

Group description

The research group studies a new subfield of data compression - data structure compression. The new aspect compared to traditional compression is that the compressed data (structure) needs to be represented so that access to its internal parts is provided without uncompressing the whole structure. As an example, consider a binary tree of n nodes. It is possible to represent the tree succinctly using about 2n bits so that the children and parent of any node can be accessed in constant time. A standard link structure representation of a binary tree takes of order n log n bits.

In addition to providing new algorithms and data structures in the field of study, the group contributes by engineering open source implementations targeted to applications such as sequence analysis in Bioinformatics, full-text search in Database Systems, and retrieval of structured documents in Information Retrieval.

The group is a member of Helsinki Institute for Information Technology and Finnish Centre of Excellence for Algorithmic Data Analysis Research.

News

  • The group participates in the new Finnish Centre of Excellence in Cancer Genetics Research led by Academy Professor Lauri Aaltonen, starting 2012. The name of the group will change into genome-scale algorithmics, better describing our current focus.

People

Visitors and alumni

  • Santeri Pietilä, Research Assistant
  • (Spring/Summer 2011)
  • Juha Karjalainen, Research Assistant
  • (Autumn 2010), now PhD student at the University of Groningen.
  • Riku Katainen, Research Assistant (Summer 2008-Summer 2010), now at Aaltonen lab
  • Susana Ladra, PhD student, University of A Coruna, Spain (August - November, 2009)
  • Antti Laaksonen, Research Assistant
  • (summer 2009)
  • Kashyap Dixit, MSc student, IIT Kanpur, India (May 2006 - July 2006)
  • Wolfgang Gerlach, Diploma student, Bielefeld University, Germany (August 2006 - September 2006)
  • Johannes Fischer, PhD student, LMU München, Germany (June 2007)

Collaboration

The project collaborates with several researchers abroad. A close and long-term collaboration is with Professor Gonzalo Navarro from University of Chile.

Software

Recent Publications

  • Paolo Ferragina, Jouni Sirén, and Rossano Venturini: Distribution-Aware Compressed Full-Text Indexes.
    Proc. ESA 2011, Springer LNCS 6942, pp. 760-771, Saarbrücken, Germany, September 5-7, 2011.
  • Jouni Sirén, Niko Välimäki, and Veli Mäkinen: Indexing Finite Language Representation of Population Genotypes.
    Proc. WABI 2011, Springer LNCS 6833, pp. 270-281, Saarbrücken, Germany, September 5-7, 2011.
  • Niko Välimäki, Susana Ladra, and Veli Mäkinen: Approximate All-Pairs Suffix/Prefix Overlaps.
    In Proc. 21st Annual Symposium on Combinatorial Pattern Matching (CPM 2010), Springer-Verlag, LNCS 6129, pp. 76-87, New York, USA, 21-23 June 2010.
    [Article online]
    [Implementation]
  • Jouni Sirén: Sampled Longest Common Prefix Array.
    In Proc. CPM 2010, Springer LNCS 6129, pp. 227-237, New York, USA, June 21-23, 2010.
    [Article]
    [Preprint]
    [Slides]
  • Veli Mäkinen, Niko Välimäki, Antti Laaksonen, and Riku Katainen: Unified View of Backward Backtracking in Short Read Mapping.
    In Ukkonen Festschrift 2010 (Eds. Tapio Elomaa, Pekka Orponen, Heikki Mannila), Springer-Verlag, LNCS 6060, pp. 182-195, 2010.
    [Article online]
    [Implementation]
  • Veli Mäkinen, Gonzalo Navarro, Jouni Sirén, and Niko Välimäki: Storage and Retrieval of Highly Repetitive Sequence Collections.
    Journal of Computational Biology 17(3):281-308, 2010.

  • Diego Arroyuelo, Francisco Claude, Sebastian Maneth, Veli Mäkinen, Gonzalo Navarro, Kim Nguyên, Jouni Sirén, and Niko Välimäki: Fast In-Memory XPath Search over Compressed Text and Tree Indexes.
    In Proc. ICDE 2010, IEEE, pp. 417-428, Long Beach, California, USA, March 1-6, 2010.
  • Johannes Fischer, Veli Mäkinen, and Gonzalo Navarro: Faster Entropy-Bounded Compressed Suffix Trees.
    Theoretical Computer Science, 410(51):5354-5364, 2009.
  • Niko Välimäki, Veli Mäkinen, Wolfgang Gerlach, and Kashyap Dixit: Engineering a Compressed Suffix Tree Implementation.
    ACM Journal of Experimental Algorithmics, 14:4.2-4.23, 2009
    [Preprint]
    [Implementation]
  • Jouni Sirén: Compressed Suffix Arrays for Massive Data.
    In SPIRE 2009, Springer LNCS 5721, pp. 63-74, Saariselkä, Finland, August 25-27, 2009.
  • Veli Mäkinen, Gonzalo Navarro, Jouni Sirén, and Niko Välimäki. Storage and Retrieval of Individual Genomes.
    In Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), Springer-Verlag LNCS 5541, pp. 121-137, Tucson, Arizona, May 18-21, 2009.
    [Article online]
    [Preprint]
    [Implementation]
  • Johannes Fischer, Veli Mäkinen, and Niko Välimäki: Space-Efficient String Mining under Frequency Constraints.
    In Proc. Eighth IEEE International Conference on Data Mining (ICDM 2008), IEEE Computer Society, pages 193-202, Pisa, Italy, December 15-19, 2008.
    [Article online]
    [Preprint]
    [Implementation]
  • Veli Mäkinen and Gonzalo Navarro: Dynamic Entropy-Compressed Sequences and Full-Text Indexes.
    ACM Transactions on Algorithms, 4(3):article 32, 2008.
    [Preprint]

Funding

Previous update: 13.04.2012 - Veli Mäkinen