Project in String Processing Algorithms

582668
2
Algorithms and machine learning
Advanced studies
Implementation and experimental comparison of string processing algorithms, and presentation of the results.
Year  Semester  Date          Period  Language  In charge
2012  spring    17.01-24.02.  3-3     English   Veli Mäkinen

Lectures

Time       Room  Lecturer      Date
Thu 10-12  C222  Veli Mäkinen  23.02.2012-23.02.2012

General

The project consists of

  • implementation of one or more string processing algorithms
  • experimental comparison and/or analysis of the algorithm(s)
  • presentation of the results

The project can be done in groups of at most four students. In a group, each student is responsible for specific algorithms, and the group as a whole is responsible for the experiments and the presentation.

The course assumes the String Processing Algorithms course or similar knowledge as background. Topics include string/suffix sorting, extensions and applications of index structures, approximate string matching, etc. In addition, there will be some special topics related to biological sequence analysis, tailored for bioinformatics students.

  • List of topics
  • Other topics related to the String Processing Algorithms course are also possible.
     

 

Completing the course

Algorithm implementation

The algorithms can be implemented in any programming language, provided that the programs can be compiled and executed on the department computers.

The algorithm implementations are returned to the instructor by noon on Fri 17.2. The contributions of each group member should be stated clearly. See the opening slides below for more details.

Experiments

The purpose of the experiments is to determine how the performance of the algorithms changes with different inputs, different parameter settings, different algorithms, etc. Choosing the test data is an important part of the experiments.
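As an illustration only, an experiment of this kind could be organized roughly as in the following sketch. It is not part of the course material: the function names (naive_suffix_array, random_text, measure, run_experiment) are hypothetical, the naive suffix sorting routine is just a stand-in for whatever algorithm a group implements, and random strings stand in for real test data.

```python
import random
import time


def naive_suffix_array(text):
    """Stand-in algorithm: sort suffix start positions by comparing suffixes."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def random_text(length, alphabet, seed):
    """Generate a reproducible random string over the given alphabet."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def measure(algorithm, text, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        algorithm(text)
        best = min(best, time.perf_counter() - start)
    return best


def run_experiment(algorithms, lengths, alphabets, seed=0):
    """Time every algorithm on every (alphabet, length) combination."""
    rows = []
    for alphabet in alphabets:
        for length in lengths:
            text = random_text(length, alphabet, seed)
            for name, algorithm in algorithms.items():
                seconds = measure(algorithm, text)
                rows.append((name, len(alphabet), length, seconds))
    return rows


if __name__ == "__main__":
    results = run_experiment(
        {"naive": naive_suffix_array},
        lengths=[1000, 2000, 4000],
        alphabets=["ab", "acgt"],
    )
    for name, sigma, n, seconds in results:
        print(f"{name}\tsigma={sigma}\tn={n}\t{seconds:.4f}s")
```

Varying one factor at a time (here the input length and the alphabet size) makes it easier to see which property of the input drives the running time. Random strings are only a placeholder; real texts may behave quite differently.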

Presentation

(poster changed into slide show presentation, see below)

Each group gives a presentation of the chosen algorithms, implementations, and results of the experiments at the end of the project. This presentation session will be open to other students and staff of the department.

The presentation session takes place on Thursday 23.2 at 10-12 in C222.

Preliminary program:

  • 10.15 Presentation by JR
  • 10.50 Presentation by HD and DK
  • 11.25 Presentation by AK

Grading

Each part of the project (implementation, experiments, presentation) contributes one third to the total score. In general, the experiment and presentation scores will be the same for all members of a group. The implementation scores will be personal.
 

Important Dates

Weekly meetings in room A239b: JR Tue 14.15, AK Tue 14.45, HD & DK Mon 14.00

The implementation (with documentation) must be returned by noon on 17.2.

The project presentation takes place on Thursday 23.2 at 10-12 in C222.

 

Literature and material

Opening slides: PDF

Test data

Pizza&Chili Corpus and Repetitive Corpus