Project in String Processing Algorithms

582668
2
Algorithms and machine learning
Advanced studies
Implementation and experimental comparison of string processing algorithms, and presentation of the results.

Exam

02.32
Year Semester Date Period Language In charge
2011 spring 18.01-22.02. 3-3 English Juha Kärkkäinen

Lectures

Time Room Lecturer Date
Tue 12-14 B119 Juha Kärkkäinen 18.01.2011-22.02.2011

General

The project consists of

  • implementation of one or more string processing algorithms
  • experimental comparison and/or analysis of the algorithm(s)
  • presentation of the results as a poster

The project can be done in groups of at most four students. In a group, each student is responsible for specific algorithms, and the group as a whole is responsible for the experiments and the poster.

Suitable topics include:

  • exact string matching
  • multiple exact string matching
  • approximate string matching
  • string sorting
  • search trees for strings

Other topics are possible too.
 

Completing the course

Algorithm implementation

The algorithms can be implemented in any programming language, provided that the programs can be compiled and executed on the department computers. Members of a group should use the same language.

The algorithm implementations must be returned to the instructor by noon on Fri 18.2. In a group, each student returns her or his implementations separately. See the opening slides below for more details.

Experiments

The purpose of the experiments is to determine how the performance of the algorithms changes with different inputs, different parameter settings, different algorithms, and so on. An important part of this is choosing the test data.
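As an illustration only, an experiment of this kind could be organised as in the following Python sketch. It is not part of the course material. The function names, the random test texts, the alphabet sizes and the text lengths are all assumptions made for the example. It times a naive exact string matching routine against Python's built-in str.find on texts of varying length and alphabet size.

```python
import random
import time


def naive_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            return i
    return -1


def builtin_search(text: str, pattern: str) -> int:
    """Baseline: Python's built-in substring search."""
    return text.find(pattern)


def random_text(length: int, alphabet: str, seed: int) -> str:
    """Generate a reproducible random text over the given alphabet."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def best_time(func, text: str, pattern: str, repeats: int = 3) -> float:
    """Run func several times and return the smallest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(text, pattern)
        best = min(best, time.perf_counter() - start)
    return best


def run_experiment(lengths=(1_000, 10_000), alphabets=("ab", "abcdefgh"),
                   pattern_length=8):
    """Time each algorithm on each (alphabet, text length) combination."""
    rows = []
    for alphabet in alphabets:
        for n in lengths:
            text = random_text(n, alphabet, seed=n)
            # Take the pattern from the end of the text so that it is
            # guaranteed to occur somewhere in the text.
            pattern = text[-pattern_length:]
            for name, func in (("naive", naive_search),
                               ("builtin", builtin_search)):
                rows.append((name, len(alphabet), n,
                             best_time(func, text, pattern)))
    return rows


if __name__ == "__main__":
    for name, sigma, n, seconds in run_experiment():
        print(f"{name:8s} sigma={sigma:<3d} n={n:<8d} {seconds:.6f} s")
```

Random texts are used here only to keep the sketch self-contained. Real experiments would more likely use files such as those listed under Test data, and would vary the pattern length as well.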

Poster

The results of the experiments are presented as a poster. There will be an open poster presentation session where other students and staff of the department can come to view the posters and ask questions.

The poster may be A0 or A1 size, split into A3 or A4 sheets (see the example LaTeX poster below), or a collection of separate A3 or A4 sheets. Poster boards and pins are provided. The boards are large enough to hold an A0 in either orientation.

The poster session takes place Wed 2.3. at 11-15 in B222. Preliminary program:

  • 11-12 Assembling poster boards and setting up the posters
  • 12-13 Lunch break and viewing other posters
  • 13-14 Poster session open to public
  • 14-15 Taking down posters

Grading

Each part of the project (implementation, experiments, poster) contributes one third to the total score. In general, the experiment and poster scores will be the same for all members of a group. The implementation scores will be personal.
 

Important Dates

The implementation (with documentation) must be returned by noon on Friday 18.2.

The project presentation takes place on Wednesday 2.3. at 11-15 in B222.

 

Results

The three columns "Työt" correspond to the three parts of the project: implementation, experiments, and poster. The maximum for each part is 12 points.

Literature and material

Opening slides: PDF | PS (4 slides/page)

Example poster: example_poster.tgz (unpack with tar xvzf example_poster.tgz and see the file README)

 

Test data

Pizza&Chili Corpus and Repetitive Corpus