String Processing Algorithms
Exam
Year | Semester | Date | Period | Language | In charge |
---|---|---|---|---|---|
2011 | autumn | 01.11-08.12. | 2-2 | English | Juha Kärkkäinen |
Lectures
Time | Room | Lecturer | Date |
---|---|---|---|
Tue 12-14 | B222 | Juha Kärkkäinen | 01.11.2011-08.12.2011 |
Thu 12-14 | B222 | Juha Kärkkäinen | 01.11.2011-08.12.2011 |
Mon 12-14 | CK111 | Juha Kärkkäinen | 05.12.2011-05.12.2011 |
Exercise groups
Time | Room | Instructor | Date | Observe |
---|---|---|---|---|
Tue 16-18 | B222 | Juha Kärkkäinen | 31.10.2011—09.12.2011 |
Time | Room | Instructor | Date | Observe |
---|---|---|---|---|
Wed 12-14 | B119 | Juha Kärkkäinen | 31.10.2011—09.12.2011 |
Registration for this course starts on Tuesday 11th of October at 9.00.
General
RESULTS
Exam 15.12.2011: problems | solutions | results
Contact the lecturer for individual feedback.
Separate exams
03.02.2012: problems | solutions | results | grades (Yht = max(KoeS+LhP, KoeS*1.2)
03.04.2012: problems | solutions | grades
14.08.2012: problems | results | grades
FINAL NOTES
- Please give feedback for the course. You may use the anonymous course feedback form or give direct feedback. The feedback will be used for improving this and other courses including the Data Compression Techniques course in the next period.
- Don't forget the Project in String Processing Algorithms in the next period.
The course introduces basic algorithms and data structures for string processing including: exact and approximate string matching, string sorting, dictionary data structures and text indexing.
Together with Project in String Processing Algorithms (Period III) this course is one of the three elective course pairs in the subprogram of Algorithms and Machine Learning.
The course is also useful for students in the Master's degree program for Bioinformatics, particularly for those interested in biological sequence analysis.
The students are expected to have basic knowledge on algorithms, data structures, finite automata and algorithm analysis. Recommened prequisite course are Data Structures, Models of Computation, and Design and Analysis of Algorithms.
A useful follow-up course is Data Compression Techniques (Period III).
Completing the course
The course consists of lectures, exercises and an exam.
Lectures: Tue and Thu 12-14 in B222.
Attending lectures is not obligatory but it is useful. Lecture notes covering key facts will be posted on this page, but there will be additional examples and explanations during the lectures.
Exercises: Tue 16-18 in B222.
Exercise problems will be available on this page about a week before each exercise session. Model solutions will be available after the exercise session.
The students should solve the problems at home and be prepared to present their solutions at the exercise session. The students are not required to solve all problems, but additional points are awarded according to how many problems have been solved:
- 7 marked problems is required and gives 1 point.
- 29 marked problems gives the maximum of 10 points.
- 10, 12, 15, 17, 19, 22, 24, 27 are the minimums for 2-9 points.
Please check the list of marked problems for your final saldo.
Exam: Thu 15.12. at 16:00 in A111.
The course exam consists of problems similar to those in the exercises. No notes or other material is allowed in the exam.
- The maximum score is 50 points.
- At least 25 points is required to pass the course.
Passing the course or raising the grade is possible later at a renewal exam (where the exercise points affect the grade as with the course exam) and at separate exams (where exercise points are ignored).
See last year's course for examples of exam problems.
Grading
The grading is based on the points from the exercises and the exam.
- 30 points is required to pass and gives the lowest grade 1.
- 50 points gives the highest grade 5.
Literature and material
The course does not follow any book. The books can be useful supplementary material but are not required reading.
All essential material can be found in the lecture notes and in the exercise problems and solutions.
Lectures notes
- Lecture 01.11. (pages 1-26): PDF | PS (4 slides/page)
- Lecture 03.11. (pages 27-40): PDF | PS (4 slides/page) Errata: wrong symbols k and P in Alg. 2.2. Change: ternary trees renamed ternary tries and slides 41-46 moved to Lecture 3.
- Lecture 08.11. (pages 41-65): PDF | PS (4 slides/page)
- Lecture 10.11. (pages 66-80): PDF | PS (4 slides/page)
- Lecture 15.11. (pages 81-92): PDF | PS (4 slides/page)
- Lecture 17.11. (pages 93-108): PDF | PS (4 slides/page)
- Lecture 22.11. (pages 109-124): PDF | PS (4 slides/page)
- Lecture 24.11. (pages 125-140): PDF | PS (4 slides/page) Erratum: wrong algorithm number on slide 136. Added slides 137-140.
- Lecture 29.11. (pages 141-164): PDF | PS (4 slides/page)
- Lecture 01.12. (pages 165-180): PDF | PS (4 slides/page)
- Lecture 01.12. (pages 181-189): PDF | PS (4 slides/page)
Exercise problems and solutions
- Exercise 1 (01.11.) Problems
- Exercise 2 (08.-09.11.) Problems
- Exercise 3 (15.-16.11.) Problems
- Exercise 4 (22.-23.11.) Problems
- Exercise 5 (29.-30.11.) Problems
- Exercise 6 (07.12.) Problems
Related books and articles
- M. Crochemore, C. Hancart and T. Lecroq. Algorithms on Strings. Cambridge University Press, 2007.
- M. Crochemore and W. Rytter. Jewels of Stringology. World Scientific Publishing, 2002.
- D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
- G. Navarro and M. Raffinot. Flexible Pattern Matching in Strings. Cambridge University Press, 2002.
- B. Smyth. Computing Patterns in Strings. Addison Wesley, 2003.
- G. Navarro. A Guided tour to approximate string matching. ACM Computing Surveys 33(1), 2001, pp. 31 - 88.
- S.J. Puglisi, W.F. Smyth, A.H. Turpin. A Taxonomy of Suffix Array Construction Algorithms. ACM Computing Surveys 39(2), Article 4, 2007.
Links