String Processing Algorithms
5
Algorithms and machine learning
Advanced studies
Basic algorithms and data structures for string processing: exact and approximate string matching, string sorting, dictionary data structures, text indexing.
Lectures
Time | Room | Lecturer | Date |
---|---|---|---|
Tue 12-14 | B222 | Juha Kärkkäinen | 02.11.2010-09.12.2010 |
Thu 12-14 | B222 | Juha Kärkkäinen | 02.11.2010-09.12.2010 |
Exercise groups
Time | Room | Instructor | Date | Observe |
---|---|---|---|---|
Thu 14-16 | B119 | Juha Kärkkäinen | 08.11.2010—10.12.2010 |
Time | Room | Instructor | Date | Observe |
---|---|---|---|---|
Thu 16-18 | BK107 | Juha Kärkkäinen | 08.11.2010—10.12.2010 |
Registration for this course starts on Tuesday 12th of October at 9.00.
General
The course introduces basic algorithms and data structures for string processing including: exact and approximate string matching, string sorting, dictionary data structures and text indexing.
Together with Project in String Processing Algorithms (Period III) this course is one of the three optional course pairs in the subprogram of Algorithms and Machine Learning.
The course is also useful for students in the Master's degree program for Bioinformatics, particularly for those interested in biological sequence analysis.
Literature and material
Lecture notes
All lecture notes: PDF / PS (4 slides/page). Erratum: Algorithm 4.22: Corrected lines (3) and (4).
- Lecture 02.11. (pages 1-20): PDF / PS (4 slides/page). Erratum: Definition of border in Definition 0.1 changed to: If x is both a prefix and a suffix of w, then x is a border of w.
- Lecture 04.11. (pages 21-35): PDF / PS (4 slides/page)
- Lecture 09.11. (pages 36-50): PDF / PS (4 slides/page)
- Lecture 11.11. (pages 51-73): PDF / PS (4 slides/page)
- Lecture 16.11. (pages 74-92): PDF / PS (4 slides/page)
- Lecture 18.11. (pages 93-106): PDF / PS (4 slides/page)
- Lecture 23.11. (pages 107-126): PDF / PS (4 slides/page) Changes: added explanation on page 116 and small corrections on pages 122,123 and 126.
- Lecture 25.11. (pages 127-135): PDF / PS (4 slides/page) Erratum: Algorithm 3.34, line (8): comparison direction corrected to "rlcp > RLCP[mid]"
- Lecture 30.11. (pages 136-151): PDF / PS (4 slides/page)
- Lecture 02.12. (pages 152-164): PDF / PS (4 slides/page) Erratum: Lemma 4.9: "SA" changed to "SA-1"
- Lecture 07.12. (pages 165-180): PDF / PS (4 slides/page) Erratum: Algorithm 4.22: Corrected lines (3) and (4).
- Lecture 09.12. (pages 181-185): PDF / PS (4 slides/page)
Exercises
- Exercise 11.11. Problems | Solutions
- Exercise 18.11. Problems | Solutions
- Exercise 25.11. Problems | Solutions
- Exercise 02.12. Problems | Solutions
- Exercise 09.12. Problems | Solutions
Related books and articles
- M. Crochemore and W. Rytter. Jewels of Stringology. World Scientific Publishing, 2002.
- D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
- G. Navarro and M. Raffinot. Flexible Pattern Matching in Strings. Cambridge University Press, 2002.
- B. Smyth. Computing Patterns in Strings. Addison Wesley, 2003.
- G. Navarro. A Guided tour to approximate string matching. ACM Computing Surveys 33(1), 2001, pp. 31 - 88.
- S.J. Puglisi, W.F. Smyth, A.H. Turpin. A Taxonomy of Suffix Array Construction Algorithms. ACM Computing Surveys 39(2), Article 4, 2007.
Links