Data Compression Techniques

582487
5
Algorithms and machine learning
Advanced studies
Techniques for data compression with an emphasis on text compression including: Huffman and arithmetic coding, Ziv-Lempel compression, Burrows-Wheeler transform and compressed text indexes.

Exam

05.03.2015 16.00 B123
Year Semester Date Period Language In charge
2015 spring 13.01-26.02. 3-3 English Simon Puglisi

Lectures

Time Room Lecturer Date
Tue 14-16 B222 Simon Puglisi 13.01.2015-26.02.2015
Thu 14-16 B222 Simon Puglisi 13.01.2015-26.02.2015

Exercise groups

Group: 1
Time Room Instructor Date Observe
Wed 10-12 B222 Anna Kuosmanen 19.01.2015—27.02.2015

Literature and material

 
ATTENTION!!!: The project course (Project in Data Compression Techniques) has it's first meeting on Friday, 13.03.2015. Hope to see you there. The meeting will give an overview of what is expected.
 
The course does not follow any book. All essential material can be found in the lecture notes and in the exercise problems and solutions.
Lecture Notes
1. Lecture 13.01: PDF.
2. Lecture 15.01: PDF.
3. Lecture 20.01: PDF.
4. Lecture 22.01: PDF.
5. Lecture 27.01: PDF.
6. Lecture 29.01: PDF.
7. Lecture 03.02: PDF.
8. Lecture 10.02.: PDF (old slides)
9. Lecture 12.02.: PDF (old slides)
10. Lecture 19.02: PDF.
11. Lecture 24.02.: PDF
 
For the slides of the previous incarnation of the course, see http://www.cs.helsinki.fi/courses/582487/2012/k/k/1 .
 
Exercises
1. Exercise Sheet 1, 21.01: PDF. [solutions]
2. Exercise Sheet 2, 28.01: PDF. [solutions]
3. Exercise Sheet 3, 04.02: PDF. [solutions]
4. Exercise Sheet 4, 11.02: PDF. [solutions]
5. Exercise Sheet 5. 18.02.: PDF. [solutions]
6. Exercise Sheet 6, 25.02: PDF[solutions]

Topics for exam:

From Simon:
- Integer codes: Elias gamma, Vbyte, Simple9 (word-aligned), interpolative
- LZ77
- RLZ
- level-order representation of binary trees

From Travis:
- the definition of entropy
- the statement (but not the proof) of Shannon's Noiseless Coding Theorem
- the statement (and some idea of the proof) of the Kraft Inequality
- Huff man's Algorithm (and some idea of the proof of correctness)
- how to make a prefix-free code canonical
- move-to-front (but not the analysis)
- arithmetic coding (but not the analysis)
- how to compute and invert the BWT (and have some idea of how it works)
- wavelet trees (but not the space analysis)
- FM-indexes (backward stepping, counting)
- bounds for RMQs

From both:
- applying rank and select on bitvectors (but without having to know all the details of the succinct/compressed solutions)

 

FINAL NOTE: Please give feedback for the course. You may use anonymous feedback form or give direct feedback.

ATTENTION!!!: The project course (Project in Data Compression Techniques) has it's first meeting on Friday, 13.03.2015. Hope to see you there. The meeting will give an overview of what is expected.