Seminar on Data Compression
Year | Semester | Date | Period | Language | In charge |
---|---|---|---|---|---|
2017 | spring | 16.01-01.05. | 3-4 | English | Juha Kärkkäinen |
Lectures
Time | Room | Lecturer | Date |
---|---|---|---|
Mon 14-16 | C220 | Juha Kärkkäinen | 16.01.2017-27.02.2017 |
Mon 14-16 | C220 | Juha Kärkkäinen | 13.03.2017-24.04.2017 |
General
Lempel-Ziv (LZ) parsing (discovered 40 years ago) is one of the most important algorithmic tools in data compression, both in theory and in the real world.
Completing the course
Mon 30.1.2017
- Introductory lectures on LZ77 compression.
- Assignment 1: Each student is assigned three open-source LZ compressors from the Squash Compression Benchmark. The tasks are:
- Summarize the performance (compression and decompression speed and compression ratio) of your assigned compressors in the Squash benchmark. Note that the benchmark includes many different files and multiple machines, and some compressors have multiple modes/levels. Is the compressor exceptionally good at something?
- Take a brief look at the code and the documentation of each assigned compressor. How easy or difficult would it be to find out exactly what the compressor does and how it achieves its performance?
- Add your comments on the compressors to the seminar Moodle.
Mon 6.2.2017: Informal meeting for discussion and questions. Not mandatory.
Mon 13.2.2017: Mandatory attendance.
- Oral reports on Assignment 1
- Assignment 2: Each student selects one of the compressors (or groups of compressors) from the "Compressors for Assignment 2" list below or an article about LZ compression and prepares a report and a presentation on the compressor/article. Send your three preferences (compressors and/or articles) to Juha and Simon by Friday, 17.02.
Mon 20.2.: Informal meeting for discussion and questions. Not mandatory.
Mon 27.2.: Informal meeting for discussion and questions. Not mandatory.
Mon 6.3.: Informal meeting for discussion and questions. Not mandatory.
Mon 13.3.: Presentations. Mandatory attendance.
- Tuukka Paukkunen: Density
- Peter Goetsch: Pithy & Snappy
Mon 20.3.: Presentations. Mandatory attendance.
- Jiri Hamberg: wfLZ
- Simo Salmirinne: Brotli
- Yan He: LZHAM
- Jasu Viding: BriefLZ, yalz77, LZJB
Mon 27.3.: Presentations. Mandatory attendance.
- Matti Pulli: Gipfeli
- Pekka Väänänen: LZ4
- Olavi Lintumäki: LZ4
Mon 3.4.: Informal meeting for discussion and questions. Not mandatory.
Mon 10.4.: Deadline for reports on compressors.
Presentations. Mandatory attendance.
- Mikko Määttä: The use of asymmetric numeral systems as an accurate replacement for Huffman coding
- Kalle Lammenoja: Massively-Parallel Lossless Data Decompression
- Timo Mäki: A Fast Implementation of Deflate
Mon 17.4.: Easter. No meeting.
Mon 24.4.: Presentations. Mandatory attendance.
- Bella Zhukova: Bicriteria data compression: efficient and usable
- Joonas Nietosvaara: Effective Construction of Relative Lempel-Ziv Dictionaries
Mon 1.5.: Deadline for reports on articles.
Literature and material
Compressors for Assignment 1:
- BriefLZ, LZ4, Zstandard
- CSC, LZHAM, LZO
- Brotli, Gipfeli, LZMA
- FastLZ, LZG, zlib-ng
- Density, Pithy, CRUSH
- LZJB, ms-compress, Snappy
- LZF, QuickLZ, wfLZ
- FastLZ, LZO, yalz77
- LZF, LZMA, Zstandard
- LZ4, QuickLZ, Snappy
- Brotli, Gipfeli, wfLZ
- CRUSH, LZG, zlib-ng
- Density, LZHAM, Pithy
- BriefLZ, ms-compress, CSC
- LZ4, LZJB, yalz77
Compressors for Assignment 2:
- Density
- Pithy and Snappy (two related compressors)
- LZHAM
- Gipfeli
- Brotli
- wfLZ
- zlib-ng
- CSC
- LZ4 (possible to work with a partner)
- LZMA (possible to work with a partner)
- BriefLZ, yalz77, LZJB (three simple compressors)
- Zstandard (possible to work with a partner)
Articles for Assignment 2:
- Danny Harnik, Ety Khaitzin, Dmitry Sotnikov, Shai Taharlev: A Fast Implementation of Deflate. Data Compression Conference (DCC), 2014, 223-232.
- Jarek Duda, Khalid Tahboub, Neeraj J. Gadgil, Edward J. Delp: The use of asymmetric numeral systems as an accurate replacement for Huffman coding. Picture Coding Symposium (PCS), 2015, 64-69.
- Evangelia A. Sitaridi, René Müller, Tim Kaldewey, Guy M. Lohman, Kenneth A. Ross: Massively-Parallel Lossless Data Decompression. 45th International Conference on Parallel Processing (ICPP), 2016, 242-247.
- Kewen Liao, Matthias Petri, Alistair Moffat, Anthony Wirth: Effective Construction of Relative Lempel-Ziv Dictionaries. 25th International Conference on World Wide Web (WWW), 2016, 807-816.
- Andrea Farruggia, Paolo Ferragina, Rossano Venturini: Bicriteria Data Compression: Efficient and Usable. 22nd Annual European Symposium on Algorithms (ESA), 2014, 406-417.