3 Concepts: Information | Projects
1) February 21, 12:00
2) March 7, 12:00
3) March 14, 12:00
Final) March 24, 12:00
R O U N D 1 | |||||||||
---|---|---|---|---|---|---|---|---|---|
Decompressor | curve | genes | income | letters | pretty | TOTAL | |||
1. Team Lyytinen | 11'955 | 110'315 | 107'186 | 4'853 | 132'773 | 101'497 | 468'579 | ||
2. Team Heilimo | 19'520 | 99'872 | 109'980 | 4'852 | 155'445 | 80'470 | 470'139 | ||
3. Team Halmetoja | 11'746 | 102'907 | 108'005 | 4'851 | 148'550 | 112'740 | 488'799 | ||
4. Team Ahosola | 13'044 | 112'234 | 107'734 | 4'422 | 148'565 | 125'017 | 511'016 | ||
5. Team Konttori | 1'385 | 129'812 | 127'969 | 7'167 | 148'669 | 125'362 | 540'364 | ||
6. Team Lampi | 482 | 288'894 | 439'885 | 281'370 | 251'502 | 295'299 | 1'557'432 | ||
7. Team Etelävuori | 5'741 | - | - | - | - | - | - |
No clear winner so far. Teams Lyytinen and Heilimo both start strong but with different strengths. Team Lyytinen manages to snag the victory of round 1, but the game is only beginning and I am sure team Lyytinen must continue to work hard to maintain their lead. Interestingly teams' decompression programs are much larger than in the previous competition, but this is understandable because larger datafiles give more leeway. Team Etelävuori did not return a working program for this round, and team Lukk did not return anything.
There will be no new data for round 2. I will study the results and your solutions and perhaps add more data to compress for round 3.
R O U N D 2 | |||||||||
---|---|---|---|---|---|---|---|---|---|
Decompressor | curve | genes | income | letters | pretty | TOTAL | |||
1. Team Ahosola | 4'833 | 88'901 | 107'745 | 4'421 | 116'489 | 78'237 | 400'626 | ||
2. Team Heilimo | 23'706 | 91'551 | 107'752 | 4'853 | 137'951 | 67'740 | 433'553 | ||
3. Team Lyytinen | 15'552 | 110'315 | 107'186 | 4'853 | 132'773 | 72'953 | 443'632 | ||
4. Team Konttori | 4'706 | 91'809 | 108'007 | 4'850 | 115'217 | 125'255 | 449'844 | ||
5. Team Halmetoja | 14'054 | 102'907 | 108'005 | 4'851 | 148'550 | 87'162 | 465'529 | ||
6. Team Etelävuori | 7'372 | 101'074 | 109'972 | 6'106 | 150'809 | 94'958 | 470'291 |
Our epic battle is clearly heating up. Most teams perform better this round than the best result of the previous round. Will the trend continue? Teams Heilimo and Lyytinen continue close each other but team Ahosola surprises them from behind with impressive overall performance and becomes the undisputed champion of round 2.
I must confess that I am a bit surprised by the great differences in the sizes of the decompression programs. Luckily it is time to add more data and more data of course means that decompressor size becomes less important. In the tar below you will find new datafiles markov and herring. After this there will probably be no more data. Will these new datafiles prove to be decisive? Time will tell.
R O U N D 3 | |||||||||
---|---|---|---|---|---|---|---|---|---|
Dec. | curve | genes | herring | income | letters | markov | pretty | TOTAL | |
1. Team Lyytinen | 17'857 | 110'315 | 107'186 | 70'634 | 4'853 | 132'773 | 53'906 | 72'953 | 570'477 |
2. Team Ahosola | 5'357 | 88'901 | 107'745 | 102'579 | 4'421 | 116'489 | 167'050 | 77'181 | 669'723 |
3. Team Konttori | 6'969 | 88'563 | 107'446 | 138'004 | 4'079 | 110'160 | 162'408 | 85'227 | 702'856 |
4. Team Heilimo | 19'679 | 91'551 | 107'752 | 148'564 | 4'853 | 105'616 | 165'703 | 64'985 | 708'703 |
5. Team Halmetoja | 21'546 | 91'812 | 108'005 | 138'008 | 4'851 | 110'343 | 156'732 | 87'162 | 718'459 |
6. Team Etelävuori | 8'330 | 101'074 | 109'972 | - | 6'106 | 150'809 | - | 94'958 | - |
Team Lyytinen returns to lead, this time with much better result than all the rest of the teams. Team Lyytinen's position is far from secure however, as other teams make good progress on other fronts and if they manage to duplicate Team Lyytinen's success with the new datafiles they may easily surpass Team Lyytinen, or at least Team Lyytinen's current results. Maybe other teams just plan to give each other false sense of security before their final lunge. Tune in next week to find out the winner!
No completely working solution from Team Etelävuori this round.
F I N A L R E S U L T S | |||||||||
---|---|---|---|---|---|---|---|---|---|
Dec. | curve | genes | herring | income | letters | markov | pretty | TOTAL | |
1. Team Ahosola | 18'978 | 88'744 | 107'745 | 38'119 | 3'937 | 96'560 | 50'230 | 74'287 | 478'600 |
2. Team Halmetoja | 15'156 | 91'607 | 108'005 | 71'396 | 4'851 | 110'343 | 54'682 | 69'816 | 525'855 |
3. Team Heilimo | 14'106 | 91'551 | 107'752 | 93'795 | 4'853 | 104'840 | 53'814 | 63'209 | 533'920 |
4. Team Konttori | 5'796 | 88'563 | 107'446 | 94'534 | 4'079 | 110'160 | 53'416 | 85'227 | 549'221 |
5. Team Lyytinen | 17'851 | 110'315 | 107'186 | 70'634 | 4'853 | 132'773 | 53'906 | 72'953 | 570'471 |
6. Team Etelävuori | 10'780 | 101'074 | 109'972 | 140'424 | 6'106 | 134'694 | 64'643 | 94'959 | 662'652 |
And the winner is... Team Ahosola! Team Ahosola has the best compression rate in most cases so there is no question of which team is the best this time. The results of the other teams are not too shabby either, as again almost everybody does better than the best result of the previous round. A honorable mention goes to Team Heilimo for the nice tricks they used in compressing pretty.dat.
Congratulations to everybody and particularly to the winners!
Your task is to design, document, and implement a compressor and a decompressor program. All imaginable lossless compression methods are allowed. The compression rate of the program will be evaluated with a given set of data files. However, scoring will not be based solely on compression rate. Instead the size of the decompressor program is added to the size of the compressed files. Using too complex a decompressor program (e.g., a la style ''print "data...";'') will not generally be a good idea.
As a special feature, in some cases both the compressor and the decompressor programs have access to side information related to the file to be compressed. Using the side information is not obligatory, but it makes compression significantly easier, thus reducing the size of the compressed file.
Your documentation, due March 24, should describe
the ideas/observations/principles underlying the compressor and the
decompressor algorithms. A complete project is a tar-file (all links
should be relative and all files in the directory returned)
including
- high-level description (in English) of the algorithms,
- program source for both compressor and
decompressor programs.
The documentation can be delivered as a separate tar-file that can be extracted in the same directory as the programs.
In order to make the sizes of the programs comparable, programming language is Java for all groups (sorry about that). The programs must compile in Linux with the Makefile given in the project template. The usage of the programs must conform with the test script also found in the template. In particular, input is read from the standard input stream and a side information file specified on the command line (e.g., ''java mydecompress sideinfo.sdat <compressed.data >decompressed.data''). See the README file.
The source you return must be the real source of the program. This means the source you work with when modifying the program yourself. That real source must compile without extra steps to other people. You may modify Makefile if you wish.
The form of the report is a WEB-page (the HTML template is included in the project template).
- Download the project template
If you run into problems, email questions to Tomi.
The performance of your algorithm will be evaluated with the given set of data files. The loss-function is given by:
where N is the number of data files, is the size of the decompressor program, and is the size of the compressed version of the ith file.
The competition is iterative, meaning that there are several deadlines. After each deadline your success is evaluated and scores are published. Project points will be based only on the final round scores though.
1. Can the decompression program use other input than stdin and the side information file given on the command line?
No.
2. Can I use Java standard library java.___.___?
Yes.
3. Do I have use to Java standard library java.___.___? My class file gets too big.
No.
4. Making the class files small is a question of good/bad/ugly Java programming skills, not compression!
Unfortunately this is more or less the case. However, we expect the differences due to compression rate to be more significant than the differences due to programming style. If this turns out to be a problem, we will increase the sizes of the data files or think of something else. Besides, think of the bright side of things: We first thought of comparing the sizes of the source files! (See IOCCC.)
5. Can the compressor and decompressor programs create and access temporary files in /tmp or elsewhere?
Yes.
6. Do the programs have to be general purpose applications, or does it suffice to be able to handle the given files?
Just the given files, with the given script (runtest.sh).
3 Concepts: Information |