Discrete Optimization Project (Spring 2012, Lecture Period III)
This is an advanced level course within the specialization area Algorithms and Machine Learning. It is expected that the participants have
taken the course Discrete Optimization preferrable in 2011.
You can refresh your memory on the topic using the 2011 lecture slides.
Contents:
Independent hands-on project work in the areas of constraint satisfaction and discrete optimization, involving problem modelling, algorithm implementation, and/or practical performance evaluation of search and optimization algorithms within a topic chosen by each student. Each student reports his/her project work as a written scientific report and in a presentation given to the other students.
Course Information
- Meetings: Wednesdays at 10-12 in Exactum, seminar room B119, only on the following dates
- Introductory lecture (distribution of topics): January 18
- Half-point sessions (progress checks, feedback): February 1
- Final session (talks by students): February 22
- Course code: 582633
- Credit units: 2 ECTS
- Grading: on the scale 0-5: 90% project work + final report; 10% presentations
- Lecturer: Dr. Matti Järvisalo, matti.jarvisalocs.helsinki.fi
- Language: The course is given in English by default (given that there are participants who do not speak Finnish).
- Prerequisites:
It is expected that the participants have
taken the course Discrete Optimization (preferrable in 2011).
Courses from the CS curriculum which provide useful background include
Design and Analysis of Algorithms and
Introduction to Artificial Intelligence.
Programming skills are also needed.
Course Requirements and Timeline
Note: The course schedule is very tight, so you should start your project work as soon as possible!
January 18: Introductory lecture, distribution of topics.
- You will learn about the possible project topics.
- You should discuss what you want to do with the lecturer.
- You have the opportunity to propose your own topic! This is highly recommended in case you have a specific topic in mind.
January 23 (Monday): deadline for choosing a project topic
- Send your short preliminary project proposal to the lecturer by email.
- What are you planning on doing: what will you implement? What problem will you solve / model? What will you compare against to?
February 1: Half-point meeting
- Make an overview of 2-3 slides on what your are doing and will do.
- Present you plan to the other students in ~5 minutes.
February 22: Final meeting
- Make a 10-15 minute presentation on your project to the other students, explaining what you did, and what the results are.
February 29: Deadline for the project report
- Write a 5-10 page report on your project
- Explain:
- What is the problem you are solving interesting?
- How did you approach the problem?
- Experimental results
- Analysis of the results. Looking back, how would you improve on what you did?
- Submit both the report and and the source code of what you have implemented by email to the lecturer. Also submit a working binary (if applicable) of your implementation, and instructions on how to run the software.
Possible Project Topics
Note: The project topics are open-ended, you can to a great extend choose what you will work on and how. Notice that the lecturer will not assist in problems related to implementation-level work (implementation details/bugs, compilation problems, setting up scripts for experiments, ...). However, you are encouraged to discuss the actual project topic (e.g., which solvers to use / which algorithm(s) to implement / what problem to model / which benchmarks to use in the experiments / what type of an experimental setup to use / etc).
1. Implement a search / optimization procedure
- For SAT, CSP, LP, IP/MIP, or for a specific problem
- Complete search (backtracking/branch-and-bound) or local search
- Implement one or preferrable multiple search heuristics (especially for local search)
- Concrete examples:
- DPLL for SAT, Simplex for LP, branch-and-bound or local search methods for your favourite search or optimization problem (could be e.g. TSP, graph coloring)
- Or implement an optimization procedure on top of a search procedure.
- Concrete example: Take the Minisat SAT solver, and build an optimization procedure on to of it (for example, for MaxSAT or your favourite optimization problem that can be solving using SAT). Use the incremental interface of the solver of possible.
- Experimentally evaluate the efficiency of your implementation by running it on a set of benchmarks
- Bonus: Compare the efficiency of your implementation against other implementations (one or more) available on the web
- Use either randomly generated benchmarks or application benchmark sets available on the web
- Report:
- main aspects of your implementation and explain the heuristics
- results of the experimental evaluation
- Bonus:
try to analyze how your implementation differs to other available implementations of the same / a similar algorithm
2. Try to Improve an available solver implementation
- Select a solver implementation that is available in open source on the web
- Understand how the solver works
- Modify some part(s) of the solver, with the aim of improving the efficiency of the solver. For examples:
- Come up with alternative search heuristics and implement them by modifying the solver
- Try to improve the running time of the solver by re-implementing its data structure to be more efficient
- ...
The modifications can be
aimed at improving the running time of the solver on a specific problem domain
- Concrete example:
- Take the CDCL SAT solver Minisat. Figure out how the solver works. Choose a benchmark set that encodes a specific problem domain in SAT (see, e.g., the benchmarks used in the SAT Competition). Come up with an alternative decision heuristic / restarts schedule / learning scheme, or the like, aimed specifically at the problem domain you chose. Implement your technique.
- Report:
- explaing the new ideas and how you implemented them
- results of the experimental comparison
- try to analyze why your ideas work (or do not work)
3. Problem modelling
- Choose a search or optimization problem of your choice (can be one from the course Discrete Optimization or any other interesting problem, e.g. from bioinformatics).
- Choose a constraint language of your choice (e.g., MIP, IP, LP, SAT, CP), and develop multiple different ways of modelling the chosen problem in the language.
- Or Choose multiple constraint languages, and model the chosen problem in each of the languages.
- Make an experimental evaluation of your encodings using available solvers for the chosen constraint language(s).
- Report:
- how you modelled the problem, and why the models work
- results of the experimental comparison
- try to analyze why one model works better/worse than another
4. Comparison of different types of search / optimization techniques
- Choose a constraint language of your choice
- For examples: SAT, IP/MIP, CSP, SMT, ...
- Choose a set of available solvers for the language, each implementing different types of algorithms (e.g., different types of complete search and local search, different heuristics, ...).
- Understand how the solvers differ form each other
- Choose a set of different available benchmarks in the language (modelling different problems, real-world, puzzles, randomly generated instances).
- Run all the chosen solvers on all the benchmarks.
- Understand why different solvers work better than others on specific types of benchmarks, and why some solvers are particularly bad in solving specific types of benchmarks.
- Report:
- which solvers did you choose and why
- results of the experimental comparison
- an analysis on why different solvers work better than others on specific types of benchmarks, and why some solvers are particularly bad in solving specific types of benchmarks.
5. Repeat the experiments of a scientific article that uses constraint solvers to solve a problem
- Find a recent scientific article that uses constraint solvers (SAT,SMT, CP, IP, MIP, ...) to solve an interesting hard problem efficiently
- Repeat (and possibly extend) the experiments done in the article, answering the following questions:
- Did you get similar results as the ones presented in the paper? If not, why so?
- Try to extend the experimental evaluation by using additional benchmarks (invent your own or/and look for additional benchmarks on the web) to answer: how well does the approach work in a more general setting?
- How would you try to improve on the proposed approach?
- Report:
- what was the problem solved
- how was is solved in the article, and how does the proposed solution compare to previous work
- results of your experimental evaluation
- analysis of the results, answering the
- possibly: your own ideas for further improving the approach suggested in the article
Practical Instructions
- For implementation work, you may use any programming language you want. However, C or C++ is strongly recommended if you are familiar with it. Especially, many available solver implementations are implemented in C or C++.
- For the experimental evaluation, you can use the Ukko cluster.
- You need to write small scripts for running the experiments, so you need to know a bit how to do this.
- Bare in mind that others are also using the cluster simultaneously. Especially:
Make sure that you will not crash a cluster node by e.g. running it out of memory. You should use the ulimit command to restrict both the memory consumption and the running time for your experiments. For example, executing ulimit -v 2000000 -t 600 will restrict the memory consumption to 2 GB and the running time to 600 seconds (10 minutes) for any process you will launch afterwards in the particular node within the same session.
- For measuring the running time of a process, you should use the Unix time command, and report the "user" time reported by the command.
- For writing the report, one possibility is to use the latex style used for Bachelor's theses at the CS department.
Some related tips can be found here (in Finnish, sorry...).
- You should write the report using scientific conventions, including that you provide the necessary references and links to the web sources that you used.
- The page limit for the report is not strict. Especially, if you have a lot of important plots representing the experimental results, you put them in an appendix in the report and use more pages for the appendix. However, include only relevant information in the report.
Relevant Links to Constraint Solvers, Benchmarks, etc