Reinforcement Learning Project

582376
4
Algorithms and Machine Learning
Advanced studies
Year  Semester  Dates         Period  Language  Person in charge
2014  Spring    15.01-23.04.  3-4     English   Dorota Glowacka

Lectures

Time       Room  Lecturer         Date
Wed 14-16  B119  Dorota Glowacka  15.01.2014-15.01.2014
Wed 14-16  B121  Dorota Glowacka  22.01.2014-22.01.2014

General

Reinforcement learning (RL) is a branch of machine learning that specializes in control problems and online learning problems, in which the learner receives a single example at a time. RL agents have been successfully applied in robotics, game AI (Go, chess, strategy games) and personalizing the user experience.

It is recommended to take the project alongside the seminar Reinforcement Learning and Its Applications (http://www.cs.helsinki.fi/courses/58314105/2014/k/s/1).
Most of the theoretical aspects of reinforcement learning will be covered in the seminar, which will also discuss a number of practical applications.

The course has one mandatory meeting at the beginning of the project and one optional presentation meeting at the end. Individual meetings and help will be available at regular intervals.

Assignment description:

During the project, each student chooses a subject, discusses it with the instructor and implements it. The subject may be one of those suggested below or one of the student's own choosing.
Students are required to submit a working implementation with documentation covering the topic, usage instructions and evidence that it works.

Completing the course

First meeting on 15.1. at 14-16, after the seminar lecture in B119 (we leave together after the lecture). You can enroll during this meeting.

Second meeting on 22.1. at 14-16, after the seminar lecture in B121!

The deadline for the topic description is 26.1. Send it in the same email as the seminar's topic description, along with the topics you are interested in seeing during the next lecture. The description should include a plan covering the environment you will use, how you frame it as an MDP/RL problem (what the states, actions, rewards and transitions are, what you expect to happen during learning, etc.), the algorithms you may use and the resources you will use to do this.
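As an illustration of what framing an environment as an MDP can look like, here is a minimal sketch in Python. The corridor world and every name in it (N_STATES, ACTIONS, transition, reward, run_episode) are hypothetical and chosen only for this example; your own project will have its own states, actions, rewards and transitions.

```python
# Hypothetical 1-D corridor: states 0..4, the goal is the rightmost state.
N_STATES = 5
ACTIONS = ("left", "right")
GOAL = N_STATES - 1


def transition(state, action):
    """Deterministic transition: move one cell, staying inside the corridor."""
    step = 1 if action == "right" else -1
    return min(max(state + step, 0), GOAL)


def reward(state, action, next_state):
    """+1 for reaching the goal, a small step cost otherwise (assumed values)."""
    return 1.0 if next_state == GOAL else -0.01


def is_terminal(state):
    return state == GOAL


def run_episode(policy, start=0, max_steps=100):
    """Roll out one episode and return the list of (s, a, r, s') tuples."""
    state, trajectory = start, []
    for _ in range(max_steps):
        if is_terminal(state):
            break
        action = policy(state)
        next_state = transition(state, action)
        trajectory.append((state, action, reward(state, action, next_state), next_state))
        state = next_state
    return trajectory
```

Calling run_episode(lambda s: "right") rolls out the policy that always moves right. Writing down what you expect such a policy to earn is one way to state "what you expect to happen with learning" in the description.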

The objective of the project is to implement a reinforcement learning agent in a suitable environment. The topic may be any application introduced during the seminar, and you are encouraged to choose the same topic as in the seminar. Robots are available if you wish to use them: http://lego.wikia.com/wiki/8527_Mindstorms_NXT

The course runs for the whole spring alongside the seminar. Like the seminar, the schedule is divided into two parts:

5.5. Deadline for a working environment with a simple reinforcement learning framework in place. For example, if you chose to do a project with robots, the robot has been programmed to move and perceive the environment, and it handles states, actions and rewards. Presentation slots for your work will be available this week; the dates will be announced later.
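A "simple reinforcement learning framework" at this stage could look roughly like the sketch below: an environment with reset and step, an agent with act and learn, and a training loop connecting them. The CorridorEnv class is a hypothetical stand-in for your robot or simulator, and tabular Q-learning is just one possible choice of algorithm, not a requirement of the course.

```python
import random
from collections import defaultdict


class CorridorEnv:
    """Hypothetical stand-in for a robot or simulator: walk right to the goal."""

    def __init__(self, length=6):
        self.length = length
        self.actions = [0, 1]  # 0 = left, 1 = right
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class QLearningAgent:
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # maps (state, action) -> value estimate
        self.rng = random.Random(seed)

    def greedy(self, state):
        best = max(self.q[(state, a)] for a in self.actions)
        ties = [a for a in self.actions if self.q[(state, a)] == best]
        return self.rng.choice(ties)  # break ties at random

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return self.greedy(state)  # exploit

    def learn(self, state, action, reward, next_state, done):
        target = reward
        if not done:  # bootstrap from the best next-state estimate
            target += self.gamma * max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(env, agent, episodes=200, max_steps=200):
    """Run the agent-environment loop and return the trained agent."""
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            agent.learn(state, action, reward, next_state, done)
            state = next_state
            if done:
                break
    return agent
```

Keeping the environment behind reset and step means the same agent code can later be pointed at a real robot, as long as the robot wrapper exposes the same two methods.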

9.5. Deadline for a complete implementation and report. For example, the robot now implements a simple model-based reinforcement learning agent, and the report allows the instructor to reproduce and deploy the robot in 15 minutes so that he can impress an audience (i.e., the report should be clear and informative, giving a practical overview of how the project works).
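To give an idea of what a "simple model-based reinforcement learning agent" might involve, the sketch below estimates a transition and reward model from random exploration and then plans on that model with value iteration. The corridor dynamics, the 80% success probability and all function names are assumptions made for this example only; other model-based approaches are equally valid.

```python
import random
from collections import defaultdict

# Hypothetical corridor: states 0..4, goal at 4; actions 0 = left, 1 = right.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4


def env_step(state, action, rng):
    """Dynamics hidden from the agent: a move succeeds with probability 0.8."""
    if rng.random() < 0.8:
        state = min(max(state + (1 if action == 1 else -1), 0), GOAL)
    return state, (1.0 if state == GOAL else 0.0)


def collect_model(n_steps=5000, seed=0):
    """Estimate transition counts and summed rewards from random exploration."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': visits}
    reward_sum = defaultdict(float)                 # (s, a) -> summed reward
    state = 0
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)
        next_state, reward = env_step(state, action, rng)
        counts[(state, action)][next_state] += 1
        reward_sum[(state, action)] += reward
        state = 0 if next_state == GOAL else next_state  # restart after the goal
    return counts, reward_sum


def plan(counts, reward_sum, gamma=0.9, sweeps=200):
    """Value iteration on the estimated model; returns values and a greedy policy."""
    values = [0.0] * N_STATES

    def q_value(s, a):
        total = sum(counts[(s, a)].values())
        if total == 0:
            return 0.0  # this pair was never tried, so there is no estimate
        expected_next = sum(n * values[s2] for s2, n in counts[(s, a)].items()) / total
        return reward_sum[(s, a)] / total + gamma * expected_next

    for _ in range(sweeps):
        for s in range(GOAL):  # the goal is terminal and keeps value 0
            values[s] = max(q_value(s, a) for a in ACTIONS)
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a)) for s in range(GOAL)}
    return values, policy
```

A report could include a short run such as values, policy = plan(*collect_model()) together with the resulting policy, so that the instructor can repeat it quickly.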

 

The course is over, thanks for participating! The results will be visible after 9.6. Give feedback to Joel if anything came up.

Literature and material

This list will change as material is revised and discovered. Stay tuned, and feel free to ask Joel Pyykkö about specific topics!

Literature can be found on the seminar's pages. Further reading on the basics can be found, for example, on Scholarpedia: http://www.scholarpedia.org/article/Reinforcement_learning

Libraries and sources:

https://pypi.python.org/pypi/Reinforcement-Learning-Toolkit/1.0 (Python)

http://pybrain.org/ (Python)

http://acl.mit.edu/RLPy/ (Python)

http://web.mst.edu/~gosavia/mrrl_website.html (MATLAB)