Seminar on Reinforcement Learning and Information Retrieval
Year | Semester | Dates | Period | Language | Responsible person |
---|---|---|---|---|---|
2017 | spring | 18.01.–03.05. | 3-4 | English | Dorota Glowacka |
Lectures
Time | Room | Lecturer | Dates |
---|---|---|---|
Wed 10-12 | C220 | Dorota Glowacka | 18.01.2017–01.03.2017 |
Wed 10-12 | C220 | Dorota Glowacka | 15.03.2017–03.05.2017 |
General
Completing the course
Any changes to the schedule will be announced on the course pages.
18.1. - 29.1. Introductory lectures
30.1. Deadline for topic selection. Email us the topic you wish to write about and, if it is not on the list, at least two papers that outline it. The deadline is 23:59.
Send it to both of our emails: firstname.lastname at cs.helsinki.fi.
15.2. Presentation of the chosen topic, 5 minutes, ~5 slides.
22.2. Lecture on writing, finding references, and presenting.
You're on your own! We will email you later to ask for the current version of your essay and give you feedback on it by email; the same goes for your presentation slides. Contact us by email if you have any questions, and see the slides below for instructions on writing and presenting. Happy writing!
29.3. Feedback session (send emails if you wish to meet this day).
5.4. Feedback session (send emails if you wish to meet this day).
12.4. Final presentations, part 1. 20 minutes, ~20 slides.
25.4. Final presentations, part 2. 20 minutes, ~20 slides. (Note: the originally scheduled date, 19.4., fell during the university's Easter break.)
26.4. Final presentations, part 3. 20 minutes, ~20 slides.
3.5. Deadline for the final paper submission.
Literature and materials
The slides:
http://www.cs.helsinki.fi/u/jgpyykko/RL-slides.pdf
Writing/presenting instructions: https://docs.google.com/presentation/d/1cDQnmW-RbgGzUW1UMzDXvT_8npFLfGsZN7snz3qquuE/edit?usp=sharing
The book:
http://people.inf.elte.hu/lorincz/Files/RL_2006/SuttonBook.pdf
Writing platforms
Conferences worth noting:
NIPS https://nips.cc/
CIKM http://www.cikmconference.org/
SIGIR http://sigir.org/
RecSys https://recsys.acm.org/
Also check the proceedings from earlier years.
Here are some available topics:
1. Reinforcement learning in music retrieval
X. Wang, Y. Wang, D. Hsu, and Y. Wang. Exploration in interactive personalized music recommendation: A reinforcement learning approach. ACM Trans. Multimedia Comput. Commun. Appl., 11(1):7:1–7:22, Sept. 2014.
2. Markov decision processes in document and web page recommendation
S. Zhang, J. Luo, and H. Yang. A POMDP model for content-free document re-ranking. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR '14, pages 1139–1142, New York, NY, USA, 2014. ACM.
3. Application of Q-learning in information retrieval
B.-T. Zhang and Y.-W. Seo. Personalized web-document filtering using reinforcement learning.
Applied Artificial Intelligence, 15(7):665–685, 2001.
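For readers new to the technique this topic names, the core tabular Q-learning update can be sketched on a toy problem. This is a minimal illustration on a made-up 4-state chain, not an example from the papers above:

```python
import random

# Tabular Q-learning on a toy 4-state chain (states 0..3): action 1
# moves right, action 0 moves left, and reaching state 3 ends the
# episode with reward 1. All states, actions, and constants here are
# invented for the illustration.
N_STATES = 4
ACTIONS = (0, 1)
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Deterministic toy transition: action 1 = right, action 0 = left."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

random.seed(0)
for _ in range(200):                     # episodes
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy behaviour policy.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

With these settings the greedy policy should come out preferring the right-moving action in every non-terminal state, since that is the shortest route to the reward.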
4. Application of bandit algorithms to ranker evaluation
M. Zoghi, S. Whiteson, and M. de Rijke. MergeRUCB: A method for large-scale online ranker evaluation. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, WSDM '15, pages 17–26, New York, NY, USA, 2015. ACM.
B. Brost, Y. Seldin, I. J. Cox, and C. Lioma. Multi-dueling bandits and their application to online ranker evaluation. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 2161–2166. ACM, 2016.
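As background for the bandit-based ranker evaluation papers above, the classic UCB1 index policy, which UCB-style ranker evaluation methods build on, can be sketched on a simulated problem. The "click rates" below are invented for the example:

```python
import math
import random

# UCB1 on a simulated three-armed Bernoulli bandit. Think of each arm
# as a candidate ranker and each reward as a click; the probabilities
# in TRUE_MEANS are hypothetical numbers for this illustration.
random.seed(1)
TRUE_MEANS = [0.2, 0.5, 0.8]
counts = [0] * len(TRUE_MEANS)   # pulls per arm
sums = [0.0] * len(TRUE_MEANS)   # total reward per arm

def ucb1_pick(t):
    """Play each arm once, then pick the argmax of mean + sqrt(2 ln t / n)."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )

for t in range(1, 5001):
    arm = ucb1_pick(t)
    reward = 1.0 if random.random() < TRUE_MEANS[arm] else 0.0
    counts[arm] += 1
    sums[arm] += reward

# The best arm should receive the bulk of the pulls.
best = max(range(len(counts)), key=lambda a: counts[a])
print(best, counts)
```

The confidence term shrinks as an arm is pulled more, so arms that look worse are still revisited occasionally; over time the policy concentrates its pulls on the arm with the highest empirical mean.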
5. Reinforcement learning in image retrieval
S. Hore, L. Tyrvainen, J. Pyykko, and D. Glowacka. A reinforcement learning approach to query-less image retrieval. In International Workshop on Symbiotic Interaction, 2014.
P.-Y. Yin, B. Bhanu, K.-C. Chang, and A. Dong. Integrating relevance feedback techniques for image retrieval using reinforcement learning. IEEE Trans. Pattern Anal. Mach. Intell., 27(10):1536–1551, Oct. 2005.
6. Balancing exploration and exploitation in information retrieval
K. Ahukorala, A. Medlar, K. Ilves, and D. Glowacka. Balancing exploration and exploitation: Empirical parameterization of exploratory search systems. In Proceedings of the 24th ACM International on Conference on Information and ...
K. Hofmann, S. Whiteson, and M. de Rijke. Balancing exploration and exploitation in learning to rank online. In European Conference on Information Retrieval, pages 251–263. Springer, 2011.
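The trade-off this topic studies can be illustrated with the simplest mechanism, epsilon-greedy, on a made-up two-armed bandit. This sketch is not taken from either paper above; all numbers are invented:

```python
import random

# Epsilon-greedy on a hypothetical two-armed Bernoulli bandit.
# epsilon = 0 never explores, epsilon = 1 never exploits; a small
# epsilon balances the two.
def run(epsilon, rounds=3000, seed=2):
    rng = random.Random(seed)
    means = [0.3, 0.7]                # hypothetical per-arm reward rates
    counts = [0, 0]
    sums = [0.0, 0.0]
    total = 0.0
    for _ in range(rounds):
        # Explore with probability epsilon (and until both arms are tried).
        if rng.random() < epsilon or 0 in counts:
            arm = rng.randrange(2)
        # Otherwise exploit the arm with the best empirical mean.
        else:
            arm = max((0, 1), key=lambda a: sums[a] / counts[a])
        r = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += r
        total += r
    return total / rounds

# Compare never exploring, a little exploration, and always exploring.
print(run(0.0), run(0.1), run(1.0))
```

Always exploring earns roughly the average of the two arms, while a small epsilon lets the empirical means identify the better arm and then mostly play it; pure greed risks locking onto the wrong arm after an unlucky first sample.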
7. Online ranking with bandit algorithms
F. Radlinski, R. Kleinberg, and T. Joachims. Learning diverse rankings with multi-armed bandits. In Proceedings of the 25th international conference on Machine learning, pages 784–791. ACM, 2008.
K. Hofmann, S. Whiteson, and M. de Rijke. Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval. Information Retrieval, 16(1):63–90, 2013.
8. Reinforcement learning for news and ad recommendation
L. Li, W. Chu, J. Langford, and R. E. Schapire. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th International Conference on World Wide Web, WWW '10, pages 661–670, New York, NY, USA, 2010. ACM.
S. Yuan and J. Wang. Sequential selection of correlated ads by POMDPs. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, pages 515–524. ACM, 2012.
9. Reinforcement learning in recommender systems
N. Taghipour, A. Kardan, and S. S. Ghidary. Usage-based web recommendations: A reinforcement learning approach. In Proceedings of the 2007 ACM Conference on Recommender Systems, RecSys '07, pages 113–120, New York, NY, USA, 2007. ACM.
P. Kohli, M. Salek, and G. Stoddard. A fast bandit algorithm for recommendation to users with heterogenous tastes. In AAAI, 2013.