Matti Kääriäinen
I have moved to NRC Ruoholahti (new homepage). I continue to serve the University of Helsinki as an Adjunct Professor in Computer Science. University email: matti.kaariainen at cs.helsinki.fi.
Sinuhe:
I've developed a machine translation system called Sinuhe. For details, see the paper and the source release below.
- Sinuhe -- Statistical Machine Translation using a Globally Trained Conditional Exponential Family Translation Model, EMNLP'09 (Aug 2009). [slides]
- Latest source release (GPLv3)
- Pre-trained models for some European languages
- Old releases and models
Publications:
- An analysis of reduced error pruning, Journal of Artificial Intelligence Research 15 (Sept. 2001) 163-187. (With T. Elomaa)
- On the practice of branching program boosting. In L. De Raedt & P. Flach (eds.), Machine Learning: ECML 2001, Proc. Twelfth European Conference (pp. 133-144). Lecture Notes in Artificial Intelligence 2167. Springer, 2001. (With T. Elomaa)
- The difficulty of reduced error pruning of leveled branching programs, AI&M 7-2002, Seventh International Symposium on Artificial Intelligence and Mathematics. (With T. Elomaa) Extended journal version in Annals of Mathematics and Artificial Intelligence 41 (1), pages 111-124, May 2004.
- Progressive Rademacher sampling. Proc. 18th National Conference on Artificial Intelligence, AAAI-2002 (pp. 140-145). AAAI Press & MIT Press, 2002. (With T. Elomaa)
- Reduced Error Pruning of Branching Programs Cannot Be Approximated to within a Logarithmic Factor, Information Processing Letters 87, 2 (2003) 73-78. (With R. Nock and T. Elomaa)
- Rademacher penalization over decision tree prunings. In N. Lavrac, D. Gamberger, H. Blockeel & L. Todorovski (eds.), Machine Learning: ECML 2003, Proc. 14th European Conf. (pp. 193-204). LNAI 2837. Springer, 2003. (With T. Elomaa)
- Relating the Rademacher and VC bounds. Department of Computer Science, Series of Publications C, Report C-2004-57, 2004.
- Selective Rademacher penalization and reduced error pruning of decision trees. Journal of Machine Learning Research, volume 5: 1107-1126, 2004. (With T. Malinen and T. Elomaa)
- Generalization error bounds using unlabeled data, In Learning Theory: 18th Annual Conference on Learning Theory, COLT '05 (pp. 127-142).
- A comparison of tight generalization error bounds, In The 22nd International Conference on Machine Learning (ICML 2005). (With John Langford) For software, look here.
- Semi-Supervised Model Selection Based on Cross-Validation, Special Session on Model Selection, IJCNN'06. Previously appeared as a Technical Report TR-05-010 at the International Computer Science Institute (ICSI), 2005.
- Active Learning in the Non-realizable Case, to appear at ALT, 2006.
Workshop presentations:
- Using Unlabeled Data in Generalization Error Bounds, (Ab)use of Bounds workshop, NIPS 2004, Whistler, Canada.
- On active learning in the non-realizable case, oral presentation at the Foundations of Active Learning workshop at NIPS, Dec 2005.
- Active learning under arbitrary distributions, poster at the Value of Information in Inference, Learning, and Decision-Making workshop at NIPS, Dec 2005. (Joint work with Claire Monteleoni)
- Lower bounds for reductions, oral presentation at the Atomic Learning workshop at TTI-C, March 2006. (Joint work with John Langford)
- Learning (and reliability), oral presentation at the Theory of Networked computation workshop, ICSI, Berkeley, March 2006.
Theses:
- M.Sc. thesis (in Finnish): Koneoppimismenetelmien yleistysvirheen data- ja algoritmiriippuvainen analyysi (Data and algorithm dependent generalization error analysis of machine learning methods), Department of Computer Science, University of Helsinki, Report C-2002-40, February 2002. (Awarded a Pro Gradu prize by the Faculty of Science)
- Ph.D. thesis: Learning Small Trees and Graphs that Generalize, Department of Computer Science, University of Helsinki, Report A-2004-7, September 2004.
Other:
- Foundations of Active Learning workshop at NIPS 2005.

