Statistical Machine Translation

582375
3
Algorithms and machine learning
Advanced studies
Year Semester Date Period Language Person in charge
2013 autumn 18.11-19.12. 2-2 English Roman Yangarber

Those wishing to attend the course are asked to contact Atro Voutilainen (first.last at helsinki dot fi).

General

STATISTICAL MACHINE TRANSLATION: COURSE AND LECTURE

Invited by the BAULT (Building and use of language technology) consortium, Dr. Christer Samuelsson (DFKI, Germany) will give a guest lecture (details to follow) and a course on SMT from Nov. 18 to Dec. 19: "Let's fake an SMT using Unix, HMMs, and GAs".

The course consists of 16 sessions of 90 minutes each (lectures, demonstrations, programming exercises) and is intended for students and researchers with (Unix) programming skills and an interest in Language Technology and Machine Translation. The sessions take place on the Center Campus:
- Mon 16-18 (18.11., 25.11., 2.12., 9.12., 16.12.) room A112 (Metsätalo)
- Tue 14-16 (19.11.) room A112 (Metsätalo)
- Wed 16-18 (20.11., 27.11., 4.12., 11.12., 18.12.) hall 29 (Metsätalo)
- Thu 12-14 (21.11., 28.11., 5.12., 12.12., 19.12.) P344 (Porthania)

Counting the course (3 credits) toward a student's studies can be negotiated. Grading is based on the student assignments.

Tentative course schedule:
"Let's fake an SMT using Unix, HMMs, and GAs."
Christer Samuelsson

18/11: Course overview. Word n-gram models. Variable length n-gram models. K&S.
19/11: HMM-based PoS tagging. K&S.
20/11: K-means clustering and the EM algorithm. IBM model 2 and BLEU scores. Bishop.
21/11: Faking a simple SMT. Handouts. Assignment 1 out.
------
25, 27, and 28/11: Students implement an SMT, assisted by my stunt double.
------
2/12: Student presentations of Assignment 1.
4/12: Genetic Algos I. Wahde.
5/12: Genetic Algos II. Wahde.
------
9/12: Genetic Algos and SMT: word order and word insertions. Wahde/Handouts. Assignment 2 out.
11/12: Ant Colony methods. Wahde.
12/12: Particle Swarm methods. Wahde.
------
16 and 18/12: Students add a GA to handle word order and word insertions, assisted by me.
19/12: Student presentations of Assignment 2.
------
Course material (will be provided):
K&S: Krenn & Samuelsson, "The Linguist's Guide to Statistics"
Bishop: Christopher Bishop, "Pattern Recognition and Machine Learning"
Wahde: "Biologically Inspired Optimization Methods"