University of Helsinki Department of Computer Science
 


582602 Natural Language Processing

(8 cp, 4 cu)

Lectures: 5 Sep - 11 Oct 2006, 31 Oct - 5 Dec 2006, Tue 10-12, Room B222

Exercises: 14 Sep - 13 Oct 2006, 2 Nov - 7 Dec 2006, Thu 12-14, Room C221

Results

The results are now here.

Goals

To provide students with the basic foundations in the field of Natural Language Processing and Computational Linguistics.

The course will introduce the students to:

After this course, the students should be able to:

Synopsis

Rule-based and statistical linguistic analysis:

Applications that combine several levels of analysis: information extraction.

Assignments/Exercises, Project work, no Exam.

Text: D. Jurafsky & J.H. Martin, Speech and Language Processing [J&M]

Pre-requisites: Data Structures, Models of Programming and Computing. Basic programming skills, interest in language or text.
Basic familiarity with these topics: Finite state automata (FSA), regular expressions, regular languages (e.g., J&M, chapter 2)

Course Materials

Found here (requires local access). Contains:

Tentative Schedule:

Week 36: (2006.09.05)
Lecture: Introduction to NLP (RY)

Week 37: (2006.09.12)
Lecture: High-level Application: Text Understanding and Information Extraction (RY)
Exercise session: (2006.09.14)
  • Assignment 1: Manual annotation of facts in documents.
  • Tutorial: Annotation and Evaluation tools for IE

Week 38: (2006.09.19)
Lecture: Levels of analysis for IE (RY)
Exercise session: (2006.09.21)
  • Review Assignment 1 (part 1).
  • Tutorial on development environment for IE
  • Introduce Project: build/customize simple IE system

Week 39: (2006.09.26)
Lecture: Morphology. Transducers (RY)
Exercise session: (2006.09.28)
  • Assignment 1 due (part 2).
  • Assignment 2: finite state morphology.
  • Tutorial on FS morphology/PC-Kimmo
  • Introduce Project: Two-level Morphological analysis (non-English)

Week 40: (2006.10.03)
Lecture: Language modeling. N-Grams. Spelling correction (RY)
Exercise session: (2006.10.05)
  • Assignment 3: N-grams.
  • Introduce Project: N-grams and Spelling correction.

Week 41: (2006.10.10)
Lecture: Syntax. Parsing [J&M, chap. 11] (GL)
Exercise session: (2006.10.12)
  • Assignment 4: CFG and Parsing (short).

Week 42: (2006.10.17)
(No course meetings)
  • Assignment 2 due (deadline moved from week 40).
    Email solutions to teachers.

Week 43: (2006.10.24)
(No course meetings)

Week 44: (2006.10.31)
Lecture: Parsing. [J&M, chap. 12] (GL)
Exercise session: (2006.11.02)
  • Assignment 3 due.
  • Assignment 5: Parsing II
  • Introduce Project: implement simple Grammar for Parsing, tools.

Week 45: (2006.11.07)
Lecture: Parsing: Shallow Parsing/Chunking. [J&M, chap. 12] (GL)
Exercise session: (2006.11.09)
  • Assignment 4 due (deadline extended to 16 Nov; see course wiki, task and Q&A).

Week 46: (2006.11.14)
Lecture: Part of speech tagging, HMMs (RY)
Exercise session: (2006.11.16)
  • Assignment 4 due.
  • Assignment 5 due.
  • Assignment 6.
  • Introduce Project: Implement simple POS tagger.

Week 47: (2006.11.21)
Lecture: 10.a: HMMs/Algorithms (RY)
Exercise session: (2006.11.23)
  • Lecture continuation, 10.b: HMM Training (RY)

Week 48: (2006.11.28)
Lecture: 11.a: Word sense disambiguation: supervised methods (RY)
Exercise session: (2006.11.30)
  • Lecture continuation, 11.b: Unsupervised WSD (Yarowsky) (RY)
  • Introduce Project: Word sense disambiguation.

Week 49: (2006.12.05)
Lecture: 12: Semantics: Distributional similarity (JP)
Exercise session: (2006.12.07)
  • Lecture 13: Automatic acquisition of semantic knowledge (RY)
  • Assignment 6 due.

Project work

During the course, there will be 5 or 6 suggested mini-projects, plus shorter exercises. Each student will be expected to do 3 of the mini-projects. (Each mini-project may require between 3 and 4 weeks of work.)

Grading

No exam. Students are graded on their project work and completed exercises.

Registration

Register through the department registration system from 24 August 2006.

Contact

Department of Computer Science
P.O. Box 68
FIN-00014 University of Helsinki
Finland

Street address:
Exactum Building, Room A223
Gustaf Hällströmin katu 2B

Roman Yangarber

Greger Lindén

Last update: Friday, 13-Apr-2007 20:44:55 EEST
(Page layout < O. Heinonen < M. Raento < G. Lindén)