Information extraction from text, Week 3

The solutions should be ready for inspection by Thursday 27.2.2003 (midnight).

Remember: whenever you are unsure what to do, you can ask Lili or post a message to our newsgroup.

Describe how the AutoSlog system works using our sample terrorist data and the corresponding answer key templates.

Give at least 5 concept node definitions. You may present the definitions informally (not necessarily in the Lisp style used in the paper). Explain what your definitions mean, so that we can also see that you have understood them.

The AutoSlog article: E. Riloff: Automatically Constructing a Dictionary for Information Extraction Tasks, Proceedings of the 11th National Conference on Artificial Intelligence, 1993, pp. 811-816.
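
To make "informal concept node definition" concrete, here is a minimal sketch in Python. It is only an illustration, not the representation used in the paper: the class ConceptNode, its field names, and the crude passive-voice test are assumptions, loosely modeled on the kinds of slots a concept node carries (trigger word, variable slots, constant slots, enabling conditions). The sample clauses use the s/v/do notation from the AutoSlog-TS exercise on this sheet.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    """Hypothetical, simplified stand-in for an AutoSlog concept node."""
    name: str
    trigger: str                 # word that activates the node
    variable_slots: dict         # slot name -> syntactic role to extract (s, do, pp)
    constant_slots: dict = field(default_factory=dict)       # fixed slot values
    enabling_conditions: list = field(default_factory=list)  # e.g. ["passive"]
    constraints: dict = field(default_factory=dict)          # semantic classes; not checked here

    def applies_to(self, clause):
        words = clause["v"].split()
        # Crude assumption: a verb group starting with a form of "be" is passive.
        passive = len(words) > 1 and words[0] in {"is", "are", "was", "were"}
        return self.trigger in words and passive == ("passive" in self.enabling_conditions)

    def extract(self, clause):
        """Return the filled slots, or None if the node is not activated."""
        if not self.applies_to(clause):
            return None
        filled = dict(self.constant_slots)
        for slot, role in self.variable_slots.items():
            if role in clause:
                filled[slot] = clause[role]
        return filled


# "<subject> was/were attacked": the subject of a passive "attacked" is the target.
node = ConceptNode(
    name="target-subject-passive-verb-attacked",
    trigger="attacked",
    variable_slots={"target": "s"},
    constant_slots={"type": "attack"},
    enabling_conditions=["passive"],
)

passive_clause = {"s": "The National offices", "v": "were attacked"}
active_clause = {"s": "A group of terrorists", "v": "attacked", "do": "a post"}

print(node.extract(passive_clause))
print(node.extract(active_clause))
```

The node fires only on the passive clause. In the active clause the subject is the perpetrator rather than the target, so a separate definition would be needed for that case.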

In this exercise, we study the AutoSlog-TS algorithm. Assume that our text collection contains the following documents:


text 1; relevant

s: A group of terrorists
v: attacked
do: a post
pp: in Nuevo Progreso.


text 2; relevant

s: The National offices
v: were attacked
time: today.
s: Unidentified individuals
v: detonated
do: a bomb.
s: The bomb
v: destroyed
do: a car.


text 3; not relevant

s: The Armed Forces units
v: killed
do: one rebel.
s: They
v: destroyed
do: an underground hideout.


text 4; relevant

s: Unidentified individuals
v: attacked
do: a high tension tower.
s: They
v: destroyed
do: it.


text 5; not relevant

s: The coca growers
v: protest
do: the destruction of their fields.
s: The strike
v: is supported
pp: by the Shining Path.

Explain the process of AutoSlog-TS using these documents and give the ranking for the extraction patterns that are generated.

Abbreviations: s = subject, v = verb, do = direct object, pp = prepositional phrase

More information: E. Riloff: Automatically Generating Extraction Patterns from Untagged Text, Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), 1996, pp. 1044-1049.
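
The two stages of AutoSlog-TS (generate a pattern for every noun phrase, then count how often each pattern fires in relevant and in all texts) can be sketched as follows. This is a rough sketch under stated assumptions, not the system itself. The helpers patterns and rank are hypothetical. Only three simplified heuristics are used (subject, direct object, prepositional phrase), and verb groups are kept exactly as written, so "were attacked" and "attacked" give different patterns. The score is taken to be relevance rate times log2 of the total frequency, which is one reading of the formula in the paper; check the paper for the exact definition.

```python
import math
from collections import Counter

# Each document: (is_relevant, clauses). A clause maps a syntactic role
# (s, v, do, pp) to its text. The "time" constituent of text 2 is left out
# because no heuristic in this sketch uses it.
DOCS = [
    (True, [{"s": "A group of terrorists", "v": "attacked",
             "do": "a post", "pp": "in Nuevo Progreso"}]),
    (True, [{"s": "The National offices", "v": "were attacked"},
            {"s": "Unidentified individuals", "v": "detonated", "do": "a bomb"},
            {"s": "The bomb", "v": "destroyed", "do": "a car"}]),
    (False, [{"s": "The Armed Forces units", "v": "killed", "do": "one rebel"},
             {"s": "They", "v": "destroyed", "do": "an underground hideout"}]),
    (True, [{"s": "Unidentified individuals", "v": "attacked",
             "do": "a high tension tower"},
            {"s": "They", "v": "destroyed", "do": "it"}]),
    (False, [{"s": "The coca growers", "v": "protest",
              "do": "the destruction of their fields"},
             {"s": "The strike", "v": "is supported",
              "pp": "by the Shining Path"}]),
]


def patterns(clause):
    """Yield one candidate pattern per noun phrase in the clause."""
    verb = clause["v"]
    if "s" in clause:
        yield f"<subj> {verb}"
    if "do" in clause:
        yield f"{verb} <dobj>"
    if "pp" in clause:
        prep = clause["pp"].split()[0]  # first word of the phrase is the preposition
        yield f"{verb} {prep} <np>"


def rank(docs):
    """Return (pattern, score) pairs, best first."""
    total, relevant = Counter(), Counter()
    for is_relevant, clauses in docs:
        for clause in clauses:
            for p in patterns(clause):
                total[p] += 1
                if is_relevant:
                    relevant[p] += 1
    # Assumed scoring: relevance rate * log2(total frequency).
    scores = {p: (relevant[p] / total[p]) * math.log2(total[p]) for p in total}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


for pattern, score in rank(DOCS):
    print(f"{score:.3f}  {pattern}")
```

With so few documents, any pattern that occurs only once scores 0 because log2(1) = 0. A written solution should still discuss such patterns rather than drop them silently.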

Helena Ahonen-Myka

Last modified: Wed Feb 19 11:44:47 EET 2003