Information extraction from text, Week 6



The solutions should be ready for inspection by Friday 28.3.2003 (midnight).

Remember: whenever you are unsure what to do, you can ask Lili or post a message to our newsgroup.


  1. Describe the wrapper generation algorithm (Section 4.1) of the RoadRunner method, using the two HTML pages that you can find here.

    The article: Crescenzi, Mecca, Merialdo: RoadRunner: Towards Automatic Data Extraction from Large Web Sites. Proceedings of the 27th VLDB Conference, 2001.

  2. Write a short summary of the phases of an IE process, both in case of



Helena Ahonen-Myka
Last modified: Tue Mar 18 19:28:00 EET 2003