Department of Computer Science, University of Helsinki

Publications

2008

  1. Miro Lehtonen and Antoine Doucet.
    XML-Aided Phrase Indexing for Hypertext Documents. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. To appear.
  2. Miro Lehtonen and Antoine Doucet.
    Phrase detection in the Wikipedia. Lecture Notes in Computer Science, 6th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2007. To appear.

2007

  1. Miro Lehtonen.
    Vocabulary-Independent Methods for XML Information Retrieval.
    In proceedings of AMICT 2006, Advances in Methods of Information and Communication Technology. Petrozavodsk State University, 2007. p. 53-61.
  2. Antoine Doucet and Miro Lehtonen.
    Unsupervised classification of text-centric XML document collections.
    Lecture Notes in Computer Science, Comparative Evaluation of XML Information Retrieval Systems, 5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006. Volume 4518 / 2007. p. 515-527.
  3. Miro Lehtonen and Antoine Doucet.
    EXTIRP: Baseline Retrieval from Wikipedia.
    Lecture Notes in Computer Science, Comparative Evaluation of XML Information Retrieval Systems, 5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006. Volume 4518 / 2007. p. 119-124.
  4. Miro Lehtonen, Nils Pharo, Andrew Trotman.
    A Taxonomy for XML Retrieval Use Cases.
    Lecture Notes in Computer Science, Comparative Evaluation of XML Information Retrieval Systems, 5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006. Volume 4518 / 2007. p. 430-439.
  5. Andrew Trotman, Nils Pharo, Miro Lehtonen.
    XML-IR Users and Use Cases.
    Lecture Notes in Computer Science, Comparative Evaluation of XML Information Retrieval Systems, 5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006. Volume 4518 / 2007. p. 416-429.

2006

  1. Antoine Doucet and Helena Ahonen-Myka.
    Probability and Expected Document Frequency of Discontinued Word Sequences, an efficient method for their exact computation.
    To appear in the TAL journal, special issue on "Scaling of Natural Language Processing: Complexity, Algorithms and Architectures", 46 (2): 25 pages, 2006. [ BibTex ]
  2. Miro Lehtonen.
    Designing User Studies for XML Retrieval.
    In proceedings of the SIGIR 2006 Workshop on XML Element Retrieval Methodology, Seattle, USA, 10 August 2006, pages 28-34. [pdf] [BibTex]
  3. Miro Lehtonen.
    Indexing Heterogeneous XML for Full-Text Search.
    Ph.D. thesis, University of Helsinki. Helsinki University Printing House, 183+3 pages, November 2006. [ e-thesis ]
  4. Miro Lehtonen.
    Preparing Heterogeneous XML for Full-Text Search.
    ACM Transactions on Information Systems (TOIS), Special Issue on XML Retrieval, 24, 4, pages 455-474. ACM Press, October 2006. [pdf] [BibTex]
  5. Miro Lehtonen.
    When a Few Highly Relevant Answers Are Enough.
    Lecture Notes in Computer Science, Advances in XML Information Retrieval and Evaluation: 4th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2005. Volume 3977 / 2006. p. 296-305. [ pdf ]

2005

  1. Antoine Doucet.
    Advanced Document Description, a Sequential Approach.
    Ph.D. thesis, University of Helsinki. Helsinki University Printing House, 161 pages, November 2005. [ BibTex ]
  2. Antoine Doucet and Helena Ahonen-Myka.
    A Method to Calculate Probability and Expected Document Frequency of Discontinued Word Sequences.
    In proceedings of ACM SIGIR 2005, ELECTRA Workshop on Methodologies and Evaluation of Lexical Cohesion Techniques in Real-world Applications (Beyond Bag of Words), Salvador, Brazil, August 15-19, 2005, p. 33-40. [ pdf ] [ BibTex ]
  3. Helena Ahonen-Myka and Antoine Doucet.
    Data Mining Meets Collocations Discovery.
    In Inquiries into Words, Constraints and Contexts, Festschrift for Kimmo Koskenniemi (CSLI Studies in Computational Linguistics), p. 194--203, 2005. [ pdf ] [ BibTex ]
  4. Miro Lehtonen.
    EXTIRP 2004: Towards Heterogeneity.
    Lecture Notes in Computer Science, Advances in XML Information Retrieval and Evaluation: 4th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2005. Volume 3977 / 2006. p. 296-305. [ pdf ]

2004

  1. Lili Aunimo, Reeta Kuuskoski and Juha Makkonen.
    Cross-Language Question Answering at the University of Helsinki.
    In Proceedings of the CLEF 2004 Workshop, September 2004, Bath, United Kingdom. pp. 361-370.
  2. Lili Aunimo, Juha Makkonen and Reeta Kuuskoski.
    Cross-Language Question Answering for Finnish. In Proceedings of the Web Intelligence Symposium, held at the Finnish Artificial Intelligence Conference, September 2004, Vantaa, Finland. pp. 35-49.
  3. Antoine Doucet, Helena Ahonen-Myka, Non-Contiguous Word Sequences for Information Retrieval. In Proceedings of the 42nd annual meeting of the Association for Computational Linguistics (ACL-2004), Workshop on Multiword Expressions: Integrating Processing, Barcelona, Spain, July 21-26, 2004, pp. 88--95. [ pdf ] [ BibTex ]
  4. Antoine Doucet, Utilisation de Séquences Fréquentes Maximales en Recherche d'Information. In Proceedings of the 7th International Conference on the Statistical Analysis of Textual Data (JADT 2004), Louvain-la-Neuve, Belgium, March 10-12, 2004, pp. 334--345. [ pdf ] [ BibTex ]
  5. Antoine Doucet, Lili Aunimo, Miro Lehtonen, Renaud Petit, Accurate Retrieval of XML Document Fragments using EXTIRP. In Proceedings of the Second Annual Workshop of the Initiative for the Evaluation of XML retrieval (INEX), Schloss Dagstuhl, Germany, December 15-17, 2003, To appear in ERCIM Workshop Proceedings, 2004. [ pdf ] [ BibTex ]
  6. Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi, Simple Semantics in Topic Detection and Tracking. Information Retrieval, 7 (3-4): 347--368, 2004.

2003

  1. Juha Makkonen, Helena Ahonen-Myka, Utilizing Temporal Expressions in Topic Detection and Tracking. In Proceedings of 7th European Conference on Research and Advanced Technology for Digital Libraries (ECDL03), August 2003, Trondheim, Norway, pp. 393--404.
  2. Juha Makkonen, Investigations on Event Evolution in TDT. In Proceedings of HLT-NAACL 2003 Student Workshop, May 2003, Edmonton, Canada, pp. 43--48.
  3. Juha Makkonen, Helena Ahonen-Myka, Extraction of Temporal Expressions from Finnish Newsfeed. To appear in Proceedings of 14th Nordic Conference of Computational Linguistics (NoDaLiDa 2003), May 2003, Reykjavik, Iceland. [ ps.gz ]
  4. Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi, Topic Detection and Tracking with Spatio-temporal Evidence. In Proceedings of 25th European Conference on Information Retrieval Research (ECIR 2003), April 2003, Pisa, Italy, pp. 251--265. [ ps.gz ]
  5. Lili Aunimo, Oskari Heinonen, Reeta Kuuskoski, Juha Makkonen, Renaud Petit, Otso Virtanen, Question Answering System for Incomplete and Noisy Data: Methods and Measures for its Evaluation. In Proceedings of 25th European Conference on Information Retrieval Research (ECIR 2003), April 2003, Pisa, Italy, pp. 193--206. [ pdf ]

2002

  1. Miro Lehtonen. Utilizing a Multipurpose Collection of Documents. In Proceedings of the Finnish Data Processing Week (FDPW'02), pages 138-146.
  2. Antoine Doucet and Helena Ahonen-Myka, Naive clustering of a large XML document collection. In Proceedings of the First Annual Workshop of the Initiative for the Evaluation of XML retrieval (INEX), Schloss Dagstuhl, Germany, December 9-11, 2002, ERCIM Workshop Proceedings, March 2003, pp. 81--88. [ pdf ] [ BibTex ]
  3. Juha Makkonen, Helena Ahonen-Myka, Marko Salmenkivi, Applying Semantic Classes in Event Detection and Tracking. In Proceedings of International Conference on Natural Language Processing (ICON 2002), December 2002, Mumbai, India, pp. 175--183. [ ps.gz ]
  4. Marko Salmenkivi, Juha Makkonen, Helena Ahonen-Myka, Topic Detection and Tracking based on Extracting Words with Meaning of the Same Type. In Proceedings of 10th Finnish Artificial Intelligence Conference (STeP 2002), December 2002, Oulu, Finland, pp. 19--30.
  5. Miro Lehtonen, Renaud Petit, Oskari Heinonen, Greger Lindén, A Dynamic User Interface for Document Assembly. Proceedings of the ACM Symposium on Document Engineering, 8-9 November, 2002, 134-141.
  6. Helena Ahonen-Myka. Discovery of frequent word sequences in text. The ESF Exploratory Workshop on Pattern Detection and Discovery in Data Mining, Imperial College, London, 16-19 September, 2002.
  7. Martin Fluch, Greger Lindén and Andrei Popescu, A Journalist's Tool Using XML for Writing and Retrieving News Stories. In E. Hyvönen and M. Klemettinen (eds.) Towards the Semantic Web and Web Services - Proceedings of the XML Finland 2002 Conference, HIIT Publications, 2002-03, Helsinki Institute for Information Technology (HIIT), Helsinki, Finland 2002, 96-108.
  8. Antoine Doucet, Extracting More Relevant Document Descriptors using Hierarchical Information. In Proceedings of XML Finland 2002, October 21-22, p. 136-147. [ pdf ] [ BibTex ]
  9. Antoine Doucet, Améliorer les descripteurs de documents semi-structurés en utilisant les informations contextuelles. INFORSID 2002, Forum Jeunes Chercheurs, Nantes, France, June 4-7, 2002, p. 401-402. [ ps ] [ BibTex ]
  10. Greger Lindén, Proaktiivinen tietotekniikka: Tietääkö kone, mitä haluat seuraavaksi? Tietoyhteys 3/2002, pp. 12-13.

2001

  1. Juha Makkonen, News-feed categorization. In Proceedings of Finnish Data Processing Week (FDPW 2001-02), Vol. 4, pp. 78--89.
  2. Juha Makkonen and Jussi Piitulainen, Expanding document vectors in text categorization. In Proceedings of Infotech Oulu International Workshop on Information Retrieval (IR2001), Oulu, Finland, September 2001, 53-60.
  3. Miro Lehtonen. Document Assembly with XML Structured Source Data. In Proceedings of XML Finland 2001, pages 52-60, November 2001.

2000

  1. Helena Ahonen-Myka, Barbara Heikkinen, Oskari Heinonen, and Mika Klemettinen. Printing Structured Text without Stylesheets [PS]. In XML Scandinavia 2000, May 2-4, Gothenburg, Sweden, 2000.

-1999

  1. Previous publications (DocMan)
Last modified: Wednesday, 02-Apr-2008 22:56:47 EEST