University of Helsinki Department of Computer Science
 


Roman Yangarber: Publications

  1. Techniques for Multilingual Security-related Event Extraction from Online News   
    Martin Atkinson, Jakub Piskorski, Hristo Tanev, Roman Yangarber, Vanni Zavarella.
    In Computational Linguistics—Applications (A. Przepiórkowski, M. Piasecki, K. Jassem, P. Fuglewicz, eds.) Springer Verlag, Studies in Computational Intelligence (2012) To appear.
  2. Using Context and Phonetic Features in Models of Etymological Sound Change   
    Hannes Wettig, Kirill Reshetnikov and Roman Yangarber.
    In EACL 2012: Workshop on Visualization of Linguistic Patterns and Uncovering Language History from Multilingual Resources (2012) Avignon, France
  3. Predicting Relevance of Event Extraction for the End User   
    Silja Huttunen, Arto Vihavainen, Mian Du, Roman Yangarber.
    In "Multi-source, Multilingual Information Extraction and Summarization", Theory and Applications of Natural Language Processing, T. Poibeau et al. (eds.). Springer-Verlag (2012) Berlin, Heidelberg
  4. MDL-based modeling of etymological sound change in the Uralic language family   
    Hannes Wettig, Suvi Hiltunen, Roman Yangarber.
    WITMSE-2011: The Fourth Workshop on Information Theoretic Methods in Science and Engineering (2011) Helsinki, Finland
  5. Building support tools for Russian-language information extraction   
    Mian Du, Peter von Etter, Mikhail Kopotev, Mikhail Novikov, Natalia Tarbeeva, Roman Yangarber.
    BSNLP-2011: Balto-Slavonic Natural Language Processing (2011) Plzeň, Czech Republic
  6. Multilingual real-time event extraction for border security intelligence gathering   
    Martin Atkinson, Jakub Piskorski, Erik van der Goot, Roman Yangarber.
    Counterterrorism and Open Source Intelligence. Springer Lecture Notes in Social Networks, Vol. 2. (Uffe Kock Wiil, editor). (2011) pp. 355-390
  7. MDL-based models for aligning etymological data   
    Hannes Wettig, Suvi Hiltunen, Roman Yangarber.
    RANLP-2011: Conference on Recent Advances in Natural Language Processing (2011) Hissar, Bulgaria
  8. Relevance prediction in information extraction using discourse and lexical features
    Silja Huttunen, Arto Vihavainen, Peter von Etter, Roman Yangarber.
    Nodalida-2011: Nordic Conference on Computational Linguistics (2011) Riga, Latvia
  9. Probabilistic models for alignment of etymological data   
    Hannes Wettig, Roman Yangarber.
    Nodalida-2011: Nordic Conference on Computational Linguistics (2011) Riga, Latvia
  10. Hidden Markov models for induction of morphological structure of natural language   
    Hannes Wettig, Suvi Hiltunen, Roman Yangarber.
    WITMSE-2010: Workshop on Information Theoretic Methods in Science and Engineering (2010) Tampere, Finland
  11. Assessment of utility in Web mining for the domain of Public Health    (pdf)
    Peter von Etter, Silja Huttunen, Arto Vihavainen, Matti Vuorinen, Roman Yangarber.
    In Proceedings of LOUHI-2010: the Second Louhi Workshop on Text and Data Mining of Health Documents, at the NAACL/HLT Conference, (2010) Los Angeles, California
  12. MedISys—Medical Information System   
    Jens P. Linge, Ralf Steinberger, Flavio Fuart, Stefano Bucci, Jenya Belyaeva, Monica Gemo, Delilah Al-Khudhairy, Roman Yangarber, Erik van der Goot.
    In Advanced ICTs for Disaster Management and Threat Detection: Collaborative and Distributed Frameworks. Eleana Asimakopoulou, Nik Bessis (eds.), (2010) IGI Global Press, pp. 131-142.
  13. Real-time text mining in multilingual news for the Creation of a Pre-frontier Intelligence Picture    (pdf)
    Jakub Piskorski, Martin Atkinson, Jenya Belyaeva, Vanni Zavarella, Silja Huttunen, Roman Yangarber.
    In Proceedings of the 16th Conference on Knowledge Discovery and Data Mining (KDD-2010); ACM SIGKDD Workshop on Intelligence and Security Informatics. (2010) Washington, DC
  14. Filtering news for epidemic surveillance: towards processing more languages with fewer resources   
    Gael Lejeune, Antoine Doucet, Roman Yangarber, Nadine Lucas.
    CLIA: Fourth International Workshop On Cross Lingual Information Access, at COLING 2010 (2010) Beijing, China
  15. Utility evaluation of tools for collaborative development and maintenance of ontologies   
    Alex Norta, Roman Yangarber, Lauri Carlson.
    VORTE-2010: Joint 5th International Workshop on Vocabularies, Ontologies and Rules for The Enterprise / International Workshop on Metamodels, Ontologies and Semantic Technologies (MOST) at EDOC-2010: the Fourteenth IEEE International Conference On Enterprise Computing (2010) Vitória, ES, Brazil
  16. News mining for border security intelligence    (pdf)
    Jakub Piskorski, Martin Atkinson, Jenya Belyaeva, Vanni Zavarella, Silja Huttunen, Roman Yangarber.
    In IEEE ISI-2010: Intelligence and Security Informatics (2010) Vancouver, BC, Canada
  17. The landscape of international event-based biosurveillance    (link)
    D Hartley, N Nelson, R Walters, R Arthur, R Yangarber, L Madoff, J Linge, A Mawudeku, N Collier, J Brownstein, G Thinus, N Lightfoot.
    In Emerging Health Threats Journal, 3:e3 (2010)
  19. Automated event extraction in the domain of Border Security    (pdf)
    Martin Atkinson, Jakub Piskorski, Hristo Tanev, Erik van der Goot, Roman Yangarber, Vanni Zavarella.
    In Proceedings of MINUCS-2009: Workshop on Mining User-Generated Content for Security, at the UCMedia-2009: ICST Conference on User-Centric Media (2009) Venice, Italy
  20. Automatic epidemiological surveillance from on-line news in MedISys and PULS    (pdf)
    Roman Yangarber, Peter von Etter, Ralf Steinberger.
    In Proceedings of IMED-2009: International Meeting on Emerging Diseases and Surveillance (2009) Vienna, Austria
  21. Internet surveillance systems for early alerting of health threats    (link pdf)
    Jens P. Linge, Ralf Steinberger, Thomas P. Weber, Roman Yangarber, Erik van der Goot, Delilah H. Al-Khudhairy, Nikolaos I. Stilianakis
    In Eurosurveillance Journal, 14(13) (2009) Stockholm, Sweden
  22. Text mining from the Web for Medical Intelligence    (pdf)
    Ralf Steinberger, Flavio Fuart, Erik van der Goot, Clive Best, Peter von Etter, Roman Yangarber.
    In: Mining Massive Data Sets for Security, D. Perrotta, J. Piskorski, F. Soulié-Fogelman & R. Steinberger (eds.): IOS Press. (2008) Amsterdam, The Netherlands
  23. Content Collection and Analysis in the Domain of Epidemiology    (pdf)
    Roman Yangarber, Peter von Etter, Ralf Steinberger.
    In Proceedings of DrMED-2008: International Workshop on Describing Medical Web Resources, at MIE-2008: the 21st International Congress of the European Federation for Medical Informatics (2008) Göteborg, Sweden
  24. A Database of the Uralic Language Family for Etymological Research    (pdf)
    Roman Yangarber, Marko Salmenkivi, Marjaana Välisalo.
    Technical Report C-2008-38. University of Helsinki, Department of Computer Science, Series of Publications C (2008)
  25. Combining information retrieval and information extraction for medical intelligence    (pdf)
    Roman Yangarber, Ralf Steinberger, Clive Best, Peter von Etter, Flavio Fuart, David Horby.
    Mining Massive Data Sets for Security, NATO Advanced Study Institute (2007) Gazzada, Italy
  26. Combining Information about Epidemic Threats from Multiple Sources    (pdf)
    Roman Yangarber, Clive Best, Peter von Etter, Flavio Fuart, David Horby, Ralf Steinberger.
    In Proceedings Multi-source, Multilingual Information Extraction and Summarization at RANLP-2007. (2007) Borovets, Bulgaria
  27. Verification of Facts across Document Boundaries    (pdf)
    Roman Yangarber.
    In Proceedings IIIA-2006: International Workshop on Intelligent Information Access (2006) Helsinki, Finland
  28. Mining the Semantics of Text via Counter-Training    (link)
    Roman Yangarber.
    In Proceedings of the 12th Portuguese Conference on Artificial Intelligence, EPIA-2005, Thematic area: Text Mining and Applications TEMA-2005. Springer LNCS Vol. 3808, pp. 647-657 (2005) Covilhã, Portugal
  29. Redundancy-based Correction of Automatically Extracted Facts    (pdf)
    Roman Yangarber, Lauri Jokipii.
    In Proceedings Human Language Technology Conference/ Conference on Empirical Methods in Natural Language Processing: HLT/EMNLP-2005, (2005) Vancouver, Canada
  30. Information Extraction from Epidemiological Reports    (pdf)
    Roman Yangarber, Lauri Jokipii, Antti Rauramo, Silja Huttunen.
    In Proceedings Human Language Technology Conference/ Conference on Empirical Methods in Natural Language Processing: HLT/EMNLP-2005, demonstration; (2005) Vancouver, Canada
  31. Use of Deep Syntax Parsing in Cross-Language Information Extraction   
    Konstantin Bogatyrev, Roman Yangarber.
    In Proceedings Workshop on Intelligent Linguistic Technologies, International Conference on Machine Learning; Models, Technologies and Applications MLMTA-2005, pp. 18-24 (2005) Las Vegas, NV
  32. User-Oriented Evaluation in Information Extraction   
    Roman Yangarber.
    In Proceedings Workshop on User-Oriented Evaluation of Knowledge Discovery Systems, 4th International Conference on Language Resources and Evaluation (LREC 2004) Lisbon, Portugal
  33. Information Extraction for Enhanced Access to Disease Outbreak Reports    (link)
    Ralph Grishman, Silja Huttunen, Roman Yangarber.
    In Journal of Biomedical Informatics, 35 (4) pp. 236-246, C. Friedman, ed. (2003)
  34. Acquisition of Domain Knowledge    (link)
    Roman Yangarber.
    Invited chapter in Information Extraction in the Web Era (M.T. Pazienza, ed.), Lecture Notes in Computer Science, Vol. 2700, Springer-Verlag Heidelberg, pp. 1-28 (2003) Rome, Italy
  35. Bootstrapped Learning of Semantic Classes from Positive and Negative Examples    (pdf)
    Winston Lin, Roman Yangarber, Ralph Grishman.
    In Proceedings of the 20th International Conference on Machine Learning: ICML 2003 Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining (2003) Washington, D.C.
  36. Counter-Training in Discovery of Semantic Patterns    (pdf)
    Roman Yangarber.
    In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics: ACL-2003 (2003) Sapporo, Japan
  37. Unsupervised Learning of Generalized Names    (ps.gz, pdf)
    Roman Yangarber, Winston Lin, Ralph Grishman.
    In Proceedings of the 19th International Conference on Computational Linguistics: COLING-2002 (2002) Taipei, Taiwan
  38. Complexity of Event Structure in IE Scenarios    (ps, pdf)
    Silja Huttunen, Roman Yangarber, Ralph Grishman.
    In Proceedings of the 19th International Conference on Computational Linguistics: COLING-2002 (2002) Taipei, Taiwan
  39. Real-Time Event Extraction for Infectious Disease Outbreaks    (pdf)
    Ralph Grishman, Silja Huttunen, Roman Yangarber.
    In Proceedings of the 3rd Annual Human Language Technology Conference HLT-2002 (2002) San Diego, CA
  40. Diversity of Scenarios in Information Extraction    (ps)
    Silja Huttunen, Roman Yangarber, Ralph Grishman.
    In Proceedings of the 3rd International Conference on Language Resources and Evaluation LREC-2002 (2002) Las Palmas de Gran Canaria, Spain
  41. Automatic Acquisition of Domain Knowledge for Information Extraction    (ps.gz)
    Roman Yangarber, Ralph Grishman, Pasi Tapanainen, Silja Huttunen.
    In Proceedings of the 18th International Conference on Computational Linguistics: COLING-2000 (2000) Saarbrücken, Germany
  42. Machine Learning of Extraction Patterns from Un-annotated Corpora    (pdf)
    Roman Yangarber, Ralph Grishman.
    In Proceedings of the 14th European Conference on Artificial Intelligence: ECAI-2000 Workshop on Machine Learning for Information Extraction (2000) Berlin, Germany
  43. Extraction Pattern Discovery through Corpus Analysis    (doc)
    Roman Yangarber, Ralph Grishman.
    In Proceedings of the 2nd International Conference on Language Resources and Evaluation: LREC-2000 Workshop: Information Extraction meets Corpus Linguistics (2000) Athens, Greece
  44. Unsupervised Discovery of Scenario-Level Patterns for Information Extraction    (ps.gz)
    Roman Yangarber, Ralph Grishman, Pasi Tapanainen, Silja Huttunen.
    In Proceedings of Conference on Applied Natural Language Processing ANLP-NAACL 2000 pp. 282-289, (2000) Seattle, WA
  45. Issues in Corpus-Trained Information Extraction    (doc)
    Ralph Grishman, Roman Yangarber.
    In Proceedings of International Symposium: Toward the Realization of Spontaneous Speech Engineering, pp. 107-112, (2000) Tokyo, Japan
  46. Transforming Examples into Patterns for Information Extraction    (ps.gz)
    Roman Yangarber, Ralph Grishman.
    In Proceedings of TIPSTER Text Program Phase III, Morgan Kaufmann (1998) Baltimore, MD
  47. Japanese IE System and Customization Tool    (ps.gz)
    Chikashi Nobata, Satoshi Sekine, Roman Yangarber.
    In Proceedings of TIPSTER Text Program Phase III, Morgan Kaufmann (1998) Baltimore, MD
  48. Deriving Transfer Rules from Dominance-Preserving Alignments    (ps.gz)
    Adam Meyers, Roman Yangarber, Ralph Grishman, Catherine Macleod, Antonio Moreno-Sandoval.
    In Proceedings of COLING-ACL-98 (1998) Montreal, Canada
  49. Using NOMLEX to Produce Nominalization Patterns for Information Extraction    (ps.gz)
    Adam Meyers, Catherine Macleod, Roman Yangarber, Ralph Grishman, Leslie Barrett, Ruth Reeves.
    In Proceedings of COLING-ACL-98 Workshop on Computational Treatment of Nominals, (1998) Montreal, Canada
  50. NYU: Description of the Proteus/PET System as Used for MUC-7 ST    (ps.gz)
    Roman Yangarber, Ralph Grishman.
    In Proceedings of the 7th Message Understanding Conference: MUC-7 (1998) Washington, DC
  51. Customization of Information Extraction Systems    (ps.gz)
    Roman Yangarber, Ralph Grishman.
    In Proceedings of International Workshop on Lexically-Driven Information Extraction, invited talk, pp. 1-11, (1997) Frascati, Italy
  52. Alignment of Shared Forests for Bilingual Corpora    (ps.gz)
    Adam Meyers, Roman Yangarber, Ralph Grishman.
    In Proceedings of the 16th International Conference on Computational Linguistics: COLING-96 pp. 460-465 (1996) Copenhagen, Denmark
  53. ThinkSheet: A Tool for Tailoring Complex Documents   
    Peter Piatko, Roman Yangarber, Daoi Lin, Dennis Shasha.
    ACM SIGMOD '96, demonstration (1996) Montreal, Canada

Edited Collections

Multi-source, Multilingual Information Extraction and Summarization
Thierry Poibeau, Horacio Saggion, Jakub Piskorski, Roman Yangarber (eds.)
Theory and Applications of Natural Language Processing. Springer-Verlag (2012) Berlin, Heidelberg

MINUCS-2009: Mining User-Generated Content for Security.
Ulf Brefeld, Jakub Piskorski, Roman Yangarber (eds.)
Proceedings of the Workshop at the UCMedia-2009: ICST Conference on User-Centric Media (2009) Venice, Italy

High-Level Information Extraction   
Sebastian Blohm, Ulf Brefeld, Felix Jungermann, Roman Yangarber (eds.)
Proceedings of the Workshop at ECML/PKDD-2008: the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (2008) Antwerp, Belgium

Multi-source, Multilingual Information Extraction and Summarization   
Thierry Poibeau, Horacio Saggion, Roman Yangarber (eds.)
Proceedings of MMIES-2: the Second Workshop on Multi-Lingual, Multi-Source Information Extraction and Summarization, at COLING-2008: the 22nd International Conference on Computational Linguistics (2008) Manchester, United Kingdom

Information Extraction Beyond The Document   
Mary Elaine Califf, Mark A. Greenwood, Mark Stevenson, Roman Yangarber (eds.)
Proceedings of the Workshop at ACL/COLING (July 2006) Sydney, Australia

Invited Presentations

Information-theoretic modeling of etymological sound change
Invited speaker at Workshop on comparing approaches to measuring linguistic differences (24-25 October, 2011) University of Gothenburg, Sweden

Discovering complex events and relations in text: Frontex real-time news event extraction framework
Invited speaker at Tutorial for Member States: Frontex news event extraction framework and Frontex Media Monitor (1-2 December, 2011) Frontex EC Agency, Warsaw, Poland.

Information-theoretic models for aligning Uralic etymological data
Invited speaker at Biological Evolution and the Diversification of Languages (BEDLAN) Seminar: Evolutionary Perspectives Of Language Change (21-23 September, 2011) Seili, Finland.

Discovering complex networks of events and relations in News Surveillance    (video)
Keynote speaker at the 4th International Symposium on Open Source Intelligence and Web Mining (OSINT-WM) in conjunction with the European Conference on Intelligence and Security Informatics (European ISI 2011) (12-14 September, 2011) Athens, Greece.

Finding Facts from Text—Information Extraction Technology    (slides.pdf)
Invited speaker at EC-JRC European Commission's Directorate General Joint Research Centre, (30 August, 2006) Ispra, Italy.

Acquisition of Domain Knowledge
Invited speaker at SCIE-2002: 3rd Summer Convention on Information Extraction (July, 2002) University of Rome Tor Vergata, Italy.