Welcome to Resources of Text Categorization
Online Papers
Overview and Feature selection
Support Vector Machines
David D. Lewis, Representation and Learning in Information Retrieval. PhD thesis,
Department of Computer Science, Univ. of Massachusetts, Amherst, MA 01003,
Yiming Yang and Xin Liu, A re-examination of text categorization methods. Proceedings
of ACM SIGIR Conference on Research and Development in Information Retrieval,
Yiming Yang and Jan O. Pedersen, A Comparative Study on Feature Selection in Text Categorization.
Proceedings of the Fourteenth International Conference on Machine Learning
Thorsten Joachims, Text Categorization with Support Vector Machines: Learning
with Many Relevant Features. European Conference on Machine Learning (ECML),
Claire Nédellec and Céline Rouveirol (ed.), 1998.
Robert Cooley, Classification
of News Stories Using Support Vector Machines (1999). Proceedings of the
Sixteenth International Joint Conference on Artificial Intelligence Text
Mining Workshop, August 1999.
S. Dumais and
H. Chen, Hierarchical classification of Web content. Proceedings of SIGIR'00,
August 2000, pp. 256-263.
K Nearest Neighbor
Andrew McCallum and Kamal
Nigam, A Comparison of Event Models for Naive Bayes Text Classification.
AAAI-98 Workshop on "Learning for Text Categorization"
J. H. Friedman, "Flexible Metric Nearest Neighbor Classification."
Technical Report (Nov. 1994).
New Event Detection or Topic Detection
C. Apte, F. Damerau,
and S.M. Weiss, Text Mining with Decision Trees and Decision Rules, in
Conference on Automated Learning and Discovery, Carnegie-Mellon University,
C. Apte, F. Damerau,
and S.M. Weiss, Towards Language Independent Automated Learning of Text
Categorization Models, in ACM SIGIR'94, July 1994.
Robert E. Schapire and Yoram Singer, BoosTexter: A boosting-based system for
text categorization. Machine Learning, to appear.
J. Allan, J. Carbonell, G. Doddington, J. Yamron, and Y. Yang, "Topic Detection
and Tracking Pilot Study: Final Report". Proceedings of the DARPA Broadcast
News Transcription and Understanding Workshop, pp. 194-218. (April 1998)
J. Allan, R. Papka, and V. Lavrenko, "On-line New Event Detection and Tracking",
in SIGIR '98. (April 1998)
Chris Clifton, Robert Cooley,
TopCat: Data Mining for Topic Identification in a Text Corpus (1999).
Proceedings of the 3rd European Conference of Principles and Practice of
Knowledge Discovery in Databases, 1999.
Doug Baker, Thomas Hofmann, Andrew
McCallum and Yiming Yang, A Hierarchical Probabilistic Model for Novelty
Detection in Text. Submitted to NIPS'99.
Kamal Nigam, John Lafferty, Andrew
McCallum, Using Maximum Entropy for Text Classification. IJCAI'99
Workshop on Information Filtering.
Soumen Chakrabarti,
Byron Dom, Rakesh Agrawal, and Prabhakar Raghavan, Scalable feature selection,
classification and signature generation for organizing large text databases
into hierarchical topic taxonomies. International Journal on Very Large
Data Bases, 7(3), pp. 163-178. Invited paper.
K. Wang, S. Zhou, S.C. Liew, "Building hierarchical classifiers using class proximity",
VLDB 1999, September 1999, Edinburgh, UK, Morgan Kaufmann, 363-374.
D. Koller and M. Sahami, Hierarchically classifying documents using very few words.
Proceedings of the 14th International Conference on Machine Learning
(ICML), Nashville, Tennessee, July 1997, pages 170-178.
Machine Learning Resources
ML Papers (Andrew Ng)
on Automated Text Categorization, from The Collection of Computer Science
Online Software
Mail to: mzhang@cs.Helsinki.FI