University of Helsinki Department of Computer Science
 

Department of Computer Science

Department information

 

Improved Methods for Finding Association Rules

Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo: Improved Methods for Finding Association Rules. Report C-1993-65, Department of Computer Science, University of Helsinki, February 1994. 20 pages. <http://www.cs.helsinki.fi/TR/C-1993/65>

Full paper: gzip'ed Postscript file
Metadata: XML file

Abstract

Association rules are statements of the form ``for 90 % of the rows of the relation, if the row has value 1 in the columns in set W , then it has 1 also in column B''. Agrawal, Imielinski, and Swami introduced the problem of mining association rules from large collections of data, and gave a method based on successive passes over the database. We give an improved algorithm for the problem. The method is based on careful combinatorial analysis of the information obtained in previous passes; this makes it possible to eliminate unnecessary candidate rules. Experiments on a university course enrollment database indicate that the method outperforms the previous one by a factor of 5. We also give simple information-theoretic lower bounds for the problem of finding association rules, and show that sampling is in general a very efficient way of finding such rules.

Index Terms

Categories and Subject Descriptors:
H.3.3 [Information Systems]: Information Storage and Retrieval - Information Search and Retrieval
I.2.6 [Computing Methodologies]: Artificial Intelligence - Learning
I.2.8 [Computing Methodologies]: Artificial Intelligence - Problem Solving, Control Methods, and Search

General Terms: Databases, machine learning, artificial intelligence

Additional Key Words and Phrases: Database mining, knowledge discovery in databases, association rules, covering sets


Online Publications of Department of Computer Science, Anna Pienimäki