A Knowledge Discovery Methodology for Telecommunication Network Alarm Databases

Mika Klemettinen: A Knowledge Discovery Methodology for Telecommunication Network Alarm Databases. PhD Thesis, Report A-1999-1, Department of Computer Science, University of Helsinki, January 1999. 135 pages. <http://www.cs.helsinki.fi/TR/A-1999/1>

Full paper: gzip'ed PostScript file

Abstract

Data mining (knowledge discovery in databases, KDD) aims at the discovery of interesting regularities or exceptions from large masses of data. The area combines methods from databases, machine learning, and statistics. Fault management is an important but difficult area of telecommunication network management: networks produce large amounts of alarm information which must be analyzed and interpreted before faults can be located. Alarm correlation is a central technique in fault identification. While alarm correlation systems are widely used and methods for expressing the correlations are maturing, acquiring the knowledge necessary for constructing an alarm correlation system for a network and its elements is difficult. We describe a partial solution to the task of knowledge acquisition for alarm correlation systems. We present a method and a tool for the discovery of recurrent patterns of alarms in databases. These patterns can be used in the construction of real-time alarm correlation systems. The main type of pattern we look for is the episode rule, which has the following form: "If a certain combination of alarms occurs within a time period, then another combination of alarms will occur within a time period with a certain probability." Our method finds all episode rules that satisfy the given, rather loose criteria. With the described method and tool the construction of correlation systems becomes easier. This methodology has been implemented in a data mining system called TASA (Telecommunication Alarm Sequence Analyzer).
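To make the form of an episode rule concrete, the following Python sketch estimates the probability attached to a single, already given rule from a time-stamped alarm sequence. It is only an illustration under stated assumptions and is not the TASA implementation or the discovery algorithm of the thesis: alarms are assumed to be (timestamp, type) pairs, the order of alarms inside a window is ignored, both windows are anchored at an alarm belonging to the antecedent, and the names rule_confidence, types_within, w_ante and w_cons are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Assumed representation: (timestamp in seconds, alarm type).
Alarm = Tuple[float, str]


def types_within(alarms: List[Alarm], start: float, width: float) -> Set[str]:
    """Return the alarm types seen in the half-open window [start, start + width)."""
    return {kind for t, kind in alarms if start <= t < start + width}


def rule_confidence(
    alarms: List[Alarm],
    antecedent: Set[str],
    consequent: Set[str],
    w_ante: float,
    w_cons: float,
) -> Optional[float]:
    """Estimate how often the consequent follows the antecedent.

    A window is opened at every alarm whose type belongs to the antecedent.
    The antecedent matches if all of its alarm types occur within w_ante
    seconds of that alarm; the rule is confirmed if all consequent types
    occur within w_cons seconds of the same alarm.
    Returns None when the antecedent never matches.
    """
    matched = 0
    confirmed = 0
    for start, kind in alarms:
        if kind not in antecedent:
            continue
        if antecedent <= types_within(alarms, start, w_ante):
            matched += 1
            if consequent <= types_within(alarms, start, w_cons):
                confirmed += 1
    return confirmed / matched if matched else None


# Made-up alarm sequence, for illustration only.
example = [
    (0, "A"), (2, "B"), (30, "C"),
    (100, "A"), (103, "B"),
    (200, "A"), (204, "B"), (250, "C"),
]

# Rule: "if A and B occur within 5 s, then C occurs within 60 s".
confidence = rule_confidence(example, {"A", "B"}, {"C"}, w_ante=5, w_cons=60)
print(confidence)
```

In this made-up sequence the antecedent matches three times and is followed by the consequent twice, so the estimate is 2/3. A discovery method would additionally have to enumerate candidate rules and apply frequency and confidence thresholds, which this sketch does not attempt.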
One of the main contributions of this thesis is the description of the complete interactive and iterative knowledge discovery process from both practical and analytical points of view. Discovering knowledge from data is a process containing several steps: understanding the domain, preparing the data set, discovering patterns, postprocessing the discovered patterns, and putting the results into use. Although the main emphasis of the thesis is on knowledge discovery from telecommunication data, the methods also apply to other types of data. The work with the knowledge discovery process and large, real-life data sets has directed our research towards inductive databases, i.e., databases that contain inductive generalizations about the data, in addition to the usual data. Within inductive databases, the KDD process can be described as a sequence of queries.

Index Terms
Categories and Subject Descriptors:
General Terms: Algorithms, Experimentation, Design
Additional Key Words and Phrases: Knowledge discovery process, Telecommunications, Episode rules, Interestingness, Inductive databases