Learning the dissimilarity between objects described by categorical attributes

Event type: Guest lecture


Abstract: Learning the dissimilarity between objects described by categorical attributes is a challenging task, and an important one in many data mining applications, ranging from unsupervised and supervised learning to anomaly detection. Numerical attributes have ordered values, so proximity between objects follows naturally. For categorical attributes there is no obvious distance between pairs of attribute values, which makes a proximity measure between objects difficult to define. We propose a framework for learning a context-based distance for categorical attributes. The key intuition is that the distance between two values of a categorical attribute can be determined by how the values of the other attributes, the context attributes, are distributed over the objects in the dataset. In the talk we will discuss some solutions to the critical issue of choosing the context attributes. We validate our approach in two applications: clustering and anomaly detection. Experimental results show that our method is competitive with the state of the art in categorical data clustering and anomaly detection. We also show that our approach is scalable and adds little to the overall computational time.
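
To make the key intuition concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the framework presented in the talk: the function name context_distance, the toy data, and the choice of a normalized Euclidean distance between conditional distributions are all hypothetical, and the context attributes are simply passed in by hand rather than selected automatically.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_distance(rows, target, context, a, b):
    """Distance between values `a` and `b` of the attribute `target`.

    Assumption (not from the abstract): for each context attribute we
    compare the conditional distributions P(context value | a) and
    P(context value | b) with a squared difference, then normalize by
    the total number of context values and take the square root.
    """
    total = 0.0
    n_values = 0
    for attr in context:
        # counts[t][v] = number of rows with target value t and context value v
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[target]][row[attr]] += 1
        n_a = sum(counts[a].values())
        n_b = sum(counts[b].values())
        if n_a == 0 or n_b == 0:
            raise ValueError("both values must occur in the data")
        domain = {row[attr] for row in rows}
        for v in domain:
            p_a = counts[a][v] / n_a  # P(attr = v | target = a)
            p_b = counts[b][v] / n_b  # P(attr = v | target = b)
            total += (p_a - p_b) ** 2
        n_values += len(domain)
    return sqrt(total / n_values)


# Hypothetical toy data: "city" is the target attribute,
# "country" and "lang" are the hand-picked context attributes.
rows = [
    {"city": "Turin", "country": "IT", "lang": "it"},
    {"city": "Milan", "country": "IT", "lang": "it"},
    {"city": "Helsinki", "country": "FI", "lang": "fi"},
    {"city": "Turin", "country": "IT", "lang": "it"},
    {"city": "Helsinki", "country": "FI", "lang": "sv"},
]

d_close = context_distance(rows, "city", ["country", "lang"], "Turin", "Milan")
d_far = context_distance(rows, "city", ["country", "lang"], "Turin", "Helsinki")
print(d_close, d_far)
```

In this toy data Turin and Milan co-occur with exactly the same context values, so their distance is zero, while Turin and Helsinki share no context values and end up far apart. How to choose the context attributes, which the sketch sidesteps, is the question the talk addresses.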

Guest lecture, 30 Sep 3:15 pm, room B119 (Exactum, Department of Computer Science)

Welcome!

Hannu Toivonen

 

Event time: 30.09.2010 - 15:15 - 16:00
Lecturer: Prof. Rosa Meo, University of Torino, Italy
Place: Exactum B119