University of Helsinki Department of Computer Science
 

Department of Computer Science

Department information

 

582 647 Graph Mining (3 cp)

Time: 2-6 March 2009, at 9:00-15:00
Place: Room B222, Department of Computer Science, Exactum Building. Gustaf Hällströmin katu 2b
Teacher: Dr. Christian Borgelt, Intelligent Data Analysis and Graphical Models Research Unit, European Center for Soft Computing, E-33600 Mieres (Asturias), Spain

Coordinator: Greger LIndén

Results

Written exam 28 April 2009
Oral exam 6 or 9 March 2009

Feedback

Please give feedback about the course.

Registration

Please register for the course through the Registration Systems of the Department of Computer Science or through the coordinator.

Course content

This course introduces the core principles and some advanced methods of frequent subgraph mining. It starts by reviewing, from a somewhat unusual perspective, some of the basic principles of frequent item set mining (although without going into details of specific algorithms). The rationale of this review is that certain ideas and methods used in frequent subgraph mining can already be exemplified with frequent item set mining approaches, while the simpler framework helps to introduce them in a way that is easy to understand. These basic ideas include the general scheme of traversing a (semi-)lattice of patterns with the help of a divide-and-conquer procedure, the use of canonical forms to eliminate redundant search, and the use of the apriori property and perfect extensions to prune the search tree.

In a second step these core ideas are transferred to the graph setting, where the most general case (unrestricted graphs) is treated first. Different canonical forms based on spanning trees and on adjacency matrices are discussed, and the so-called prefix property of canonical code words is studied as a means of simplifying the search procedure. In addition, node signatures are considered as a way of accelerating the process of constructing a canonical code word. Furthermore, special extensions of the basic search scheme for the application area of molecular frequent mining, which is a very useful method for drug design, are presented, including ring mining, carbon chain mining, and wildcards atoms. The course concludes with considering restricted graphs like chains and various forms of trees, where special canonical forms allow for more efficient search.

Please see http://www.borgelt.net/teach/fpm/

Course schedule

Lectures: Mon - Thu 9:00-10:30, 10:45-12:15; Fri 9:00-10:30
Exercises: Mon - Thu 13:15-14:45, Fri 10:45-12:15
Oral exam: Fri 13:15-14:45 or Mon (9 Mar, B222) 9:00-13:00
Written exam: Tue 16:00-19:00 (2,5 hours) A111

MondayTuesdayWednesdayThursdayFridayMonday
9:00-10:30LectureLectureLectureLectureLectureOral exam
10:45-12:15LectureLectureLectureLectureExerciseOral exam
13:15-14:45ExerciseExerciseExerciseExerciseOral exam

How to pass the course

There is an oral exam (Friday 6 March, afternoon, or Monday 9 March morning) or written exam (28 April 16-19 at Exactum A111). If you took the oral exam, you can still attend also the written exam.