Juho-Kustaa Kangas defends his PhD thesis on Combinatorial Algorithms with Applications in Learning Graphical Models on December 9th, 2016

M.Sc. Juho-Kustaa Kangas will defend his doctoral thesis Combinatorial Algorithms with Applications in Learning Graphical Models on Friday the 9th of December 2016 at 14 o'clock in the University of Helsinki Exactum Building. Auditorium CK112 (Gustaf Hällströmin katu 2b)His opponent is Senior Lecturer James Cussens (University of York, United Kingdom) and custos Professor Petri Myllymäki (University of Helsinki). The defence will be held in English.

Combinatorial Algorithms with Applications in Learning Graphical Models

Graphical models are a framework for representing joint distributions over random variables. By capturing the structure of conditional independencies between the variables, a graphical model can express the distribution in a concise factored form that is often efficient to store and reason about.

As constructing graphical models by hand is often infeasible, a lot of work has been devoted to learning them automatically from observational data. Of particular interest is the so-called structure learning problem, of finding a graph that encodes the structure of probabilistic dependencies. Once the learner has decided what constitutes a good fit to the data, the task of finding optimal structures typically involves solving an NP-hard problem of combinatorial optimization. While first algorithms for structure learning thus resorted to local search, there has been a growing interest in solving the problem to a global optimum. Indeed, during the past decade multiple exact algorithms have been proposed that are guaranteed to find optimal structures for the family of Bayesian networks, while first steps have been taken for the family of decomposable graphical models.

This thesis presents combinatorial algorithms and analytical results with applications in the structure learning problem. For decomposable models, we present exact algorithms for the so-called full Bayesian approach, which involves not only finding individual structures of good fit but also computing posterior expectations of graph features, either by exact computation or via Monte Carlo methods.

For Bayesian networks, we study the empirical hardness of the structure learning problem, with the aim of being able to predict the running time of various structure learning algorithms on a given problem instance. As a result, we obtain a hybrid algorithm that effectively combines the best-case performance of multiple existing techniques.

Lastly, we study two combinatorial problems of wider interest with relevance in structure learning. First, we present algorithms for counting linear extensions of partially ordered sets, which is required to correct bias in MCMC methods for sampling Bayesian network structures. Second, we give results in the extremal combinatorics of connected vertex sets, whose number bounds the running time of certain algorithms for structure learning and various other problems.

Availability of the dissertation

An electronic version of the doctoral dissertation is available on the e-thesis site of the University of Helsinki at http://urn.fi/URN:ISBN:978-951-51-2725-9.

Printed copies will be available on request from Juho-Kustaa Kangas: juho-kustaa.kangas@helsinki.fi.

06.02.2017 - 15:15 Pirjo Moen
28.11.2016 - 13:44 Pirjo Moen