Graphical Models
Course Material
Lectures
All lectures are given from the slides provided here. Please print your own copies before the lectures
(using the 2-up versions provided).
The lecture itself will sometimes digress to web-based material and Java applets for
demonstrations. These will usually be listed on this page, but
not on the PDF slides.
Lecture 1: Introduction: Graphs and Probability
Review of some famous graphical models including circuits, fault
trees, entity-attribute diagrams,
and Ising lattices. Review of graphs including definitions for
directed and undirected graphs, relations such as parents and
ancestors, and connectivity notions. Review of probability as
frequency or belief in both discrete and continuous contexts.
We may play with the Change of variables
applet and the
Integration applet
to better understand the implications of working with continuous domains.
Review of modelling as a key part of intelligent systems design.
Examples of using graphs to model dependence, causality and relevance.
[Aug. 24th, Lecture 1 in PDF,
in Postscript, 2-up, gzipped ]
Preparatory reading: Please review the material
under graphs from Math World (know these definitions), on Wikipedia (only graph coloring
needs to be understood; briefly browse the others), and the Ising Model.
Review course notes of
Three Concepts: Probability
(in particular, Set 2 and 3 from the Material).
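To make the graph relations from this lecture concrete, here is a minimal sketch (in Python; the example graph and names are illustrative, not from the lecture) of a directed graph with parent and ancestor relations:

```python
# A directed graph stored as a mapping from each node to its set of parents.
# Example DAG: a -> b, a -> c, b -> d, c -> d.
parents = {
    "a": set(),
    "b": {"a"},
    "c": {"a"},
    "d": {"b", "c"},
}

def ancestors(node, parents):
    """All nodes reachable from `node` by repeatedly following parent links."""
    result = set()
    frontier = set(parents[node])
    while frontier:
        p = frontier.pop()
        if p not in result:
            result.add(p)
            frontier |= parents[p]
    return result

print(sorted(ancestors("d", parents)))  # ['a', 'b', 'c']
```

The same dictionary-of-sets representation works for undirected graphs if each edge is stored in both directions.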
Lecture 2: Independence Modelling
Functional independence in optimization as a method for task
simplification. Graph partitioning as a means of problem
decomposition. Probabilistic independence as functional independence
in probability space. Undirected graphs, relevance, and independence
defined as set separation.
Clique trees as undirected graphs on variable sets,
and a normalised form for them called Almond trees, a tree-structured
generalization of binary independence.
Directed graphs, dependence, and
independence defined on the underlying undirected graph. Soundness
and completeness between graphical independence and functional
independence forms. A catalogue of graphical forms for time-series,
spatial modelling, diagnosis and fault detection and their
independence properties.
[Sept. 30th, Lecture 2 in PDF,
in Postscript, 2-up, gzipped ]
Preparatory reading: Please review
Shachter's manuscript "Examples without Numbers" (local copy)
and glance briefly at
Scheines'
tutorial on d-separation. You don't have to understand d-separation,
because it's difficult and we'll use a Lauritzen-style definition instead,
but have a quick look at it.
Part I of Bishop's tutorial (see below) is good material for this lecture.
A demo will also be given using
B-Course.
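Independence as set separation can be checked mechanically: two nodes are separated by a set exactly when every path between them passes through that set. Here is a small sketch (the graph and names are illustrative; this is the Lauritzen-style separation test on an undirected graph, not course-supplied code):

```python
from collections import deque

# Undirected graph as adjacency sets: a chain x - s - y.
adj = {
    "x": {"s"},
    "s": {"x", "y"},
    "y": {"s"},
}

def separated(a, b, sep, adj):
    """True if the set `sep` blocks every path from a to b,
    i.e. a and b are graphically independent given `sep`."""
    seen = {a}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v == b:
                return False          # found a path avoiding sep
            if v not in sep and v not in seen:
                seen.add(v)
                queue.append(v)
    return True

print(separated("x", "y", {"s"}, adj))  # True
print(separated("x", "y", set(), adj))  # False
```

For a directed graph the same test is applied to the underlying undirected graph after the appropriate moralisation step, as discussed in the lecture.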
Lecture 3: Exact Inference
Given a graphical model on discrete variables of low dimension, and a
probability formula, evaluate the probability, or find the maximum
likelihood assignment for the free variables. We will consider these two
problems in a general sense.
We will develop an exact algorithm for the computation as nested
summation on a clustered graph using variable elimination as the basic step.
This turns out to create a junction tree which we can turn into an
Almond tree, the general independence structure from the last lecture.
Thus we will consider finding good junction trees, and their complexity.
We will not cover implementation details for the algorithm used to
speed it up in practice.
[Oct. 1st, Lecture 3 in PDF,
in Postscript, 2-up, gzipped]
- Variable elimination demo.
Warning: the demo uses IP addresses to distinguish users, so expect trouble or conflicts if
you are using a proxy for internet access.
Preparatory reading: TBD
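The basic step of the exact algorithm, eliminating a variable by summing it out of a product of factors, can be sketched in a few lines. This is an illustrative toy implementation for binary variables with made-up numbers, not the optimized algorithm used in practice:

```python
import itertools

# A factor is a (variable-name tuple, {assignment-tuple: value}) pair.
def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    vars_ = tuple(dict.fromkeys(fv + gv))
    table = {}
    for assign in itertools.product([0, 1], repeat=len(vars_)):
        env = dict(zip(vars_, assign))
        table[assign] = (ft[tuple(env[v] for v in fv)] *
                         gt[tuple(env[v] for v in gv)])
    return (vars_, table)

def sum_out(f, var):
    """Eliminate `var` from factor f by summation."""
    fv, ft = f
    i = fv.index(var)
    table = {}
    for assign, val in ft.items():
        key = assign[:i] + assign[i + 1:]
        table[key] = table.get(key, 0.0) + val
    return (fv[:i] + fv[i + 1:], table)

# Chain A -> B with P(A) and P(B|A); marginal P(B) by eliminating A.
pA = (("A",), {(0,): 0.6, (1,): 0.4})
pBgA = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1,
                     (1, 0): 0.2, (1, 1): 0.8})
pB = sum_out(multiply(pA, pBgA), "A")
print({k: round(v, 2) for k, v in pB[1].items()})  # {(0,): 0.62, (1,): 0.38}
```

The order in which variables are eliminated determines the size of the intermediate factors, which is exactly why finding good junction trees matters.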
Lecture 4: Approximate Inference
We will consider some basic types of approximate inference
on graphs, including Gibbs sampling, rejection sampling,
and importance sampling.
For Gibbs sampling, we will review
OpenBugs,
maintained at the local Statistics Department.
Gibbs sampling will be compared with local search, a popular
method in optimization and constraint satisfaction.
[Oct. 6th, Lecture 4 in PDF,
in Postscript, 2-up, gzipped]
Preparatory reading:
We will look at examples from
OpenBugs,
given in Examples:Examples Volume I, Pump and Dogs.
The
Ising Model is another example of Gibbs sampling; in fact, it is the inverse of
2-coloring.
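A Gibbs sampler for the Ising model is short enough to sketch here: each site is resampled from its conditional distribution given its horizontal and vertical neighbors. The grid size and coupling strength below are illustrative choices, not values from the lecture:

```python
import math
import random

N, BETA = 8, 0.5  # grid side and coupling strength (illustrative)

def neighbors(i, j):
    """Horizontal and vertical neighbors inside the N x N grid."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < N and 0 <= nj < N:
            yield ni, nj

def gibbs_sweep(s, rng):
    """One full sweep: resample each spin from its conditional given the rest."""
    for i in range(N):
        for j in range(N):
            field = sum(s[ni][nj] for ni, nj in neighbors(i, j))
            p_up = 1.0 / (1.0 + math.exp(-2.0 * BETA * field))
            s[i][j] = 1 if rng.random() < p_up else -1
    return s

rng = random.Random(0)
s = [[rng.choice([-1, 1]) for _ in range(N)] for _ in range(N)]
for _ in range(100):
    gibbs_sweep(s, rng)
```

Compare this with local search: local search would always flip a spin to the locally best value, while Gibbs sampling flips it randomly in proportion to the conditional probability.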
Lecture 5: Basic Learning and Graphs
We will consider a simple case of learning:
proportions using the Beta
Coin Experiment applet from the Prob./Stats Object Library.
This example is
covered in detail because it forms the basis of many algorithms.
This introduces the notions of maximum likelihood,
maximum a posteriori, expected value,
sufficient statistics,
exponential family, the trade-off
between complexity and predictive accuracy, and illustrates the
problems with priors.
The notions of maximum likelihood, maximum a posteriori,
sufficient statistics,
exponential family, and problems with priors will be covered in
more detail, making use of graphs as a tool to illustrate the
variations.
Preparatory reading: Notes of
Three Concepts: Probability
Set 4
on proportions.
Here is a presentation version.
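The Beta coin computation itself is a few lines of arithmetic. Here is a sketch of the three point estimates discussed in the lecture; the prior pseudo-counts and the data are illustrative numbers, not from the applet:

```python
# Beta-Bernoulli learning of a proportion: with prior Beta(a, b) and
# h heads, t tails observed, the posterior is Beta(a + h, b + t).
a, b = 2.0, 2.0   # prior pseudo-counts (illustrative)
h, t = 7, 3       # observed heads and tails (the sufficient statistics)

ml = h / (h + t)                              # maximum likelihood
map_est = (a + h - 1) / (a + b + h + t - 2)   # posterior mode (MAP)
mean = (a + h) / (a + b + h + t)              # posterior expected value

print(round(ml, 3), round(map_est, 3), round(mean, 3))  # 0.7 0.667 0.643
```

Note how the prior pulls the MAP and posterior-mean estimates away from the maximum-likelihood value toward 0.5, which is one way the problems with priors show up.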
Lecture 6: Selected Applications and Review
Please bring your questions about the assignments.
We will review any material necessary so that everyone
has the background they need to complete things.
Some applications may also be covered.
Preparatory reading: None for this week.
Literature, further reading
There is no need to purchase any books. We will
supply copies of any necessary material prior to lectures. Material
available online will sometimes be in Adobe's PDF or Postscript format,
so you should be able to view these formats. Moreover, some of the
demonstrations require Java capable browsers. Other than the use of
these long-time standard formats, we make efforts to avoid online
material that requires special browsers (e.g., thus material converted
from Powerpoint is avoided). Note that when we say something is a
reference book, we mean you should not use it to learn the
material, but only to check details or find more advanced material.
Graphs
-
Graph Theory from Math World
including
graph,
directed acyclic graph,
forest,
tree,
degree,
outdegree and indegree,
clique,
induced subgraph, and
connected
component.
Some additional material we will touch on and you probably won't know
is
cut point,
mincut,
minimum spanning tree,
random graph, and phase transition.
Strangely missing are the notion of width and
the parent and ancestor relations.
- The general notion of constraint satisfaction is often presented in a graphical context,
including graph coloring
as a special case.
Note that the gory proofs about average-case analysis of graph coloring originated in
Turner, "Almost All k-Colorable Graphs are Easy
to Color", Journal of Algorithms, 1988.
- Descriptions of venerable graph based models can be browsed on the
Wikipedia
including Unified Modeling Language,
Entity-relationship model,
Data flow diagram,
Semantic network
(e.g., WordNet),
Decision tree,
Influence diagram,
and Circuit diagram.
-
Ising Model
is an undirected graph on a grid where each point is
connected to its horizontal and vertical neighbors. It works like a
probabilistic version of Conway's Game of Life.
-
“Why is Automatic Test Pattern Generation (ATPG) easy?”
(PDF, 627kB). The subject has
nothing to do with this course, but the paper illustrates the notion
of tree-width using a heuristic proxy called "cut-width" (a safe upper
bound on the tree-width we define later). The hardness of problems in
practice is shown as a function of this in a real-world context. Skim
this; you aren't expected to understand the details.
-
Research on phase transitions in NP-complete problems,
the slide set from "Where the Really Hard Problems Are"
by P. Cheeseman, B. Kanefsky, and W. Taylor, in IJCAI 1991
as an informal explanation of properties of optimization in graph problems.
- “A tourist guide through treewidth”
(PDF, 1311kB), by Bodlaender. Accessible, but in much more
detail than we need; it shows the breadth and use of
the computational/optimization side of graphical modelling.
Probabilistic graphs get a mention under the old tag of “expert
systems”.
Probability
Graphical Models
-
“Probabilistic Graphical Models Part One: Graphs and Markov Properties
”
(PDF, 890kB),
and “Probabilistic Graphical Models Part Two: Inference and Learning
”
(PDF, 1394kB), by C.M. Bishop.
Good tutorial overlapping quite a lot of the course.
-
Heckerman's
tutorials,
especially “A Tutorial on Learning with Bayesian
Networks” and “Technical overview for statistician (includes
undirected graphs) - slides from Valencia 2002 tutorial”.
Used as a reference in the learning lectures.
-
Friedman's
AAAI-98 tutorial with Goldszmidt,
covers different material from his NIPS-2001 tutorial,
though the NIPS-2001 tutorial is more recent.
- A Brief Introduction to Graphical Models and Bayesian Networks, Kevin Murphy's
tutorial from 1998.
-
B-Course developed in the department.
Used in the learning lectures.
- Learning in Graphical Models edited by M.I. Jordan, 1999
(available in Computer Science Department library). An advanced
reference book. Individual
chapters also
available on-line.
- M.I. Jordan, “Graphical Models”
(PDF, 1239kB), Statistical Science, 2003.
- Graphical Models by S.L. Lauritzen, 1996 (available in
Computer Science Department library). An advanced reference book.
Lauritzen has been instrumental in a lot of early developments.
- S. M. Aji and R. J. McEliece, “The Generalized Distributive Law”, IEEE Trans. Inform. Theory, vol. 46, no. 2 (March 2000), pp. 325-343. Overview and reference for exact computation.
- Graphical Models in Applied Multivariate Statistics by
Whittaker, 1990 (available in Valtiotieteellisen
tiedekunnan kirjasto). A good reference book from the statistical
side, intended as an advanced text for graduates.
- “A catalogue of graphs”
(PDF, 1318kB),
by Studeny.
Ignore the first 8 pages of theory; the rest catalogues
different types of directed and undirected graphs over 2-5 variables.
In general, Studeny is an excellent source of theoretical
work on independence models, probability and graphs.
- Graphical Models Discussion Papers edited by Joel Chenu, another reference, collection
of research-oriented tutorials.
-
Cecil Huang and Adnan Darwiche.
Inference in belief networks: A procedural guide.
(PDF, 451kB)
International Journal of Approximate Reasoning, 15(3):225-263,
October, 1996.
-
Rina Dechter's Network-Based Reasoning - Bayesian Networks
course at UCI.
References
Only use as references if needed.
-
Introduction to Probability Models by S.M. Ross,
Academic Press, 1989 (4th edition).
A good introduction to probability for computer scientists
with basic distributions, Markov models, queuing and reliability
theory and simulation.
- Stochastic Simulation by B.D. Ripley,
John Wiley & Sons, 1987.
A good introduction to sampling algorithms.
- Multivariate Analysis by K.V. Mardia, J.T. Kent and J.M. Bibby,
Academic Press, 1979.
Introduction to multivariate distributions and their analysis
based on vector and matrix algebra.
- Matrix Computations by G.H. Golub and C.F. Van Loan,
Johns Hopkins University Press, Baltimore, 1989 (2nd edition).
Introduction to computation on matrices,
SVDs, inverses, and lots more.
- Elements of Information Theory by T.M. Cover and J.A. Thomas,
John Wiley & Sons, 1991.
Introduction to everything about information theory.
- Practical Optimization by P.E. Gill and W. Murray and M.H. Wright,
Academic Press, 1981.
Introduction to numerical optimization.
Toolboxes/Software
Things we use and recommend:
Other good software.