582481 Causal analysis (4 - 6 cr)
3.9.2008 - 17.10.2008 + optional project work after the course
This course probes the main problems of causal analysis: identifying cause and effect, and using such knowledge for prediction and, in particular, for decision-making. See below for a more detailed description of the course schedule and contents.
News
Announcements etc. will be placed here. They will also be emailed to registered course participants.
11.11.2008 - Some basic statistics on the course this autumn are as follows:
Originally registered for course: 27
Followed the lectures: approx. 15
Took the exam 17.10.2008: 19
Passed the exam 17.10.2008: 13
Currently participating in the project work: 15
4.11.2008 - Please remember to give feedback for the course!
4.11.2008 - The results from the Oct 17, 2008 exam are now available in the department intranet. See the email sent today to course participants for more information.
24.10.2008 - The project work description is now available. If you have informed me that you are taking part in the project work you should have received an email from me with the URL of your data.
(Older and no longer relevant announcements have been deleted.)
Time, place, and language
Autumn 2008, first period (+ optional project work in the second period).
Lectures:
D.Sc.(Tech) Patrik Hoyer
Wed 10-12, Fri 10-12 (Exactum C221)
(See full schedule below!)
Exercise sessions:
M.Sc.(Tech) Antti Hyttinen
Tue 14-16 (Exactum, C220)
(See full schedule below!)
Exam: (preliminary information, subject to change)
Friday 17.10.2008 at 9.00-12.00 in Exactum B123.
Please bring writing utensils and a calculator.
No other material allowed.
(In addition, there are separate exams on 18.11.2008, 6.2.2009, 27.3.2009, and 16.6.2009. If you are planning to attend one of these separate exams, please register a couple of weeks before the exam using the department's electronic registration system.)
Full schedule: (L = lecture, E = exercises)
        tue      wed      fri
        14-16    10-12    10-12
        C220     C221     C221
sept    ----     3 L      5 L
        9 E      10 L     12 L
        16 E     17 L     19 L
        23 E     24 L     26 L
        30 E     1 L      3 L
oct     7 E      ----     10 L
        ----     ----     17 Exam (at 9.00-12.00, Exactum B123)
Language:
Everything (i.e. lectures, exercise sessions, lecture notes, exercises and solutions,
other required material, exam,
and optional project work) will be in English. Those who prefer to do so may answer the
exam questions (and complete the project work) in Finnish or Swedish.
Causality?
"In the last decade, owing partly to advances in graphical models, causality has undergone a major
transformation: from a concept shrouded in mystery into a mathematical object with well-defined
semantics and well-founded logic. Paradoxes and controversies have been resolved, slippery concepts
have been explicated, and practical problems relying on causal information that long were regarded
as either metaphysical or unmanageable can now be solved using elementary mathematics. Put simply,
causality has been mathematized."
- Judea Pearl, in the preface to his book 'Causality'
Cause-effect relationships (causal relationships) are an integral part of our worldview. They are often the targets of scientific investigation, but they are also part of our everyday language. We are interested in knowing if carbon dioxide causes global warming, if smoking leads to cancer, and if cheaper liquor increases the number of people who drown in summer. Also when we ask, "would I have been able to catch the bus, had I run?" or "will people emigrate if we raise the taxes?", the answers depend on what causal relationships exist in our world.
Causality is one of the oldest concepts in philosophy. Various definitions of what constitutes a cause-effect relationship have been put forth, and the question of whether (and when) one can actually identify such a relationship has been studied extensively. One of the main problems has been how to take uncertainty into account. For example, although excessive smoking does not necessarily lead to cancer, we would still like to say that smoking is a cause (among many) of cancer if it significantly increases the probability of getting ill.
Probability theory (and statistics) is the language with which present-day science
describes uncertainty. But it should be noted that this language does not, in its
traditional form, include any references to causality. We can express that certain
events happen together, and say how frequently they happen, but we cannot describe that
one event actually causes another. This is changing, as in the last 15-20 years
a lot of work has been done on extending the standard notation and methods to take
into account cause-effect relationships. Two recent books on the subject are
- Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
- Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation, Prediction, and Search. Springer.
This course describes the basic ideas of these, and other, recent contributions. All the necessary material can be found in the lecture slides and in the other required readings, which are available online, so there is no need to buy either of the books.
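The distinction drawn above, between events that merely happen together and one event actually causing another, can be illustrated with a small simulation. The following is only a sketch under made-up assumptions: the toy model (a hidden common cause Z that influences both X and Y, with X having no effect on Y), the probabilities, and the function names are all hypothetical and are not taken from the course material. It compares what we see when we passively observe X with what happens when we set X ourselves.

```python
import random


def sample(rng, do_x=None):
    """Draw one (x, y) pair from a toy model in which Z causes both X and Y.

    X has no causal effect on Y. If do_x is given, X is set by intervention
    and no longer depends on Z. All probabilities are illustrative only.
    """
    z = rng.random() < 0.5                      # hidden common cause
    if do_x is None:
        x = rng.random() < (0.8 if z else 0.2)  # X depends on Z
    else:
        x = do_x                                # X forced from outside
    y = rng.random() < (0.7 if z else 0.1)      # Y depends on Z only
    return x, y


def prob_y_given_x(pairs, x_value):
    """Relative frequency of Y being true among pairs with X == x_value."""
    ys = [y for x, y in pairs if x == x_value]
    return sum(ys) / len(ys)


def run(n=100_000, seed=0):
    """Return (observational difference, interventional difference) in P(Y)."""
    rng = random.Random(seed)
    observed = [sample(rng) for _ in range(n)]
    seen_diff = prob_y_given_x(observed, True) - prob_y_given_x(observed, False)
    forced_1 = [sample(rng, do_x=True) for _ in range(n)]
    forced_0 = [sample(rng, do_x=False) for _ in range(n)]
    done_diff = prob_y_given_x(forced_1, True) - prob_y_given_x(forced_0, False)
    return seen_diff, done_diff


if __name__ == "__main__":
    seen_diff, done_diff = run()
    print(f"difference when observing X: {seen_diff:.3f}")
    print(f"difference when setting X:   {done_diff:.3f}")
```

With these assumed numbers, the observational difference should come out near 0.58 - 0.22 = 0.36, while the interventional difference should be close to zero: X and Y are clearly associated, yet changing X does nothing to Y. Standard probability notation describes the first quantity; the extensions discussed in the course are needed to express the second.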
Prerequisites
To understand all the material, you need to have a good grasp of the basics of probability theory. Some basic concepts of linear algebra and linear regression will also be needed. In the beginning of the course we will very quickly go through the needed concepts so as to refresh your memory (and fix the notation) but unless these topics are familiar to you it might be hard to follow the course.
Previous exposure to graphs and in particular Bayesian networks would be helpful, but is definitely not required.
Please try this simple quiz to see how well you know/remember the prerequisites. Note that the quiz is completely anonymous and the results are for your information only; passing the quiz is not necessary for taking the course!
I encourage those who are interested in the subject of causality (but do not fulfill the prerequisites for the course) to have a look at the free and open online courses 'Causal and Statistical Reasoning' and 'Empirical Research Methods' described below under the heading 'Supplementary material'.
Lecture slides & other required (and recommended) readings
The lecture slides will appear here, as well as all other required material. My intent is that the slides will be here at least a day prior to the lecture, so that students may print them and have them available during the lecture.
Lecture slides:
A. Introduction and practical arrangements (3.9.2008)
B. Prerequisites (5.9.2008)
C. Causal models (10.9.2008)
D. Do-calculus (12.9.2008)
E. Linear SEMs (17.9.2008)
[Summary of D & E and examples; example in R] (19.9.2008)
F. Counterfactuals (24.9.2008)
G. Model learning 1 (26.9.2008 and 1.10.2008)
H. Model learning 2 (1.10.2008)
I. Model learning 3 (3.10.2008)
J. Conclusions (10.10.2008)
Other required readings:
(If some of the links stop working, please inform the lecturer immediately!)
[Note: Material in brackets is recommended, not required.]
- 3.9.2008:
Pearl's preface to his book 'Causality'
Preface and Chapter 1 of Dawid's Fundamentals of Statistical Causality.
- 5.9.2008:
Chapter 1, part 1 of Pearl's book.
- 10.9.2008:
Chapter 1, part 2 and chapter 1, part 3 of Pearl's book.
[Chapters 2 and 3 of Dawid's Fundamentals of Statistical Causality.]
- 12.9.2008:
Chapter 3, preface,
chapter 3, part 1,
chapter 3, part 2,
chapter 3, part 3,
chapter 3, part 4, and
chapter 3, part 5 of Pearl's book.
- 17.9.2008:
Chapter 5, part 1, and
chapter 5, part 3 of Pearl's book.
Pages 1-9 of Spirtes: The limits of causal inference from observational data
- 19.9.2008:
No required readings for this lecture.
- 24.9.2008:
Chapter 1, part 4 of Pearl's book,
an email exchange between Pearl and myself (Feb 2006), and
sections 4 and 5 of Glenn Shafer's paper Comments on "Causal Inference without Counterfactuals" by A.P. Dawid.
[Chapters 2 and 3 of Dawid's Fundamentals of Statistical Causality.]
- 26.9.2008:
Chapter 2 (but not sections 2.6-2.8) of Pearl's book, and
'An introduction to Causal Inference' by Richard Scheines.
['An overview of the representation and discovery of causal relationships using Bayesian networks' by Greg Cooper.]
- 1.10.2008:
Wikipedia entry on Pearson's chi-squared test.
- 3.10.2008:
Shimizu et al. (2005): Discovery of non-gaussian linear causal models using ICA.
Supplementary material
Here I will list material which is supplementary and not required for the exam, but hopefully helpful nevertheless.
- The free online courses Causal and Statistical Reasoning and Empirical Research Methods of the Open Learning Initiative of Carnegie Mellon University are excellent introductions to the topic of causation vs association. The former is a very basic course that requires practically no prior knowledge of statistics; the latter is a more applied course tailored to practical data analysis, regression in particular. This online material should be useful for those who do not fulfill the prerequisites of my course (or for some other reason cannot take part in it) but nevertheless are interested in understanding the basics of what it is all about.
Exercise sets
Exercise sets will be placed here. They should be here no later than the Thursday of the week before the Tuesday on which they are gone through in class, so that students have enough time to attempt to solve them on their own. Answers will be posted only after they have been discussed in the exercise sessions.
Exercises 1 (9.9.2008): prerequisites
- Solutions 1 and extras
Exercises 2 (16.9.2008): DAGs, dsep, indep
- Solutions 2
Exercises 3 (23.9.2008): Do-calculus, linear models
- Solutions 3
Exercises 4 (30.9.2008): Counterfactuals
- Solutions 4
Exercises 5 (7.10.2008): Model learning
- Solutions 5
Project work
The optional project work description is available here. The data to be analyzed is individual, and the directory for each student has been sent by email to the students taking part in the project work. The deadline for returning the project work is November 30, 2008, at 23:59 Helsinki time. Please see the project work description for more detailed information.
Enrolling
Please register for the course using the department enrollment system. (Note: 'English' button at top-right.)
If you are not a student at Helsinki University and you would like to take part in the class, please contact the lecturer, either by email or at the first lecture.
Passing the course
The course (4 cr) is passed by successfully taking the exam, which tests how well the students have understood the lecture slides, the other required readings, and the exercises. Attending the lectures and the exercise sessions is voluntary but highly recommended.
By successfully completing the project work you may obtain an additional 2 credits (i.e. making the course worth a total of 6 cr). The project work will be made available at the end of the lectures (mid-October) and the deadline for finishing it will be the end of November. The project will consist of the practical analysis, using various existing software, of data made available by the lecturer.
If for some reason you cannot take part in the first exam (17.10.2008), you may nevertheless complete the project and then take the exam at one of the later exam dates. Note, however, that no credits will be awarded until the exam is successfully passed. To obtain the full 6 credits you need to successfully pass both the exam and the project work.
Note that if as a result of a weak project work you would receive a lower combined grade (6 credits) than your grade based on the exam only (4 credits), you will be able to choose which you prefer. There is thus absolutely no risk in attempting the project work!
Questions?
If you have any questions, don't hesitate to contact the lecturer Patrik Hoyer.
Anonymous feedback to the lecturer
You may provide anonymous feedback to the lecturer at any time by filling out this form.

