Modelling and Analysis in Bioinformatics
Year | Semester | Date | Period | Language | Person in charge |
---|---|---|---|---|---|
2015 | autumn | 26.10-10.12. | 2-2 | English | Veli Mäkinen |
Lectures
Time | Room | Lecturer | Date |
---|---|---|---|
Mon 12-14 | B119 | Veli Mäkinen | 26.10.2015-10.12.2015 |
Thu 10-12 | B119 | Veli Mäkinen | 26.10.2015-10.12.2015 |
Exercise groups
Time | Room | Instructor | Date | Notes |
---|---|---|---|---|
Thu 12-14 | B221 | Veli Mäkinen | 26.10.2015-04.12.2015 | |
General
The course explores computational models for biological networks, such as network motifs and gene regulation, and introduces probabilistic analysis of sequence-level problems in fragment assembly, pattern matching, and motif discovery. The course is lectured by Leena Salmela, Antti Honkela, and Veli Mäkinen.
Completing the course
The course consists of lectures, study groups and programming exercises. Attendance at the study groups and visiting lectures is mandatory. If you cannot attend a study group or a visiting lecture, contact the lecturers for an alternative assignment. The programming exercises are done in Python.
UPDATE: Lectures for the last two weeks will be cancelled. The corresponding exercises will be replaced by an additional learning diary on the visiting lectures.
Schedule
- 26.10.-30.10. Global network models (Salmela)
  - Monday 26.10. Lecture [Slides]
  - Thursday 29.10. 10-12 Study group on properties of ER graphs (material: Blum, Hopcroft, Kannan: Foundations of Data Science, Chapter 4 Random graphs)
    - Everybody reads the beginning of section 4.2: pages 77-79
    - Group 1: Threshold for diameter two: pages 79-82
    - Group 2: Disappearance of isolated vertices and Hamilton circuits: pages 82-84
    - Group 3: Full connectivity: pages 100-102
  - Thursday 29.10. 12-14 Exercise session
    - Exercise sheet (sneak peek in HTML)
    - Deadline: 5.11.
    - More information on completing the exercise and IPython Notebook below
- 2.11.-6.11. Network motifs (Salmela)
  - Monday 2.11. Lecture [Slides]
  - Thursday 5.11. 10-12 Study group on algorithms for finding network motifs
    - Group 1 (Students whose first name starts with A-L):
      - F. Schreiber and H. Schwöbbermeyer: Frequency concepts and pattern detection for the analysis of motifs in networks. Trans. on Comput. Syst. Biol: III, pp. 89--104, 2005.
      - Concentrate on section 4.
    - Group 2 (Students whose first name starts with M-Z):
      - N. Kashtan, S. Itzkovitz, R. Milo and U. Alon: Efficient sampling algorithm for estimating subgraph concentrations and detecting network motifs. Bioinformatics 20(11):1746--1758, 2004.
      - Concentrate on the Methods section.
  - Thursday 5.11. 12-14 Exercise session
    - Exercise sheet (sneak peek in HTML)
    - Deadline: 13.11.
- 9.11.-13.11. Biology of gene regulation, simulating gene regulation (Honkela)
  - Monday 9.11. Lecture [Slides]
  - Thursday 12.11. 10-12 Study group on algorithms for simulating biochemical reactions
    - Papers:
      - Gillespie D. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry 81(25):2340--2361, 1977.
      - Gillespie D. The chemical Langevin equation. The Journal of Chemical Physics 113(1):297--306, 2000.
    - Tasks:
      - Group 1: Read Gillespie (1977), especially Secs. I, IIIB, IIIC
      - Group 2: Read Gillespie (2000), especially Secs. I, II, III
      - Group 3: Read Gillespie (2000), especially Secs. I, II, IV
  - Thursday 12.11. 12-14 Exercise session
    - Exercise sheet (sneak peek in HTML)
    - Deadline: Friday 20 November
- 16.11.-20.11. Gene regulatory network inference (Honkela)
  - Monday 16.11. Lecture [Slides]
  - Thursday 19.11. 10-12 Study group on gene regulatory network inference
    - Paper: D. Marbach et al. Wisdom of crowds for robust gene network inference. Nature Methods 9(8):796-804 (2012).
    - Tasks:
      - Read the paper to form an overview of the topic
      - It is not necessary to understand all the details!
  - Thursday 19.11. 12-14 Exercise session
    - Exercise sheet (sneak peek in HTML)
    - Deadline: Friday 27 November
- 23.11.-27.11. Visiting lecturers:
  - Mon 12-13 Merja Oja: "Metabolic modelling in industrial biotechnology"
    - Additional reading: Orth, J.D., Thiele, I., & Palsson, B.Ø. (2010). What is flux balance analysis? Nat Biotechnol, 28(3), 245-248.
  - Mon 13-14 Juho Rousu: "Metabolite Identification through Machine Learning"
    - Additional reading: Dührkop, K., Shen, H., Meusel, M., Rousu, J., & Böcker, S. (2015). Searching molecular structure databases with tandem mass spectra using CSI: FingerID. Proceedings of the National Academy of Sciences, 112(41), 12580-12585.
  - Thu 10-11 Manu Tamminen: "Why networks are useful in microbiology?"
  - Thu 11-12 Harri Lähdesmäki: "High-resolution models for transcription factor binding and transcriptional regulation"
- 30.11.-4.12. CANCELLED: Modelling genomes, random projections for motif discovery (Mäkinen)
- 7.12.-11.12. CANCELLED: Modelling sequencing, analysing complexity of fragment assembly (Mäkinen)
Exercises
You can work on the exercises alone or in pairs. Submit your solutions as an .ipynb file using Moodle.
The exercises consist of small programming projects in Python. We will use Python version 3 for the exercises. Exercises are given as IPython Notebook documents that you should complete to include your solutions. The IPython Notebook environment is preinstalled on the Linux workstations and you can also install it on your own computer. To get started with the exercises:
- Create a directory for your notebooks
- Copy the exercise file into that directory
- Open a terminal and move to the directory
- Run 'ipython3 notebook'
This will start the IPython Notebook system and open a web browser in which you can start working on the exercises. When you are done, close the web browser and press Ctrl-C twice in the terminal window to shut down the environment.
Grading
To pass the course:
- Attend study groups and visiting lectures
- Submit the programming exercises and get at least 6 points in each of the three exercise sets (network models, gene regulation, probabilistic analysis of sequence-level problems)
- UPDATE: To replace the exercises for the last two weeks, you will also need to write a learning diary on the visiting lectures. (More instructions later.)
The course is graded on a scale of 1-5. Grading is based on the submitted programming exercises. In total, 40 (was: 60) points are available. To pass the course you must get at least 20 (was: 30) points, and a grade of 5 requires 34 (was: 50) points. If the exercises prove to be very difficult, these limits may be lowered.
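As a rough illustration, the point limits above can be written as a small Python check. This is only a sketch: the function name `course_outcome` and its parameters are hypothetical, the boundaries for grades 1-4 are not stated on this page and are therefore left open, and the attendance and learning diary requirements are not modelled.

```python
def course_outcome(set_points, min_per_set=6, pass_total=20, top_total=34):
    """Classify a result using only the point limits stated on this page.

    set_points: points earned in each exercise set (hypothetical input format).
    The default limits are the updated ones (40 points available in total).
    """
    total = sum(set_points)
    # Failing either the per-set minimum or the overall minimum means no pass.
    if any(points < min_per_set for points in set_points) or total < pass_total:
        return "fail"
    if total >= top_total:
        return "grade 5"
    # The page does not give the boundaries between grades 1 and 4.
    return "pass (grade 1-4, boundaries not specified)"


if __name__ == "__main__":
    print(course_outcome([10, 12]))  # 22 points in total
    print(course_outcome([5, 20]))   # one set below the 6-point minimum
    print(course_outcome([17, 17]))  # 34 points in total
```

Since the limits may be lowered if the exercises prove very difficult, they are passed as parameters rather than hard-coded.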
The course does not include an exam and it is not possible to pass the course with a separate exam.