University of Helsinki Department of Computer Science
 

58305112 Research Seminar on Data Analysis for Bioinformatics (2 cu)

Samuel Kaski

20.01.-28.04. Fri 12-14, lecture hall B119

Language: English

Note: time and location were changed after the first session!

TODO (dated 10.2.):

  • For all: Check your entries in the table below, try to exchange them with somebody if the dates are impossible, and tell Sami the result.
  • For all: Email Sami a brief (half a page) preliminary description of your presentation topic ASAP.

Wk | Date | Theme | Brought to you by | Opponent | Notes by | @
3 | 20.1. | Administrative | Sami | | |
4 | 28.1. | Ch 2&3 | Jaana Heino | All | Katja Astikainen, Riikka Kaven |
5 | 4.2. | Ch 4 | Heli Borg | Margus Lukk | Morris Michael | helsinki.fi
6 | 11.2. | Ch 5 | Rashi Gupta | Morris Michael | Jaana Heino | helsinki.fi
7 | 18.2. | Promoter analysis in silico | Udyant Kumar | Jaana Heino | Ville Mäkinen | helsinki.fi
8 | 25.2. | Ch 6 | Katja Astikainen, Riikka Kaven | Heli Borg | Rashi Gupta | helsinki.fi
9 | 4.3. | Ch 8 | Morris Michael | Rashi Gupta | Heli Borg | rni.helsinki.fi
10 | 11.3. | Genome-wide scan with SNP markers | Tero Hiekkalinna | Paula Silvonen | Sarish Talikota | helsinki.fi
11 | 18.3. | Large p small n | Ville Mäkinen | Sarish Talikota | Paula Silvonen | helsinki.fi
12 | 25.3. | No meeting | | | |
13 | 1.4. | Proteomics and its mass spectrometric applications | Reetta Nylund | Abhishek Tripathi | Tero Hiekkalinna |
14 | 8.4. | Homology modeling for proteins | Abhishek Tripathi | Reetta Nylund | Udyant Kumar | iki.fi
15 | 15.4. | Metabolomics | Sarish Talikota | Ville Mäkinen | Margus Lukk |
16 | 22.4. | Regulatory mechanisms in cell, searching regulatory factors for genes | Margus Lukk | Udyant Kumar, Tero Hiekkalinna | Reetta Nylund | , helsinki.fi
17 | 29.4. | Similarity measures in clustering time series data | Paula Silvonen | Katja Astikainen, Riikka Kaven | Abhishek Tripathi | helsinki.fi

(The last column is useful for guessing the e-mail addresses of the opponents.)

Theme: Analysis of high-throughput genomic data

Modern high-throughput measurement techniques and new modeling methods are revolutionizing biology and medicine, and they make possible new approaches sometimes called systems biology. The goal of this seminar is to bring together advanced graduate students and doctoral students interested in analyzing and modeling microarray data and other so-called high-throughput data sets and their combinations. The topics of the presentations may range from preprocessing of microarray data to advanced modeling and machine learning, with applications in bioinformatics.

The seminar series has two parts:

  1. Standard seminar. We will start by going through parts of the book Baldi and Hatfield: DNA microarrays and gene expression. From experiments to data analysis and modeling. The book is a good and readable introduction. These presentations are suitable for (relative) beginners in the field.
  2. Research seminar. In these presentations researchers / doctoral students in the field will discuss their own research problems. Topics for beginning researchers will be settled together. Additionally, each student will be allocated a related topic/paper, and they are expected to discuss the relationship of their own work with the proposed one.

Possible topics include:

  • Microarrays
  • Gene expression
  • Proteomics
  • Metabolomics

  • Data mining
  • Machine learning
  • Probabilistic modeling
  • Information-theoretic modeling

  • Pattern discovery
  • Component models
  • Clustering
  • Bayesian approaches
  • Kernel methods
  • Spectral methods

Goal:

Three subgoals:

  1. Learn some basics and get an overview of the field.
  2. Get new ideas and insights through cross-methodological discussion and literature study.
  3. Get feedback on one's own research (postgrads) or a possible research topic (newcomers).

Format:

One lecture/week, possibly given in pairs. An opponent will be allocated to each presentation.

People who do not need credit points are also very welcome to attend and participate in the discussion.

Passing the course:

  1. Participate actively. Read the material beforehand.
  2. Give a presentation (about 1 hour + 30 min discussion). Email preliminary versions of the slides to the opponent one week before your presentation, and almost-final versions to everybody two days before, at the latest.
  3. Be an opponent once. The opponent is supposed to keep up the discussion when the others do not have questions.
  4. Write brief lecture notes of one presentation: an understandable textbook-level description of the main points in the presentation. Return them in electronic form within one week of the presentation. The others may comment on the lecture notes at the beginning of the next session. The format can be anything; optional templates are available on the web (for instance here, and somewhat lengthy models here). A few pages are enough.

Required prior knowledge:

Something about bioinformatics, and preferably about high-throughput measurement data as well.

Basics of data analysis and probability.

Samuel Kaski
Last modified: Thu May 26 09:54:05 EEST 2005