Seminar: High-throughput Sequencing Data Analysis

58312301
3
Bioinformatics
Advanced studies
Year  Semester  Date          Period  Language  In charge
2013  spring    14.01-22.04.  3-4     English   Leena Salmela

Lectures

Time       Room  Lecturer       Date
Mon 14-16  C220  Leena Salmela  14.01.2013-18.02.2013
Mon 14-16  C220  Leena Salmela  11.03.2013-22.04.2013
Mon 14-16  C220  Leena Salmela  29.04.2013-29.04.2013

General

The first session will be on Jan. 14th.

We have entered the post-genomic era: the genomes of many species have now been sequenced. But having the genome sequence is not enough; we also need to understand how the genome works. Large international projects such as ENCODE are currently attempting to answer this question using high-throughput sequencing.

The first step in analyzing high-throughput sequencing data is often mapping the reads to a reference. Mapping is only a starting point, and the aim of this seminar is to become familiar with data analysis methods beyond mapping. These include variation calling (identifying SNPs, structural variation, copy number variation, etc.), the analysis of epigenetic data (methylation studies, histone modifications) and the analysis of metagenomic data. Below are some possible topics to choose from. Students are also welcome to propose their own topics.

Topics

  1. Epigenetics
    1. DNA Methylation (Marcus, peer reviewers: Chengyu, Ni)
    2. Histone modifications (Chengyu, peer reviewers: Marcus, Elina)
  2. Metagenomics
    1. Determining species composition (Elina, peer reviewers: Ni, Alejandra)
  3. Assembly
    1. Haplotype assembly (Roman, peer reviewers: Elina, Diyu, Chengyu)
    2. Utilizing single molecule sequencing technologies in de novo assembly (e.g. PacBio) (Diyu, peer reviewers: Roman, Chengyu)
  4. RNAseq
    1. Gene expression analysis (Kristiina, peer reviewers: Alejandra, Marcus)
    2. Inferring transcripts (Alejandra, peer reviewers: Kristiina, Roman)
  5. Variation calling
    1. SNP calling (Ni, peer reviewers: Kristiina, Diyu, Roman)
    2. Structural variation
    3. Copy number variation

Schedule

  • 14.1. Introduction. Choosing topics and presentation dates
  • 21.1. and 28.1. One-to-one meetings with students
    • Before this meeting, prepare a short description of the topic (at most one paragraph of text) and find possible references to study
    • Kristiina: 21.1. in C220 14:15
    • Elina: 21.1. in C220 15:30
    • Diyu: 28.1. in B230 11:30
    • Roman: 28.1. in C220 14:15
    • Marcus: 28.1. in C220 14:45
    • Alejandra: 28.1. in C220 15:15
    • Ni: 29.1. in B230 13:00
  • 18.2. Deadline for the first draft of the report
    • The first draft should have at least 3-4 pages of text
    • Email the draft to the instructor and your peer reviewers
  • 25.2. Comments on the first draft
    • Email your comments to the instructor and the author of the report
  • Presentations in the 4th period:
    • 11.3. (no presentations)
    • 18.3. Elina
    • 25.3. Roman
    • 8.4. Ni and Chengyu
    • 15.4. Cancelled
    • 22.4. Diyu and Marcus
    • 29.4. Alejandra and Kristiina

The final report is due on the Thursday before your presentation (email it to all seminar participants).

Instructions on Writing the Report

A typical report would have:

  • An introduction
  • Problem definition. What is the problem that the tool/method/algorithm solves? How is it motivated by biology?
  • A brief overview of related methods. Don't overdo this. Just mentioning the most relevant related methods is enough.
  • Explanation of how the tool/method/algorithm solves the problem.
  • A brief summary of results presented in the literature. How good is the tool/method/algorithm?

The length of the report should be 6-8 pages, including references, illustrations, etc.

Instructions on Preparing the Presentation

The presentation should give an overview of your topic. You will not have time to explain all the details, so concentrate on what is essential.

You should prepare slides for your presentation. There is a computer in the classroom that can be used for showing the slides. You can also bring your own laptop.

The presentation should last 30 minutes, followed by 10 minutes for questions.

Completing the course

The seminar consists of the following parts:

  • Write a report of 6-8 pages on a chosen topic
  • Give a presentation on the topic
  • Give written feedback to two other students on their reports
  • Attend at least 80% of the seminar sessions

Literature and material

  • Review on sequencing technologies:
    M.L. Metzker: Sequencing technologies - the next generation. Nature Reviews Genetics 11:31-46, 2010.

  • Review on third-generation sequencing technologies:
    E.E. Schadt, S. Turner, A. Kasarskis: A window into third-generation sequencing. Human Molecular Genetics 19(R2):R227-R240, 2010.