Biological Sequence Analysis exercises
Exercise session 1 (Tuesday 23.1.) :
- 1.2 from [1]
- 1.4 from [1]
- Prove (or disprove) θML = θMAP with an uninformative prior
- 2.1 from [1]
- 2.4 from [1]
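As a starting point for the ML/MAP exercise above, the relation between the two estimators can be written out as follows. This is only a sketch: it assumes that "uninformative prior" means a prior that is constant in θ over the whole parameter space, which is an assumption and not something the exercise text states. Whether the equality survives a change of parametrisation is part of what the exercise asks you to consider.

```latex
\begin{align*}
\theta^{\mathrm{MAP}}
  &= \arg\max_{\theta} P(\theta \mid D)
   = \arg\max_{\theta} \frac{P(D \mid \theta)\, P(\theta)}{P(D)}
   = \arg\max_{\theta} P(D \mid \theta)\, P(\theta), \\
\theta^{\mathrm{ML}}
  &= \arg\max_{\theta} P(D \mid \theta).
\end{align*}
% Assumption: P(\theta) = c > 0 for all \theta (a flat prior). Then
\begin{equation*}
\arg\max_{\theta} P(D \mid \theta)\, c = \arg\max_{\theta} P(D \mid \theta)
\quad\Longrightarrow\quad
\theta^{\mathrm{MAP}} = \theta^{\mathrm{ML}}.
\end{equation*}
```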
Exercise session 2 (Tuesday 30.1.) :
- 2.5 from [1]
- 2.8 from [1]
- 2.9 from [1]
- Explain how the PAM250 scoring matrix has been constructed (sect. 2.8 from [1])
- Explain how the BLOSUM50 and BLOSUM62 scoring matrices have been constructed (sect. 2.8 from [1])
- Describe the linear space alignment algorithm (sect. 2.6 from [1])
- 3.1 from [1]
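For the scoring-matrix exercises above, the final step of both constructions is a scaled and rounded log-odds score of the form s(a,b) = log(p_ab / (q_a q_b)). The sketch below shows only that last step. The function name and the frequencies are made up for illustration; they are not taken from the PAM or BLOSUM data. The scaling assumes the usual convention that BLOSUM62 is given in half-bit units and BLOSUM50 in third-bit units.

```python
import math


def log_odds_score(p_ab, q_a, q_b, units_per_bit=2):
    """Rounded log-odds score for aligning residues a and b.

    p_ab          -- probability of seeing a aligned with b in related sequences
    q_a, q_b      -- background frequencies of a and b
    units_per_bit -- 2 gives half-bit units, 3 gives third-bit units
    """
    return round(units_per_bit * math.log2(p_ab / (q_a * q_b)))


# Hypothetical toy frequencies (NOT real substitution data).
# The pair occurs 8 times more often than expected by chance: 3 bits.
print(log_odds_score(0.02, 0.05, 0.05, units_per_bit=2))     # half-bit units
print(log_odds_score(0.02, 0.05, 0.05, units_per_bit=3))     # third-bit units
# The pair occurs half as often as expected by chance: -1 bit.
print(log_odds_score(0.00125, 0.05, 0.05, units_per_bit=2))
```

How the target frequencies p_ab are obtained (powers of a mutation matrix for PAM, clustered blocks for BLOSUM) is the part the exercises ask you to explain.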
Exercise session 3 (Tuesday 6.2., starts immediately after the lecture at 16:05) :
- 3.2 and 3.3 from [1]
- 3.5 from [1]
- 3.8 from [1]
- 3.10 and 3.11 from [1]
- The Durbin et al. book [1] describes, on page 78, a trick
for adding log-transformed probability values quickly and
approximately correctly.
  - Explain this trick.
  - How can one generalize the trick to add more
than two numbers?
- Try the BLAST search engine (www.ncbi.nlm.nih.gov/BLAST/).
  - Search for the sequence GAATTCCAATAGA with blastn in the
yeast database (try database "nr", and under the option
"or select from" choose the Latin name "Saccharomyces cerevisiae[ORGN]").
  - Search for the sequences given in Fig. 2.1 of Durbin using
PSI-BLAST against the SWISSPROT database. Select values for
the sensitivity parameters such that you actually find something.
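For the exercise above on adding log-transformed probabilities, one common formulation of the idea is log(p + q) = log p + log(1 + exp(log q - log p)), with the larger term factored out so that the exponential cannot overflow. The sketch below uses exact library calls instead of a lookup table for log(1 + exp(x)), so it illustrates the identity, not the fast approximate variant. The function names are chosen here for illustration.

```python
import math


def log_add(log_p, log_q):
    """Return log(p + q) given log p and log q."""
    # Make log_p the larger value so the exponent below is <= 0.
    if log_p < log_q:
        log_p, log_q = log_q, log_p
    if log_q == -math.inf:  # adding a zero probability changes nothing
        return log_p
    return log_p + math.log1p(math.exp(log_q - log_p))


def log_sum(log_values):
    """Return log(sum of p_i) given the values log p_i (more than two terms)."""
    log_values = list(log_values)
    largest = max(log_values)
    if largest == -math.inf:  # all probabilities are zero
        return largest
    # Factor out the largest term; every remaining exponent is <= 0.
    return largest + math.log(sum(math.exp(v - largest) for v in log_values))


print(log_add(math.log(0.2), math.log(0.3)))   # close to log(0.5)
print(log_sum([-1000.0, -1000.0, -1000.0]))    # close to -1000 + log(3)
```

The second call shows why the factoring matters: exp(-1000) underflows to zero in double precision, so summing the plain probabilities would give log(0).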
Exercise session 4 (Tuesday 13.2., starts immediately after the lecture at 16:05) :
- 4.1 from [1] (page 86)
- 4.2 from [1] (page 86)
- Describe the structure of a pair-HMM that corresponds
to the gap model with *linear* gap penalties.
- One wants to list (global) pairwise alignments of two
sequences in descending order of the score of the alignment,
starting from the highest scoring alignment. How can you do
this? Sketch an algorithm.
- Try the HMMER tool (http://hmmer.janelia.org/).
Produce a profile-HMM from the alignment given in
Figure 5.3 of Durbin. Is the resulting HMM any good?
(More guidance on using HMMER is available here.)
Exercise session 5 (Tuesday 20.2.) :
- Estimate, for the HMM in Fig. 5.4,
the transition probabilities
from state M3 to state I3 and
from state M3 to state D4, using
the alignment of Fig. 5.3.
- What is the value given to S1 when the MAP model construction
algorithm is applied on the alignment of Fig. 5.7?
- 6.1 from [1] (page 142)
- Explain how the MSA algorithm works (Durbin pp. 142-143 [1])
- Explain how simulated annealing can be used with
the Baum-Welch (BW) algorithm (Durbin pp. 155-156 [1])
- Try the CLUSTAL program for constructing multiple alignments (more guidance:
www.cs.helsinki.fi/u/prastas/clustal.html)
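For the transition-probability exercise at the start of this session, the estimate is a normalised count: the number of times a transition is used in the alignment divided by the total number of transitions leaving the same state, optionally with pseudocounts added. The sketch below shows that computation. The counts are hypothetical placeholders, not the counts from Fig. 5.3, and the choice of pseudocount is an assumption you should check against the figure.

```python
def transition_estimates(counts, pseudocount=0.0):
    """Estimate transition probabilities out of one state.

    counts      -- dict mapping target state to observed transition count
    pseudocount -- constant added to every count (1.0 gives Laplace's rule)
    """
    total = sum(counts.values()) + pseudocount * len(counts)
    if total == 0:
        raise ValueError("no transitions observed and no pseudocounts given")
    return {state: (c + pseudocount) / total for state, c in counts.items()}


# Hypothetical counts of transitions leaving M3 (NOT read off Fig. 5.3).
counts_from_m3 = {"M4": 6, "I3": 1, "D4": 0}
probs = transition_estimates(counts_from_m3, pseudocount=1.0)
print(probs["I3"], probs["D4"])
```

With a pseudocount of zero the function returns the plain maximum likelihood estimate, which assigns probability zero to any transition that does not occur in the alignment.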
Exercise session 6 (Tuesday 27.2.) :
- 7.1 from [1] (page 164)
- 7.6 from [1] (page 168)
- 7.8 from [1] (page 169)
- 7.11 from [1] (page 176)
- Apply the neighbour-joining algorithm to construct a phylogenetic
tree for a dataset of 4 nodes whose pairwise distances are
the additive distances between the four leaves in Figure 7.7
of Durbin [1], except that the length of the edge leading to
leaf 4 is 0.6 (instead of 0.4).
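For the neighbour-joining exercise above, the following is a minimal sketch of the algorithm that can be used to check a hand computation. The distance matrix is made up for illustration (it is additive on a tree with leaf edges 2, 3, 4, 5 and an internal edge of length 1) and is not the matrix from Figure 7.7. The function name, the leaf labels and the internal node names are likewise arbitrary choices.

```python
def neighbour_joining(names, dist):
    """Build an unrooted tree by neighbour-joining.

    names -- list of leaf labels
    dist  -- dict of dicts with symmetric pairwise distances
    Returns a list of edges (node, node, length).
    """
    d = {a: dict(row) for a, row in dist.items()}  # work on a copy
    active = list(names)
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        # r_i: average distance from i to the other active nodes.
        r = {i: sum(d[i][k] for k in active if k != i) / (n - 2) for i in active}
        # Choose the pair minimising D_ij = d_ij - (r_i + r_j).
        i, j = min(
            ((a, b) for idx, a in enumerate(active) for b in active[idx + 1:]),
            key=lambda p: d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        k = f"node{next_id}"
        next_id += 1
        # Edge lengths from i and j to the new internal node k.
        d_ik = 0.5 * (d[i][j] + r[i] - r[j])
        edges.append((i, k, d_ik))
        edges.append((j, k, d[i][j] - d_ik))
        # Distances from k to every remaining node.
        d[k] = {}
        for m in active:
            if m not in (i, j):
                d[k][m] = d[m][k] = 0.5 * (d[i][m] + d[j][m] - d[i][j])
        active = [m for m in active if m not in (i, j)] + [k]
    a, b = active
    edges.append((a, b, d[a][b]))  # join the last two nodes
    return edges


# Hypothetical additive distances (NOT those of Figure 7.7).
pairs = {("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 8,
         ("B", "C"): 8, ("B", "D"): 9, ("C", "D"): 9}
names = ["A", "B", "C", "D"]
dist = {a: {} for a in names}
for (a, b), value in pairs.items():
    dist[a][b] = dist[b][a] = value

for edge in neighbour_joining(names, dist):
    print(edge)
```

Because the toy distances are additive, the recovered edge lengths match the tree they were generated from. Ties in D_ij are broken here by taking the first minimal pair, which is an arbitrary choice.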
Pasi Rastas
Last modified: Tue Feb 20 14:17:28 EET 2007