Exercise session 4

Introduction to bioinformatics, Autumn 2008

Exercise sessions:

Remember to send your exercise notes to Lauri Eronen before your exercise session begins!

Assignments

  1. Search J = CGGTTCGTATCGTCG for matches to TTCG within one mismatch. First make a list of all possible matches. How many matches are there within the single-mismatch neighborhood of TTCG? Hint: There is one exact pattern, and there are 4×3 = 12 single-mismatch patterns (four positions, three alternative nucleotides at each).

    1. Run the nucleotide BLAST tool at NCBI against the Reference mRNA sequence database using this sequence as the query sequence. Choose to optimize for Somewhat similar sequences (blastn). Otherwise use the default parameters.

      Explain the contents of the result page in your own words. How many matches did you get? How similar were the best matches to the query sequence? How long did the query take?

    2. Run the protein BLAST tool against the Non-redundant protein sequences (nr) database using this sequence as the query sequence. Discuss the results as in 2.1.

      (This assignment uses the same query sequence as a BLAST tutorial at NCBI, which is worth going through.)

  2. Find the cycle decomposition for the permutation F = 3 1 5 2 7 4 6. Hint: the reversal distance for this permutation is 6.

  3. Simulate the reversal sort and improved breakpoint reversal sort algorithms for the following permutation: 2 3 1 4 6 5 7. Show the increasing and decreasing strips, and the number of breakpoints left at each step.

  4. Familiarise yourself with the GRIMM software and solve the following permutation with it: 8 2 7 6 5 1 4 3. Assume that this permutation corresponds to unsigned data from a linear chromosome.

  5. Calculate a genome rearrangement scenario for the human-mouse case presented in the lectures, which demonstrated how homologs from six different mouse chromosomes can be found in human chromosome 6. The picture from the lecture is below.
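
For the one-mismatch search of TTCG in J, a hand-made list can be checked with a short script. The following is a minimal sketch, not a prescribed method: the function names `neighborhood` and `mismatch_matches` are made up for illustration, and the search simply slides a window over the text and counts differing positions (Hamming distance).

```python
def neighborhood(pattern, alphabet="ACGT"):
    """Return the pattern plus every string that differs from it in exactly one position."""
    patterns = {pattern}
    for i, original in enumerate(pattern):
        for base in alphabet:
            if base != original:
                patterns.add(pattern[:i] + base + pattern[i + 1:])
    return sorted(patterns)


def mismatch_matches(text, pattern, max_mismatches=1):
    """Return (start, window, mismatches) for each window within the mismatch limit.

    Start positions are 0-based.
    """
    k = len(pattern)
    hits = []
    for start in range(len(text) - k + 1):
        window = text[start:start + k]
        # Hamming distance between the window and the pattern
        mismatches = sum(a != b for a, b in zip(window, pattern))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    J = "CGGTTCGTATCGTCG"
    print(len(neighborhood("TTCG")), "patterns in the neighborhood")
    for start, window, mismatches in mismatch_matches(J, "TTCG"):
        print(start, window, mismatches)
```

Scanning the windows directly avoids enumerating the neighborhood at search time; `neighborhood` is included only to confirm the pattern count given in the hint.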
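
For the breakpoint reversal sort simulation, the bookkeeping at each step (breakpoints and strips) can be sketched as below. This assumes the usual textbook conventions, which may differ in detail from the lecture slides: the permutation is extended with 0 in front and n+1 at the end, a breakpoint is an adjacent pair whose values differ by something other than 1, and a single-element strip counts as decreasing unless it is the 0 or n+1 element. The function names are hypothetical.

```python
def extend(perm):
    """Frame the permutation with 0 and n+1."""
    return [0] + list(perm) + [len(perm) + 1]


def breakpoints(perm):
    """Return indices i (in the extended permutation) where positions i and i+1 are not consecutive values."""
    ext = extend(perm)
    return [i for i in range(len(ext) - 1) if abs(ext[i + 1] - ext[i]) != 1]


def strips(perm):
    """Split the extended permutation into maximal strips, labelled increasing or decreasing."""
    ext = extend(perm)
    result = []
    start = 0
    for i in range(1, len(ext) + 1):
        # A strip ends at the end of the list or at a breakpoint
        if i == len(ext) or abs(ext[i] - ext[i - 1]) != 1:
            strip = ext[start:i]
            if len(strip) > 1:
                kind = "increasing" if strip[1] > strip[0] else "decreasing"
            elif strip[0] in (0, len(perm) + 1):
                kind = "increasing"   # assumed convention for the framing elements
            else:
                kind = "decreasing"   # assumed convention for single-element strips
            result.append((kind, strip))
            start = i
    return result


def reverse(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] (0-based, inclusive) reversed."""
    perm = list(perm)
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


if __name__ == "__main__":
    p = [2, 3, 1, 4, 6, 5, 7]
    print("breakpoints:", len(breakpoints(p)))
    for kind, strip in strips(p):
        print(kind, strip)
```

Calling `reverse` with a chosen segment and then re-running `breakpoints` and `strips` on the result gives the state after each step of a simulation; which reversal to pick is left to the algorithm being simulated.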