Genome-Scale Algorithm Design, 2nd edition

Welcome to the website of the book “Genome-Scale Algorithm Design”.
This page is about the 2nd edition. Information about the 1st edition is no longer updated, but is kept archived.

High-throughput sequencing has revolutionized the field of biological sequence analysis. Its application has enabled researchers to address important biological questions, often for the first time. This book provides an integrated presentation of the fundamental algorithms and data structures that power current sequence analysis workflows.

The topics covered range from the foundations of biological sequence analysis (alignments and hidden Markov models), to classical index structures (k-mer indexes, suffix arrays and suffix trees), Burrows-Wheeler indexes, graph algorithms, and a number of advanced omics applications.

The 2nd edition strengthens the toolkit by covering minimizers and other advanced data structures and their use in emerging pangenomics approaches.

The chapters feature numerous examples, algorithm visualizations, exercises and problems, each chosen to reflect the steps of large-scale sequencing projects, including read alignment, variant calling, haplotyping, fragment assembly, alignment-free genome comparison, transcript prediction, and analysis of metagenomic samples. Each biological problem is accompanied by precise formulations, providing graduate students and researchers in bioinformatics and computer science with a powerful toolkit for the emerging applications of high-throughput sequencing.

Target audience:

Graduate students in computer science with a strong interest in molecular biology. The book presents all the required biological concepts in a minimalistic, combinatorial way, omitting the description of most biochemical processes and focusing on inputs and outputs, abstracted as mathematical objects.
Graduate students in bioinformatics.
Bioinformatics practitioners who want to master the algorithmic foundations of the field.

Highlights:

Features many examples, algorithm visualizations, problems, and end-of-chapter exercises.
Describes just the minimum setup of data structures necessary to understand more advanced concepts, so that students are not burdened with technical results and can focus on more conceptual algorithm design questions.
Highlights, in dedicated frames, a number of techniques and mathematical and statistical derivations that can be of immediate use to bioinformatics practitioners.

The 2nd edition is now available in several bookstores, such as:

Cambridge University Press
Amazon.com, Amazon.co.uk
An electronic version is available at eBooks.com

Veli Mäkinen is a Professor of Computer Science at the University of Helsinki, Finland, where he heads a research team working on genome-scale algorithmics. He has taught advanced courses on algorithm design and analysis, string processing, data compression, and algorithmic genome analysis, along with introductory courses on bioinformatics.

Djamal Belazzougui is a permanent researcher at the Research Centre for Scientific and Technical Information in Algiers, Algeria. He worked in the Genome-scale algorithmics group at the University of Helsinki as a postdoctoral researcher from 2012 to 2015. His research topics include hashing, succinct and compressed data structures, string algorithms, and bioinformatics.

Fabio Cunial is a computational scientist at the Broad Institute of MIT and Harvard. He has served as a postdoctoral researcher in the Genome-scale algorithmics group at the University of Helsinki and at the Myers lab in Dresden.

Alexandru I. Tomescu is an Associate Professor of Algorithmic Bioinformatics at the University of Helsinki, Finland, where he leads a team working on graph algorithms and their application to high-throughput sequencing problems. He is a recipient of a Starting Grant from the European Research Council.

“This book is very effective in addressing its intended target of graduate students in bioinformatics or computer science with a well-structured but highly accessible description of the fundamental algorithms and data structures that power standard sequence analysis workflows.”

Romeo Rizzi, University of Verona, Italy.

See also the quotes about the 1st edition.