Exploration and retrieval of whole-metagenome sequencing samples

Event type: HIIT seminar
Event time: 25.10.2013 - 10:15 - 11:00
Lecturer: Sohan Seth
Place: Exactum, B119
In recent years, the field of whole-metagenome shotgun sequencing has grown significantly, driven by next-generation sequencing technologies that allow genomic samples to be sequenced more cheaply, more quickly, and with better coverage than before. This technical advance has started a trend of sequencing multiple samples under different conditions or in different environments to explore the similarities and dissimilarities of microbial communities. Examples include the Human Microbiome Project and various studies of the human intestinal tract. With ever larger databases of such measurements becoming available, finding samples similar to a given query sample is becoming a central operation. In this paper, we develop a content-based retrieval method for whole-metagenome sequencing samples. We apply a distributed string mining framework to efficiently extract all informative sequence k-mers from a pool of metagenomic samples, and use them to measure the dissimilarity between two samples. We evaluate the performance of the proposed approach on two human gut metagenome data sets and observe significant enrichment for diseased samples in the results of queries with another diseased sample.
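To make the idea of k-mer-based retrieval concrete, the following is a minimal sketch and not the method presented in the talk. It assumes a single fixed k, counts every k-mer rather than selecting informative ones, runs on one machine instead of a distributed string mining framework, and uses Bray-Curtis dissimilarity as a stand-in measure. The function names (`kmer_profile`, `bray_curtis`, `retrieve`) and the toy reads are hypothetical.

```python
from collections import Counter


def kmer_profile(reads, k):
    """Relative frequencies of all length-k substrings in a sample's reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two k-mer frequency profiles.

    Assumption: this measure is chosen for illustration only.
    """
    keys = set(p) | set(q)
    num = sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)
    den = sum(p.get(x, 0.0) + q.get(x, 0.0) for x in keys)
    return num / den if den else 0.0


def retrieve(query_reads, database, k=4):
    """Rank database samples by increasing dissimilarity to the query sample."""
    q = kmer_profile(query_reads, k)
    scored = [
        (bray_curtis(q, kmer_profile(reads, k)), name)
        for name, reads in database.items()
    ]
    return sorted(scored)


# Toy data (hypothetical): two tiny "samples" of short reads.
database = {
    "sample_a": ["ACGTACGTAC", "CGTACGTACG"],
    "sample_b": ["TTTTGGGGTT", "GGGGTTTTGG"],
}
ranked = retrieve(["ACGTACGTAC"], database, k=4)
for dissimilarity, name in ranked:
    print(f"{name}: {dissimilarity:.3f}")
```

In this sketch the query shares its k-mers with `sample_a` and none with `sample_b`, so `sample_a` is ranked first. Real metagenomic samples contain millions of reads, which is why an efficient, distributed extraction step such as the one described in the abstract matters.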
About the Presenter
Sohan is a postdoctoral researcher at Aalto University and an active member of the HIIT-wide focus area in Augmented Science.
22.10.2013 - 11:37 Brandon Malone