Hybrid SHREC

Hybrid SHREC is an error correction algorithm for reads from various DNA sequencing platforms. We provide two versions of the algorithm. The first performs error correction in base space and accepts only ordinary base space reads. The second performs error correction in the color space of the SOLiD sequencing technology by Applied Biosystems and can correct a mixed set of base space and color space reads.

The code builds on an earlier version of SHREC intended for Solexa/Illumina reads.

Sources and example files for the base space version of hybrid SHREC:

Sources and example files for the color space version of hybrid SHREC:

The read sets above are simulated data, generated from the genome of Escherichia coli K-12 substrain MG1655 (NC_000913). The genome is about 4,700,000 bp long. The error rate of the reads is about 3%. The coverage is 12x for the base space reads and 30x for the color space reads.
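Coverage relates the size of a read set to the genome length: coverage = (number of reads x read length) / genome length. The Python sketch below shows this arithmetic. The read length is not stated here, so the value used in the example call is an assumption chosen only for illustration, and the function name reads_for_coverage is hypothetical.

```python
import math


def reads_for_coverage(coverage: float, genome_length: int, read_length: int) -> int:
    """Number of reads of a given length needed to reach the target coverage."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(coverage * genome_length / read_length)


GENOME_LENGTH = 4_700_000    # approximate genome length given above
ASSUMED_READ_LENGTH = 35     # assumption: the actual read length is not stated

base_space_reads = reads_for_coverage(12, GENOME_LENGTH, ASSUMED_READ_LENGTH)
color_space_reads = reads_for_coverage(30, GENOME_LENGTH, ASSUMED_READ_LENGTH)
```

With a different read length the counts scale inversely, so the numbers from this sketch should not be read as the actual sizes of the example files.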