The synthetic DNA data in:

Jouni Sirén, Niko Välimäki, Veli Mäkinen, and Gonzalo Navarro: Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections. In 15th Symposium on String Processing and Information Retrieval (SPIRE 2008), Springer-Verlag LNCS 5280, pp. 164-175, Melbourne, Australia, November 10-12, 2008.

The compiler used in the experiments was g++ (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33) with a 32-bit Intel target. The random number generator should behave the same for any GCC 4.* on any 32-bit platform.

File dna.50MB.gz contains the 50 MB prefix of the DNA collection from the Pizza & Chili corpus. Decompress it and use mutator to generate the data sets. For example,

  mutator dna.50MB output 4 25 0.003

writes the 25 x 4 MB data set at mutation rate 0.003 to file output.

Mutation rates used in the experiments were the following:

  0.000 0.001 0.003 0.005 0.010 0.015 0.020 0.030 0.040 0.050
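
Generating all ten data sets means repeating the example command once per mutation rate. The following Python sketch only builds the command lines; it does not run mutator. It assumes the argument order is input file, output file, size in MB, count, mutation rate, which is inferred from the single example in this file and not confirmed elsewhere. The output file naming scheme and the function name mutator_commands are hypothetical.

```python
import shlex

# Mutation rates from the experiments, kept as strings to preserve formatting.
MUTATION_RATES = ["0.000", "0.001", "0.003", "0.005", "0.010",
                  "0.015", "0.020", "0.030", "0.040", "0.050"]


def mutator_commands(source="dna.50MB", size_mb=4, count=25,
                     rates=MUTATION_RATES):
    """Build one mutator command line per mutation rate.

    Assumption: arguments are <input> <output> <size in MB> <count> <rate>,
    as inferred from the example "mutator dna.50MB output 4 25 0.003".
    """
    commands = []
    for rate in rates:
        # Hypothetical naming scheme for the output files.
        output = f"dna.{count}x{size_mb}MB.{rate}"
        argv = ["mutator", source, output, str(size_mb), str(count), rate]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for line in mutator_commands():
        print(line)
```

The printed lines can be reviewed and then pasted into a shell. Reproducing the published data sets exactly would still depend on the compiler and platform constraints described in this file.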