InvCoal
InvCoal is a simulator for SNP data sets with inversion polymorphisms. It is implemented in Java and is free to use. If you use it in a scientific publication, the appropriate reference is the PhD thesis of Jussi Kollin, Computational Methods for Detecting Large-Scale Chromosome Rearrangements in SNP Data.
Download
Latest update to software: Sep 20, 2010.
The current build (20.09.2010) still has some bugs and missing features. One definite bug is that the inversion length has to be shorter than the simulated segment length.
Some reasons why you might not want to use this:
- Limited population history options. The inversion-type population currently always undergoes exponential growth, while the ancestral-type population always has a constant size. Selection is not simulated either.
- Memory hog; using the Java switch -Xmx1G (or comparable) is advised.
- Inaccurate modelling of the tetrad for placing the chiasmata. I have heard the model I used is incorrect (at least for Drosophila), but it might be close enough for the inaccuracy not to matter.
- Many real inversions in HapMap data sets do not really look like the simulation output, except for the 900 kb inversion on chromosome 17 in the HapMap CEU data set. This is possibly due to the simulator's assumption that the inversion is a unique event.
- Might still have bugs (it has not undergone thoroughly rigorous testing).
- Possibly a different definition of recombination than is usual.
Current version (Sep 20, 2010): InvCoal.java. The class is in the unnamed package; compile it with "javac InvCoal.java".
You will need Java 5 or 6 to compile and run this.
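As a rough illustration of how a run might be launched after compilation, the sketch below assembles a command line from the -Xmx1G advice above and the parameters listed in the Documentation section. The class name InvCoalCommand, its helper methods, and every numeric value are hypothetical and chosen only for illustration. This page does not say which parameters are optional, so the sketch simply passes all of them. It only prints the command and does not start the simulator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of InvCoal: builds a command line for one run.
class InvCoalCommand {

    /** Returns "java -Xmx<maxHeap> InvCoal <simulatorArgs...>" as a list of tokens. */
    static List<String> build(String maxHeap, String... simulatorArgs) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-Xmx" + maxHeap); // heap limit, as advised in the caveats
        cmd.add("InvCoal");        // class lives in the unnamed package
        cmd.addAll(Arrays.asList(simulatorArgs));
        return cmd;
    }

    /** Joins the tokens with single spaces for display. */
    static String join(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // All values below are illustrative assumptions, not recommended settings.
        List<String> cmd = build("1G",
                "-Na", "10000",
                "-Ni", "2000",
                "-age", "5000",
                "-sampleanc", "20",
                "-sampleinv", "20",
                "-mu", "1e-8",
                "-r", "1e-8",
                "-len", "1000000",
                "-seed", "42",
                "-inverted", "0.3", "0.6", // inversion shorter than the segment
                "-counting", "1",
                "-geneconv", "500", "1e-8");
        System.out.println(join(cmd));
    }
}
```

The printed line can be pasted into a shell in the directory that contains the compiled InvCoal.class.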
Documentation
Command-line parameters:
- -Na [double]: Ancestral-type diploid population size (constant through time)
- -Ni [double]: Inversion-type diploid population size at present
- -age [double]: Inversion age in generations. Under the exponential growth model, the inversion-type population has size 1 at this point in the past.
- -sampleanc [int]: Number of ancestral-type samples produced
- -sampleinv [int]: Number of inversion-type samples produced
- -mu [double]: Mutation rate per basepair
- -r [double]: Recombination rate per basepair
- -len [int]: Simulated sequence length in basepairs
- -seed [long]: Random generator seed
- -inverted [double] [double]: Inversion region span (both values must be in [0,1))
- -counting [int]: Interference parameter m of the counting model (m = 1 gives the no-interference, i.e. Poisson, model)
- -geneconv [double] [double]: Mean gene conversion tract length (bp), followed by the gene conversion initiation rate per bp
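The descriptions of -Ni and -age above imply a growth rate for the inversion-type population. The sketch below works this out under the assumption of continuous-time exponential growth, N(t) = Ni * exp(-alpha * t) with t measured in generations before the present, so that N(age) = 1 gives alpha = ln(Ni) / age. The class and method names are hypothetical, and InvCoal itself may parameterize or discretize growth differently.

```java
// Hypothetical sketch, not taken from InvCoal: growth rate implied by -Ni and -age.
class InversionGrowthSketch {

    /** Rate alpha per generation such that N(0) = ni and N(age) = 1. */
    static double growthRate(double ni, double age) {
        return Math.log(ni) / age;
    }

    /** Inversion-type population size t generations before the present. */
    static double sizeAt(double ni, double age, double t) {
        return ni * Math.exp(-growthRate(ni, age) * t);
    }

    public static void main(String[] args) {
        // Illustrative values only.
        double ni = 2000.0;
        double age = 5000.0;
        System.out.println("growth rate per generation: " + growthRate(ni, age));
        System.out.println("size at present: " + sizeAt(ni, age, 0.0));
        System.out.println("size at inversion age: " + sizeAt(ni, age, age));
    }
}
```

Under this assumption, a larger -Ni or a smaller -age means faster growth, and the size at the inversion age comes out as 1 up to floating-point rounding.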
Contact
Software author: Jussi Kollin. Page last updated on Sep 20, 2010.