University of Helsinki Department of Computer Science
 


Genome-wide database

The databases below, which were used for the Cell article, refer to outdated genome assemblies. Here we provide up-to-date genome-wide computations for more current genome assemblies. Exons are masked in all of these runs. The human NCBI34 sequence assembly is available at ftp://ftp.ensembl.org/pub/release-22/human-22.34d/data/fasta/dna/
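As an illustration of what exon masking means, here is a minimal sketch in Python. It is not the procedure used for these runs: the function name `mask_exons`, the mask character `N`, and the 0-based half-open exon intervals are all assumptions made for the example.

```python
def mask_exons(sequence, exons, mask_char="N"):
    """Return a copy of `sequence` with every exon interval replaced by `mask_char`.

    `exons` holds (start, end) pairs. The coordinates are assumed to be
    0-based and half-open; the real data may use a different convention.
    """
    masked = list(sequence)
    for start, end in exons:
        # Clamp to the sequence bounds so out-of-range annotations are harmless.
        start = max(start, 0)
        end = min(end, len(masked))
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)


# Toy sequence with two hypothetical exons.
print(mask_exons("ACGTACGTACGT", [(2, 5), (8, 10)]))  # ACNNNCGTNNGT
```

Masked positions can then be skipped by any downstream scan, so that only non-exonic sequence contributes to the genome-wide results.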

Data used in Hallikas et al. 2006

Here are the databases that were used in the article to store the genome-wide alignments. The data is provided as a dump from a MySQL database. The database schema is provided as a UML diagram. The gene_description table is also provided (data obtained from Ensembl).

Notes

Please note that the coordinates in the databases are given with respect to specific genome assemblies. In particular, the Human and Mouse coordinates are based on OUTDATED assemblies. For this reason, these database dumps are provided only to make available all data used in the Hallikas et al. article. For all practical purposes, use the more current data from our database server.

Transcription factor binding matrices

The 107 TF binding matrices used are provided as a .tar.gz or .zip package. Most of the motifs are copied from the JASPAR database.
Kimmo Palin
Last modified: Mon Jun 5 16:59:04 EEST 2006