MOODS: Motif Occurrence Detection Suite

MOODS is a suite of algorithms for matching position weight matrices (PWMs) against DNA sequences. It features advanced matrix matching algorithms implemented in C++ that can be used to scan hundreds of matrices against chromosome-sized sequences in a few seconds.

MOODS has been designed to be used as a library wherever PWM matching is needed. It can be used as a standalone analysis tool or as a component in larger programs. It contains interfaces for the BioPerl and Biopython toolkits, so it can be easily called from C++, Python and Perl programs.

MOODS is dual-licensed under the GPL version 3 and the Biopython license.

Contact

The algorithms have been implemented by Pasi Rastas and Janne Korhonen. The Perl and Python interfaces have been written by Petri Martinmäki. The project is currently maintained by Janne Korhonen.

News

Downloads

The MOODS package contains PWM search algorithm implementations in C++ and interfaces for the Perl and Python languages.

References

When using MOODS in your work, please cite

J. Korhonen, P. Martinmäki, C. Pizzi, P. Rastas and E. Ukkonen. MOODS: fast search for position weight matrix matches in DNA sequences. Bioinformatics 25(23), pages 3181-3182. (2009)

A detailed derivation and empirical comparison of the algorithms of MOODS is presented in the paper

C. Pizzi, P. Rastas and E. Ukkonen. Finding Significant Matches of Position Weight Matrices in Linear Time. IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(1), pages 69-79. (2011) (manuscript)

Using MOODS

Introduction to MOODS

Position weight matrices (PWMs), also known as position-specific scoring matrices or weighted patterns, are a simple yet powerful model for sequence signals used in bioinformatics. They can be used, for example, to model transcription factor binding sites in DNA or other binding sites.

PWMs are usually obtained from empirically observed instances of binding sites by counting occurrences of symbols in different positions. For example, a PWM could look like this (example from the JASPAR database):

A  [ 0  3 79 40 66 48 65 11 65  0 ]
C  [94 75  4  3  1  2  5  2  3  3 ]
G  [ 1  0  3  4  1  0  5  3 28 88 ]
T  [ 2 19 11 50 29 47 22 81  1  6 ]

These matrices are typically converted into so-called log-odds matrices, which define the score of a DNA sequence of the same length as the matrix: the score is the sum of the matrix entries selected by the symbol at each position. Sequences that score well against a matrix are likely to be instances of the signal modelled by the matrix, so finding such subsequences in longer DNA sequences is useful in various analysis and prediction tasks.
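The conversion and scoring described above can be sketched in a few lines of Python. This is only an illustration and does not use the MOODS interfaces: the uniform background distribution, the pseudocount of 1 and the natural logarithm are assumptions made for this sketch, and the function names log_odds and score are hypothetical.

```python
import math

# Count matrix from the JASPAR example above (one list entry per position).
COUNTS = {
    "A": [0, 3, 79, 40, 66, 48, 65, 11, 65, 0],
    "C": [94, 75, 4, 3, 1, 2, 5, 2, 3, 3],
    "G": [1, 0, 3, 4, 1, 0, 5, 3, 28, 88],
    "T": [2, 19, 11, 50, 29, 47, 22, 81, 1, 6],
}


def log_odds(counts, background=None, pseudocount=1.0):
    """Convert a count matrix into a log-odds matrix (natural logarithm).

    Assumptions of this sketch: a uniform background unless one is given,
    and a pseudocount that is split between symbols by the background.
    """
    if background is None:
        background = {symbol: 0.25 for symbol in counts}
    length = len(next(iter(counts.values())))
    result = {symbol: [] for symbol in counts}
    for i in range(length):
        total = sum(counts[symbol][i] for symbol in counts) + pseudocount
        for symbol in counts:
            freq = (counts[symbol][i] + pseudocount * background[symbol]) / total
            result[symbol].append(math.log(freq / background[symbol]))
    return result


def score(matrix, site):
    """Sum the matrix entries selected by the symbol at each position."""
    return sum(matrix[symbol][i] for i, symbol in enumerate(site))


LOG_ODDS = log_odds(COUNTS)

# The most frequent symbol of each column gives the best-scoring sequence.
consensus = "".join(max(COUNTS, key=lambda s: COUNTS[s][i]) for i in range(10))
print(consensus, score(LOG_ODDS, consensus))
```

The pseudocount keeps every frequency positive, so the logarithm is defined even for symbols that were never observed at a position.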

MOODS is a software package for this matrix matching problem. It uses variants of classical string matching algorithms to rapidly scan the target sequence for subsequences that score more than a given threshold.
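To make the problem statement concrete, here is a naive sliding-window scan in Python. It is not one of the algorithms used by MOODS, which are considerably faster; it only shows what a correct result looks like. The toy matrix contains made-up integer scores, the name naive_scan is hypothetical, and counting a score equal to the threshold as a hit is a choice made for this sketch.

```python
def naive_scan(matrix, sequence, threshold):
    """Return (position, score) for every window scoring at least threshold."""
    length = len(next(iter(matrix.values())))
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        # Skip windows with symbols the matrix does not cover (for example N).
        if any(symbol not in matrix for symbol in window):
            continue
        total = sum(matrix[symbol][i] for i, symbol in enumerate(window))
        if total >= threshold:
            hits.append((start, total))
    return hits


# Toy scoring matrix of length 3 (made-up scores, not from any database).
TOY = {
    "A": [2, -1, -1],
    "C": [-1, 2, -1],
    "G": [-1, -1, 2],
    "T": [-1, -1, -1],
}

hits = naive_scan(TOY, "TTACGTNACGA", 6)
print(hits)  # [(2, 6), (7, 6)]
```

This scan does work proportional to the sequence length times the matrix length for every matrix, which is the cost that faster matching algorithms aim to reduce.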

A more detailed explanation of the matrix matching problem and the scoring model used by MOODS can be found in the paper mentioned above. The MOODS interfaces automatically convert the given PWMs into log-odds scoring matrices if p-values are used. If an absolute threshold is specified, the current interfaces treat the input matrices as scoring matrices and no log-odds conversion is done.
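The relationship between a p-value and an absolute score threshold can be sketched as follows. This is not necessarily how MOODS computes thresholds. The sketch assumes integer scores (real-valued log-odds scores would first have to be rounded), positions drawn independently from a uniform background, and the hypothetical function names score_distribution and threshold_from_p_value; the toy matrix is made up.

```python
from collections import defaultdict


def score_distribution(matrix, background):
    """Exact score distribution of a random window under the background model."""
    length = len(next(iter(matrix.values())))
    dist = {0: 1.0}
    for i in range(length):
        nxt = defaultdict(float)
        # Extend every partial score by each possible symbol at position i.
        for total, prob in dist.items():
            for symbol, column in matrix.items():
                nxt[total + column[i]] += prob * background[symbol]
        dist = dict(nxt)
    return dist


def threshold_from_p_value(matrix, background, p_value):
    """Smallest score t with P(score >= t) <= p_value, or None if there is none."""
    dist = score_distribution(matrix, background)
    tail = 0.0
    best = None
    for total in sorted(dist, reverse=True):
        tail += dist[total]
        if tail > p_value:
            break
        best = total
    return best


# Toy scoring matrix of length 3 (made-up integer scores) and uniform background.
TOY = {
    "A": [2, -1, -1],
    "C": [-1, 2, -1],
    "G": [-1, -1, 2],
    "T": [-1, -1, -1],
}
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

print(threshold_from_p_value(TOY, UNIFORM, 0.02))  # 6
```

In this toy case only one of the 64 possible windows reaches a score of 6, so a p-value of 0.02 maps to the threshold 6, while a p-value below 1/64 cannot be met by any threshold.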

Usage

Basic installation

Download MOODS.tar.gz from the Downloads section above and extract it.

$ tar -xvvzf MOODS.tar.gz

In the MOODS directory, the C++ algorithm library is in the src directory, the Perl and Python interfaces are in their respective directories, and the examples directory contains example scripts.

You need to compile the C++ library before installing the Perl and Python interfaces, as follows.

$ cd src
$ make

You can link your own C++ programs with this library; all the necessary declarations are in the header file pssm_algorithms.hpp. There is also a command line tool for basic tasks (see src/find_pssm_dna_readme.txt).

Perl extension

Installation

The MOODS Perl interfaces depend on BioPerl, so you need to have it installed. On many Linux distributions, it is directly available from the package management system.

To compile the Perl interfaces, first make sure that you have compiled the C++ library as instructed above. Then proceed as follows.

$ cd perl
$ perl Makefile.PL
$ make

If you want to use a non-standard installation path, use the command

$ perl Makefile.PL PREFIX=/path/

For more details, see the Perl documentation.

Installing may need administrator privileges:

$ make install

Examples

The following examples demonstrate the basic usage of the MOODS Perl interfaces. The same examples are also included in the MOODS package.

Documentation

Python extension

Installation

The setup.py script is used to install the MOODS Python interface. It requires the Python.h C headers, so you may have to install the Python development packages before compiling the interface.

The Python interface requires that you have compiled the C++ library as instructed above.

$ cd python
$ python setup.py install

Examples

The following examples demonstrate the use of the Python interface. They are also included in the MOODS package.

Documentation