MOODS: Motif Occurrence Detection Suite
MOODS is a suite of algorithms for matching position weight matrices (PWMs) against DNA sequences. It features advanced matrix matching algorithms implemented in C++ that can be used to scan hundreds of matrices against chromosome-sized sequences in a few seconds.
MOODS has been designed to be used as a library wherever PWM matching is needed. It can be used as a standalone analysis tool or as a component in larger programs. It contains interfaces for the BioPerl and Biopython toolkits, so MOODS can easily be called from C++, Python and Perl programs.
MOODS is dual-licensed under the GPL version 3 and the Biopython license.
Contact
The algorithms have been implemented by Pasi Rastas and Janne Korhonen. The Perl and Python interfaces have been written by Petri Martinmäki. The project is currently maintained by Janne Korhonen.
News
- 15.12.2009: MOODS 1.0.1 released. This version fixes some minor issues reported by users. MOODS is now also dual-licensed under the GPL 3 and the Biopython license.
Downloads
The MOODS package contains PWM search algorithm implementations in C++ and interfaces for the Perl and Python languages.
- MOODS version 1.0.1, full package
References
When using MOODS in your work, please cite:
J. Korhonen, P. Martinmäki, C. Pizzi, P. Rastas and E. Ukkonen. MOODS: fast search for position weight matrix matches in DNA sequences. Bioinformatics 25(23), pages 3181-3182. (2009)
A detailed derivation and empirical comparison of the algorithms in MOODS is presented in the paper:
C. Pizzi, P. Rastas and E. Ukkonen: Fast search algorithms for position specific scoring matrices. IEEE/ACM Transactions on Computational Biology and Bioinformatics. In press. (manuscript)
Using MOODS
Introduction to MOODS
Position weight matrices (PWMs), also known as position-specific scoring matrices or weighted patterns, are a simple yet powerful model for sequence signals used in bioinformatics. They can be used, for example, to model transcription factor binding sites and other binding sites in DNA.
PWMs are usually obtained from empirically observed instances of binding sites by counting the occurrences of each symbol at each position. For example, a PWM could look like this (example from the JASPAR database):
These matrices are typically converted into so-called log-odds matrices. Such a matrix defines a score for any DNA segment of the same length: the score is the sum of the matrix entries selected by the segment's symbol at each position. Segments that score well against a matrix are likely to be instances of the signal modelled by the matrix, so finding such subsequences in longer DNA sequences is useful in various analysis and prediction tasks.
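The conversion and scoring described above can be sketched in a few lines of Python. This is only an illustration, not the code used by MOODS: the function names, the pseudocount of 1, the uniform background, the natural logarithm and the toy counts are all assumptions made for the example.

```python
import math

ALPHABET = "ACGT"  # row order assumed for all matrices in this sketch


def counts_to_log_odds(counts, background=None, pseudocount=1.0):
    """Convert a 4 x m count matrix (rows A, C, G, T) into log-odds scores."""
    if background is None:
        background = [0.25] * 4  # assumed uniform background
    width = len(counts[0])
    log_odds = [[0.0] * width for _ in range(4)]
    for j in range(width):
        column_total = sum(counts[a][j] for a in range(4)) + pseudocount
        for a in range(4):
            # The pseudocount is spread over symbols according to the background,
            # which keeps every frequency strictly positive.
            freq = (counts[a][j] + pseudocount * background[a]) / column_total
            log_odds[a][j] = math.log(freq / background[a])
    return log_odds


def score(log_odds, segment):
    """Sum the matrix entries selected by each symbol of the segment."""
    return sum(log_odds[ALPHABET.index(c)][j] for j, c in enumerate(segment))


# Made-up counts for a motif of width 3 (not taken from JASPAR).
toy_counts = [
    [8, 0, 1],  # A
    [0, 0, 7],  # C
    [0, 8, 0],  # G
    [0, 0, 0],  # T
]
toy_matrix = counts_to_log_odds(toy_counts)
```

With these toy counts, score(toy_matrix, "AGC") is positive, since the segment follows the most frequent symbol in every column, while score(toy_matrix, "TTT") is negative.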
MOODS is a software package for this matrix matching problem. It uses variants of classical string matching algorithms to rapidly scan the target sequence for subsequences that score above a given threshold.
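To make the problem concrete, here is a naive sliding-window scan in Python. It is a baseline sketch only, not one of the accelerated algorithms implemented in MOODS; the function name, the toy scoring matrix and the choice to treat the threshold as inclusive are assumptions made for the example.

```python
ALPHABET = "ACGT"  # row order assumed for the scoring matrix


def naive_scan(sequence, matrix, threshold):
    """Return (position, score) for every window scoring at least threshold."""
    width = len(matrix[0])
    hits = []
    for start in range(len(sequence) - width + 1):
        total = 0.0
        for offset in range(width):
            row = ALPHABET.find(sequence[start + offset])
            if row < 0:
                break  # skip windows containing other symbols, such as N
            total += matrix[row][offset]
        else:
            # Reached only when the inner loop did not break.
            if total >= threshold:  # inclusive threshold: a choice of this sketch
                hits.append((start, total))
    return hits


# Hypothetical log-odds style scores for a motif of width 3.
toy_matrix = [
    [1.0, -2.0, -1.0],   # A
    [-2.0, -2.0, 1.0],   # C
    [-2.0, 1.0, -2.0],   # G
    [-2.0, -2.0, -2.0],  # T
]
hits = naive_scan("TTAGCAAGCT", toy_matrix, 2.5)
```

Here hits is [(2, 3.0), (6, 3.0)], the two occurrences of AGC. The naive scan takes time proportional to the sequence length times the matrix width for each matrix, which is the cost that faster algorithms aim to reduce.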
A more detailed explanation of the matrix matching problem and the scoring model used by MOODS can be found in the paper mentioned above. The MOODS interfaces automatically convert the given PWMs into log-odds scoring matrices if p-values are used. If an absolute threshold is specified, the current interfaces treat the input matrices as scoring matrices and no log-odds conversion is performed.
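One common way to turn a p-value into a score threshold is to compute the score distribution of a random background segment by dynamic programming and then pick the lowest score whose tail probability stays within the p-value. The sketch below illustrates that idea; it is not claimed to be how MOODS computes thresholds, and the function name, the rounding precision, the uniform background and the toy matrix are assumptions.

```python
from collections import defaultdict


def threshold_from_p_value(matrix, p_value, background=None, precision=1000):
    """Lowest score t such that P(score >= t) <= p_value under the background.

    Scores are rounded to multiples of 1/precision so that the score
    distribution can be stored in a dictionary keyed by integers.
    """
    if background is None:
        background = [0.25] * 4  # assumed uniform background
    distribution = {0: 1.0}  # rounded partial score -> probability
    for j in range(len(matrix[0])):
        updated = defaultdict(float)
        for partial, prob in distribution.items():
            for a in range(4):
                step = int(round(matrix[a][j] * precision))
                updated[partial + step] += prob * background[a]
        distribution = updated
    # Walk from the best score downwards, accumulating tail probability.
    tail = 0.0
    threshold = None
    for rounded_score in sorted(distribution, reverse=True):
        tail += distribution[rounded_score]
        if tail > p_value:
            break
        threshold = rounded_score
    if threshold is None:
        return float("inf")  # even the best score is more probable than p_value
    return threshold / precision


# Hypothetical log-odds style scores for a motif of width 3 (rows A, C, G, T).
toy_matrix = [
    [1.0, -2.0, -1.0],
    [-2.0, -2.0, 1.0],
    [-2.0, 1.0, -2.0],
    [-2.0, -2.0, -2.0],
]
strict = threshold_from_p_value(toy_matrix, 0.02)
loose = threshold_from_p_value(toy_matrix, 0.05)
```

For this toy matrix only AGC reaches the top score of 3.0, with probability 1/64, so a p-value of 0.02 gives strict == 3.0. A p-value of 0.05 also admits the single segment scoring 1.0, so loose == 1.0.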
Usage
Basic installation
Download MOODS.tar.gz from the Downloads section above and extract it.
The MOODS directory contains the C++ algorithm implementation library in the src directory, the Perl and Python interfaces in their respective directories, and example scripts in the examples directory.
You need to compile the C++ library before installing the Perl and Python interfaces. This can be done by simply running make in the src directory. You can link your own C++ programs with this library; all the necessary declarations are in the header file pssm_algorithms.hpp. There is also a command line tool for basic tasks (see src/find_pssm_dna_readme.txt).
Perl extension
Installation
The MOODS Perl interfaces depend on BioPerl, so you need to have it installed. On many Linux distributions, it is directly available from the package management system.
You can use make to compile the Perl interfaces:
If you want to use a non-standard installation path, use the command
For more details, see the Perl documentation.
Installing may need administrator privileges:
Examples
The following examples demonstrate the basic usage of the MOODS Perl interfaces. The same examples are also included in the MOODS package.
- Basic search: A simple example of basic use of MOODS.
- Multiple matrices: Shows how to search with multiple matrices at the same time.
- Loading data from files: Shows how to load the sequence and matrix data from files.
- Window search: A more complicated example. Given a DNA sequence and a set of matrices, finds a window of a given width that contains a large enough number of good enough binding sites.
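The idea behind the window search can be sketched in Python with a two-pointer pass over hit positions. This is an illustration only and not the example script shipped with the package: the function name is hypothetical, hits are represented by their start positions only (motif widths are ignored), and the positions are made up.

```python
def find_dense_windows(hit_positions, window_width, min_hits):
    """Return (window_start, hit_count) for windows with at least min_hits hits.

    A window covers positions window_start .. window_start + window_width - 1.
    Overlapping qualifying windows are all reported.
    """
    positions = sorted(hit_positions)
    windows = []
    left = 0
    for right, pos in enumerate(positions):
        # Advance the left end until positions[left..right] fit in one window.
        while pos - positions[left] >= window_width:
            left += 1
        count = right - left + 1
        if count >= min_hits:
            windows.append((positions[left], count))
    return windows


# Made-up start positions of hits pooled over a set of matrices.
example_hits = [5, 12, 40, 43, 47, 90]
dense = find_dense_windows(example_hits, window_width=10, min_hits=3)
```

With these positions, dense is [(40, 3)]: only the hits at 40, 43 and 47 fit into a single window of width 10. Pooling the hits of all matrices before the pass is one possible design; counting hits per matrix would need a separate counter for each matrix.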
Documentation
- MOODS: The main search function.
- MOODS::Tools: Tools for handling search data.
Python extension
Installation
The setup.py script is used to install the MOODS Python interface. It requires the Python.h C headers, so you may have to install the Python development packages before compiling the interface.
Examples
The following examples demonstrate the usage of the Python interface. They are also included in the MOODS package.
- Base: A simple example of basic use of MOODS.
- Multiple matrix: Shows how to search with multiple matrices at the same time.
Documentation
- MOODS: This module contains an interface to the PWM search algorithms.
