NAME

MOODS - Perl extension for finding significant matches of position weight matrices.

Subroutines

search

  Title   : search
  Usage   : @results = MOODS::search(-seq => Bio::Seq(..), -matrix => [[1,0],[0,1]], -threshold => 0.1)
  Function: Finds position weight matrix matches in a DNA sequence.
  Returns : An array of references to result arrays. There is one result array
            for each matrix: (matrix1_results, matrix2_results, ..)
            Each result array is a list of positions and scores like: 
            (pos1, score1, pos2, score2 ...) 
  Args    : 
           Obligatory
            -seq  BioPerl sequence object
            -matrix or -matrices
                 A matrix or a list of matrices. One matrix is represented 
                 as a typical perl multidimensional array: a reference to array of 
                 references to arrays of numbers, corresponding to the frequencies
                 or scores of the nucleotides A, C, G and T, respectively
            -threshold or -thresholds
                 A number or a list of numbers, used as threshold values for
                 matrix scanning.  If a single number is given, it is used
                 for all matrices; otherwise, there should be as many
                 threshold values as there are matrices.

           Optional
            -bg  Background distribution - an array of four doubles. If neither
                 -bg nor -flatbg is given, the background is estimated from
                 the sequence.
            -flatbg
                 If 1, the background distribution is set to a distribution
                 giving equal probability to all characters. Not compatible
                 with -bg. If neither -bg nor -flatbg is given, the background
                 is estimated from the sequence.
            -count_log_odds
                 If 1, assumes that the input matrices are frequency or
                 count matrices, and converts them to log-odds scoring
                 matrices; otherwise, treat them as scoring matrices.
                 Default 1.
            -threshold_from_p
                 If 1, assumes that thresholds are p-values and computes
                 the corresponding absolute threshold based on the matrix;
                 otherwise the threshold is used as a hard cut-off.
                 Default 1.
            -log_base
                 Base for logarithms used in log-odds computations. Relevant
                 if using -count_log_odds => 1 and -threshold_from_p => 0.
                 Defaults to natural logarithm if parameter is not given.
            -pseudocount
                 Pseudocount used in log-odds conversion and added to
                 sequence symbol counts when estimating the background
                 from sequence. Default 1.

           Tuning parameters:
            (Optional, do not affect the results, but can give minor
             speed-ups in some cases. You can pretty much ignore these.)
            -algorithm  Selects the algorithm to use for scanning
                 "naive" naive algorithm
                 "pla" permutated lookahead algorithm
                 "supera" super alphabet algorithm. 
                   - Good for long matrices (> 20)
                 "lf" lookahead filtration algorithm. 
                   - Default algorithm in most cases.
                   - Sequence can be searched with multiple matrices 
                     simultaneously. 
                   - Use this when you have a large number of matrices.
            -q   An integer, used for fine-tuning "supera" and "lf" algorithms.
                 The default value 7 should be ok pretty much always, but can 
                 be tuned to possibly slightly increase performance. 
            -combine
                 Determines whether the "lf" algorithm combines all
                 matrices into a single scanning pass.
            -buffer_size
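
The log-odds conversion controlled by -count_log_odds, -bg, -pseudocount and
-log_base can be illustrated with a small pure-Perl sketch. This is not the
MOODS implementation: estimate_background and counts_to_log_odds are
hypothetical helpers, and the formula used (the pseudocount spread over the
symbols in proportion to the background) is an assumption made for
illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MOODS: estimate background frequencies
# of A, C, G and T from a sequence string, adding a pseudocount per symbol.
sub estimate_background {
    my ($sequence, $pseudocount) = @_;
    my %count = map { $_ => $pseudocount } qw(a c g t);
    for my $char (split //, lc $sequence) {
        $count{$char}++ if exists $count{$char};
    }
    my $total = 0;
    $total += $_ for values %count;
    return [ map { $count{$_} / $total } qw(a c g t) ];
}

# Hypothetical helper, not part of MOODS: convert a count matrix
# (rows A, C, G, T; one column per motif position) into log-odds scores.
# ASSUMPTION: score = log((count + pseudocount * bg) / (column_sum + pseudocount) / bg)
sub counts_to_log_odds {
    my ($matrix, $bg, $pseudocount, $log_base) = @_;
    my $width = scalar @{ $matrix->[0] };
    my @scores;
    for my $col (0 .. $width - 1) {
        my $column_sum = 0;
        $column_sum += $matrix->[$_][$col] for 0 .. 3;
        for my $row (0 .. 3) {
            my $freq = ($matrix->[$row][$col] + $pseudocount * $bg->[$row])
                     / ($column_sum + $pseudocount);
            my $score = log($freq / $bg->[$row]);
            # Natural logarithm unless a base is given.
            $score /= log($log_base) if defined $log_base;
            $scores[$row][$col] = $score;
        }
    }
    return \@scores;
}

my $bg     = [0.25, 0.25, 0.25, 0.25];    # flat background
my $matrix = [ [10,0,0], [0,10,0], [0,0,10], [10,10,10] ];
my $scores = counts_to_log_odds($matrix, $bg, 1, 2);
printf "A at column 0: %.3f\n", $scores->[0][0];
```

Symbols seen more often than the background predicts get positive scores, and
rarer ones get negative scores; the pseudocount keeps zero counts from
producing log(0).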

SYNOPSIS

  use Bio::Perl;
  use Bio::Seq;
  use MOODS;
  use MOODS::Tools qw(printResults);
  
  # We need a position weight matrix (rows: A, C, G, T; one column per position)
  my $matrix = [ [10,0,0],
                 [0,10,0],
                 [0,0,10],
                 [10,10,10]];
  
  # We also need a BioPerl sequence object
  my $seq = Bio::Seq->new(-seq              => 'actgtggggacgtcagtagcaggcatag',
                          -alphabet         => 'dna' );
                          
  my @results = MOODS::search(-seq => $seq, -matrix => $matrix, -threshold => 0.3);
  
  printResults($results[0]);
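
Each result array is a flat list of the form (pos1, score1, pos2, score2 ...),
so it can also be walked by hand. The following is a minimal sketch:
pair_up_hits is a hypothetical helper, not part of MOODS, and the numbers in
$example_result are made-up values standing in for $results[0].

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MOODS: turn a flat result array
# (pos1, score1, pos2, score2, ...) into a list of [position, score] pairs.
sub pair_up_hits {
    my ($result) = @_;
    my @hits;
    for (my $i = 0; $i + 1 < @$result; $i += 2) {
        push @hits, [ $result->[$i], $result->[$i + 1] ];
    }
    return @hits;
}

# Made-up values standing in for $results[0] from MOODS::search.
my $example_result = [ 3, 4.2, 17, 5.1 ];

for my $hit (pair_up_hits($example_result)) {
    my ($position, $score) = @$hit;
    printf "position %d, score %.2f\n", $position, $score;
}
```

When several matrices are searched at once, the same loop can be applied to
each element of @results in turn.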

SEE ALSO

BioPerl documentation.

AUTHOR

Petri J Martinmaki, Janne H Korhonen

COPYRIGHT AND LICENSE

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/