Usage:
ame [options] <sequence file> <motif file>+
Description:
AME (Analysis of Motif Enrichment) scores a set of DNA sequences given a
set of DNA-binding motifs, treating each position in the sequence as the
starting point of a possible binding event. AME supports a wide variety of
methods for scoring motif enrichment, and many methods of testing the
scored motif enrichment for significance. By default, AME counts the
number of cases where the p-value of the binding event for each motif is
below a given threshold, and performs a Fisher exact test versus the
number of binding events in a background sequence set to determine the
p-value of the count for each motif. The background set is appended to the
main sequence set in the input file, and the offset within the file where
the background starts is specified on the command line (see
--fix-partition).
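The default test described above can be illustrated with a small sketch. This is not AME's implementation: the function name fisher_exact_greater and the layout of the 2x2 table (items with and without a hit, in the foreground and in the background) are assumptions made only for illustration. It computes a one-sided Fisher exact p-value from the hypergeometric distribution.

```python
from math import comb


def fisher_exact_greater(fg_hits, fg_total, bg_hits, bg_total):
    """One-sided Fisher exact p-value (hypothetical helper, not AME code).

    Returns the probability of observing at least fg_hits hits in the
    foreground if hits were distributed at random between foreground
    and background (hypergeometric null).
    """
    total = fg_total + bg_total   # all items, foreground plus background
    hits = fg_hits + bg_hits      # all items that contain a hit
    denom = comb(total, fg_total)
    upper = min(hits, fg_total)   # largest possible foreground hit count
    tail = sum(
        comb(hits, k) * comb(total - hits, fg_total - k)
        for k in range(fg_hits, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # 8 of 10 foreground items have a hit, versus 2 of 10 background items.
    print(fisher_exact_greater(8, 10, 2, 10))
```

A small p-value here would suggest that hits are over-represented in the foreground relative to the background.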
Input:
- <sequence file> is a collection of sequences in FASTA format.
- <motif file> is a file containing a list of motifs, in MEME format. More than one file can be specified.
Output:
AME writes to a directory, ame_out, unless a different directory name is
specified on the command line. The output directory contains outputs in
two formats, HTML and plain text, in files named ame.html and ame.txt
respectively.
Options:
--o <dir name>
- Specifies the output directory. If the directory already exists, the contents will not be overwritten.
--oc <dir name>
- Specifies the output directory. If the directory already exists, the contents will be overwritten.
--method <fisher|mhg|4dmhg|ranksum|linreg|spearman>
- Select the association function for testing motif enrichment significance. Note that the linear regression and Spearman rank correlation tests do not calculate p-values. Please use RAMEN if you wish to use linear regression with p-values.
--scoring <avg|max|sum|totalhits>
- Method of scoring motif enrichment. Either average odds, the maximum (single-site) odds, the sum of odds, or counting individual hits that are more significant than a threshold. If using totalhits mode, see also --pvalue-threshold.
--bgformat 0..2
- Source for determining background frequencies:
  0: uniform background
  1: MEME motif file
  2: background file
--bgfile <bfile>
- Read background frequencies from <bfile>. The file should be in MEME background file format. The default is to use frequencies based on the motif file or files. See also --bgformat.
--length-correction
- Correct for length bias: subtract expected hits. Default is no length correction.
--pvalue-threshold <float, default=2e-4>
- Threshold to consider a single motif hit significant.
--fix-partition
- Number of positive sequences; the remaining sequences are used as the background.
--pvalue-report-threshold <float, default=1e-3>
- Corrected p-value threshold for reporting a motif.
--rsmethod <better|quick>
- Select whether to use a proper ranksum test (better) or a faster heuristic (quick). Default is to use the proper test.
--poslist <fl|pwm>
- For partition maximisation, test thresholds on either X (pwm) or Y (fluorescence score). Default is fluorescence score. Only applies to partition maximisation and to the ranksum test.
--log-fscores
- For linear regression and Spearman tests only: regress using ln(fluorescence score), rather than the score directly.
--log-pwmscores
- For linear regression and Spearman tests only: regress using ln(pwm score), rather than the score directly.
--normalise-affinity
- Normalise motif scores so that motif scores can be compared directly. Only relevant for the Spearman and linear regression tests, where p-values are not calculated.
--linreg-switchxy
- Make the x-points fluorescence scores and the y-points PWM scores. Only relevant for the Spearman and linear regression tests.
--fl-threshold <p-value>
- Only used if --poslist is in use. Maximum fluorescence p-value to consider as a 'positive' when labelling positives. Default is 1e-3. Use for the Fisher test with SUM, AVG or MAX scoring only.
--pwm-threshold <score>
- Minimum PWM score to call a sequence a 'positive'. Default is 1. Use for the Fisher test with SUM, AVG or MAX scoring only.
--verbose <1...5>
- Integer describing verbosity (a lower number is less verbose). Best placed first.
--help
- Print a usage statement.
The default output directory is ame_out, and is not overwritten (i.e., the
default is the same as --o ame_out).
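The four --scoring modes listed above can be sketched as follows. This is only an illustration under stated assumptions, not AME's code: the function score_sequence and its inputs (a list of per-site odds and a list of per-site p-values for one motif on one sequence) are hypothetical, and treating the threshold as a strict "less than" comparison is also an assumption. The default threshold of 2e-4 is the documented default of --pvalue-threshold.

```python
def score_sequence(site_odds, site_pvalues, scoring="totalhits",
                   pvalue_threshold=2e-4):
    """Combine per-site values into one score (hypothetical helper).

    site_odds:    odds for the motif starting at each position
    site_pvalues: p-value of the motif match at each position
    """
    if scoring == "avg":
        # Average odds over all candidate start positions.
        return sum(site_odds) / len(site_odds)
    if scoring == "max":
        # Odds of the single best site.
        return max(site_odds)
    if scoring == "sum":
        # Sum of odds over all sites.
        return sum(site_odds)
    if scoring == "totalhits":
        # Count sites more significant than the threshold
        # (strict comparison is an assumption of this sketch).
        return sum(1 for p in site_pvalues if p < pvalue_threshold)
    raise ValueError(f"unknown scoring mode: {scoring}")


if __name__ == "__main__":
    odds = [1.0, 4.0, 1.0]
    pvals = [0.5, 1e-5, 1e-3]
    for mode in ("avg", "max", "sum", "totalhits"):
        print(mode, score_sequence(odds, pvals, scoring=mode))
```

The first three modes produce a continuous score per sequence, while totalhits produces a count, which is why it depends on --pvalue-threshold.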
Notes
This version of AME has different default settings to the original AME.
By default, it will perform the Fisher exact test, and count individual
motif hits (total hits mode). It will perform partition maximisation by
default, but using a fixed partition between foreground and background
sequences is likely to be the preferred mode of operation. The
--fix-partition option is optional in this version, rather than required.
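A fixed partition can be pictured as a simple split of the input sequences. The sketch below is an illustration only, not AME's code: the helpers read_fasta and split_partition are hypothetical, and the FASTA parsing is deliberately minimal. It assumes, as described for --fix-partition, that the first n sequences in the file are the positives and the rest form the background.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples.

    Minimal parser for illustration; real FASTA files may need
    more careful handling.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])   # start a new record
        elif records:
            records[-1][1] += line            # append sequence data
    return [(name, seq) for name, seq in records]


def split_partition(records, n_positive):
    """Split records into (foreground, background) at a fixed offset."""
    if not 0 < n_positive < len(records):
        raise ValueError("n_positive must leave both sets non-empty")
    return records[:n_positive], records[n_positive:]


if __name__ == "__main__":
    text = ">a\nACGT\n>b\nGG\nCC\n>c\nTTTT\n"
    foreground, background = split_partition(read_fasta(text), 2)
    print("foreground:", [name for name, _ in foreground])
    print("background:", [name for name, _ in background])
```

With a fixed partition the foreground and background sets are determined entirely by sequence order in the input file, so the background sequences must be appended after the sequences of interest.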
Citing ame
If AME is of use to you in your research, please cite:
Robert C. McLeay, Timothy L. Bailey (2010).
"Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data."
BMC Bioinformatics 11:165, doi:10.1186/1471-2105-11-165.
Contact the authors
You can contact the authors via email:
Robert McLeay r.mcleay@imb.uq.edu.au, and Timothy Bailey t.bailey@imb.uq.edu.au.
Bug reports should be directed to Robert McLeay.