AME - a motif enrichment analysis tool

Usage:

ame [options] <sequence file> <motif file>+

Description:

AME (Analysis of Motif Enrichment) scores a set of DNA sequences given a set of DNA-binding motifs, treating each position in the sequence as the starting point of a possible binding event. AME supports a wide variety of methods for scoring motif enrichment, and many methods of testing the scored motif enrichment for significance. By default, AME counts the number of cases where the p-value of the binding event for each motif is below a given threshold, and performs a Fisher exact test versus the number of binding events in a background sequence set to determine the p-value of the count for each motif. The background set is appended to the main sequence set in the input file, and the offset within the file where the background starts is specified on the command line.
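
As an illustration of the default test described above, the following is a minimal sketch of a one-sided Fisher exact test on hit counts. It is not AME's own code: the function name and the way the counts are arranged (hits and totals for the foreground and background sets) are assumptions made for this example.

```python
from math import comb


def fisher_enrichment_pvalue(fg_hits, fg_total, bg_hits, bg_total):
    """One-sided Fisher exact test for enrichment in the foreground.

    Returns P(X >= fg_hits), where X follows the hypergeometric
    distribution obtained by drawing fg_total items from a pool of
    fg_total + bg_total items, of which fg_hits + bg_hits are hits.
    The layout of the counts is an assumption of this sketch.
    """
    total = fg_total + bg_total
    hits = fg_hits + bg_hits
    denom = comb(total, fg_total)
    upper = min(hits, fg_total)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(hits, k) * comb(total - hits, fg_total - k)
        for k in range(fg_hits, upper + 1)
    )
    return tail / denom
```

A small p-value indicates that the foreground set contains more hits than would be expected if hits were spread at random between foreground and background.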

Input:

<sequence file> is a collection of sequences in FASTA format.
<motif file> is a list of motifs in MEME format. More than one file can be specified.

Output:

AME writes to a directory, ame_out, unless a different directory name is specified on the command line. The output directory contains outputs in two formats: HTML and plain text, in files named respectively ame.html and ame.txt.

Options:

--o <dir name> - Specifies the output directory. If the directory already exists, the contents will not be overwritten.
--oc <dir name> - Specifies the output directory. If the directory already exists, the contents will be overwritten.
--method <fisher|mhg|4dmhg|ranksum|linreg|spearman> - Select the association function for testing motif enrichment significance. Note that the linear regression and Spearman rank correlation tests do not calculate p-values. Please use RAMEN if you wish to use linear regression with p-values.
--scoring <avg|max|sum|totalhits> - Method of scoring motif enrichment: the average odds, the maximum (single-site) odds, the sum of odds, or a count of individual hits that are more significant than a threshold. If using totalhits mode, see also --pvalue-threshold.
--bgformat 0..2 - Source for determining background frequencies
0: uniform background
1: MEME motif file
2: Background file
--bgfile <bfile> - Read background frequencies from <bfile>. The file should be in MEME background file format. The default is to use frequencies based on the motif file or files. See also --bgformat.
--length-correction - Correct for length bias: subtract expected hits. Default=no length correction.
--pvalue-threshold <float, default=2e-4> - Threshold to consider single motif hit significant.
--fix-partition - Number of positive sequences; the remaining sequences are used as the background.
--pvalue-report-threshold <float, default=1e-3> - Corrected p-value threshold for reporting a motif.
--rsmethod <better|quick> - Select whether to use a proper ranksum test (better) or a faster heuristic (quick). Default is to use the proper test.
--poslist <fl|pwm> - For partition maximisation, test thresholds on either X (pwm) or Y (fluorescence score). Default is fluorescence score. Only applies to partition maximisation and to the ranksum test.
--log-fscores - For linear regression and Spearman tests only: regress using ln(fluorescence score), rather than the score directly.
--log-pwmscores - For linear regression and Spearman tests only: regress using ln(pwm score), rather than the score directly.
--normalise-affinity - Normalise motif scores so that motif scores can be compared directly. Only relevant for Spearman and Linear Regression tests, where p-values are not calculated.
--linreg-switchxy - Make the x-points fluorescence scores and the y-points PWM scores. Only relevant for Spearman and Linear Regression tests.
--fl-threshold <p-value> - Only used if --poslist is in use. Maximum fluorescence p-value to consider as a 'positive' when labelling positives. Default is 1e-3. Use only for the Fisher test with SUM, AVG or MAX scoring.
--pwm-threshold <score> - Minimum PWM score to call a sequence a 'positive'. Default is 1. Use only for the Fisher test with SUM, AVG or MAX scoring.
--verbose <1...5> - Integer describing verbosity (low number is less verbose). Best placed first.
--help - Print a usage statement.
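
To make the four --scoring modes listed above concrete, here is a small sketch of how per-site values might be combined into one score for a sequence. The function name and the inputs (a list of per-site odds and a list of per-site p-values) are hypothetical; how AME itself computes the per-site values is not covered here.

```python
def score_sequence(site_odds, site_pvalues, scoring="totalhits",
                   pvalue_threshold=2e-4):
    """Combine per-site values into one score for a sequence.

    site_odds and site_pvalues are assumed inputs for this sketch:
    one value per candidate site of a single motif in one sequence.
    """
    if scoring == "avg":
        return sum(site_odds) / len(site_odds)
    if scoring == "max":
        return max(site_odds)
    if scoring == "sum":
        return sum(site_odds)
    if scoring == "totalhits":
        # Count the sites that are more significant than the threshold.
        return sum(1 for p in site_pvalues if p < pvalue_threshold)
    raise ValueError(f"unknown scoring mode: {scoring}")
```

The default threshold of 2e-4 in the sketch mirrors the documented default of --pvalue-threshold.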

The default output directory is ame_out, and is not overwritten (i.e., the default is the same as --o ame_out).
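
For scripted use, the options above can be assembled into an argument list. The helper below is a hypothetical convenience wrapper, not part of AME; it only uses options documented in this page, and the file names in the usage note are placeholders.

```python
def build_ame_command(sequence_file, motif_files, out_dir="ame_out",
                      overwrite=False, method="fisher",
                      scoring="totalhits", pvalue_threshold=None):
    """Build an argument list for the ame command line.

    Options are placed before the sequence file and motif files,
    following the usage line: ame [options] <sequence file> <motif file>+
    """
    # --oc overwrites an existing output directory, --o does not.
    cmd = ["ame", "--oc" if overwrite else "--o", out_dir,
           "--method", method, "--scoring", scoring]
    if pvalue_threshold is not None:
        cmd += ["--pvalue-threshold", str(pvalue_threshold)]
    cmd.append(sequence_file)
    cmd.extend(motif_files)
    return cmd
```

Assuming ame is installed and on the PATH, the returned list could be passed to subprocess.run, for example subprocess.run(build_ame_command("seqs.fasta", ["motifs.meme"]), check=True).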

Notes

This version of AME has different default settings to the original AME. By default, it will perform the Fisher exact test, and count individual motif hits (total hits mode). It will perform partition maximisation by default, but using a fixed partition between foreground and background sequences is likely to be the preferred mode of operation. The --fix-partition option in this version is optional, rather than required.
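
The fixed partition mentioned above can be pictured as a simple split of the input sequences, with the positive sequences first and the background appended after them. The sketch below is illustrative only; the function name is hypothetical and it operates on an in-memory list rather than on a FASTA file.

```python
def split_fixed_partition(sequences, n_positive):
    """Split sequences into (foreground, background).

    The first n_positive sequences are treated as positives and the
    rest as background, matching the layout in which the background
    set is appended to the main sequence set.
    """
    if not 0 < n_positive < len(sequences):
        raise ValueError("n_positive must leave both sets non-empty")
    return sequences[:n_positive], sequences[n_positive:]
```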

Citing AME

If AME is of use to you in your research, please cite:

Robert C. McLeay, Timothy L. Bailey (2010).
"Motif Enrichment Analysis: a unified framework and an evaluation on ChIP data."
BMC Bioinformatics 2010, 11:165, doi:10.1186/1471-2105-11-165.

Contact the authors

You can contact the authors via email:

Robert McLeay r.mcleay@imb.uq.edu.au, and Timothy Bailey t.bailey@imb.uq.edu.au.

Bug reports should be directed to Robert McLeay.