Single-sequence submissions to PREP-Mt or PREP-Cp
PREP-Mt and PREP-Cp require the user to provide values for five parameters.
A brief description of these parameters is given below.
- Enter a protein-coding DNA sequence.
- It must be a plant mitochondrial or plastid protein-coding gene.
Noncoding regions, such as intergenic spacers, introns, and 5'- or 3'-UTRs
should be removed prior to analysis.
- It must use IUPAC symbols, which include the standard A, C, G, and T
(or U), plus the ambiguity characters R, Y, M, K, W, S, B, D, H, V, and N.
- It must be less than 10000 nucleotides in length. This is sufficient for any
known protein-coding gene in plant mitochondria and chloroplasts.
- Choose the codon position of the first nucleotide.
- You will need to know whether the first nucleotide of the input
sequence is in the first, second, or third position of the codon.
- Note that this is not always the same as the reading frame of
the input sequence. For example, a sequence that is in the second
reading frame starts at the third codon position.
- Enter a name for the input DNA sequence.
- Must be less than 30 characters in length.
- Suggested names include the scientific name of the source species
or some other unique identifier.
- Which gene is it?
- PREP-Mt and PREP-Cp need to know the gene identity of the input sequence.
- Choose the correct name from the list. For PREP-Cp, not every plastid gene is
listed. Only genes with known edit sites are available for prediction.
- Provide a minimum cutoff value (C).
- If specified, PREP-Mt and PREP-Cp will not report any predicted edit sites
with a score lower than the cutoff value.
- The cutoff value must be between 0 and 1.
- The cutoff value allows you to control the relative proportion of
false positive and false negative predictions. A low cutoff value will find
more true edit sites, but will also increase the chance of misidentifying
an unedited site as edited. In contrast, a high cutoff value will reduce the
chance of misidentification, but will also miss more true edit sites.
- PREP-Mt works best using a cutoff value between 0 and 0.6, with a slight
optimum at C=0.2. PREP-Cp demands a higher cutoff, working optimally at
C=0.8 or even C=1.0. These values were shown to be optimal for many genes
from a broad range of angiosperms and gymnosperms, but they may not be optimal
in all cases. See the publications for more details.
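The requirements above can be checked locally before submitting. The following is a minimal Python sketch and not part of PREP itself: the function names and the `gene_list` argument are hypothetical, and the servers' actual gene lists are not reproduced here. It also assumes that the cutoff bounds 0 and 1 are inclusive.

```python
# IUPAC nucleotide symbols named in the requirements (T or U, plus ambiguity codes).
IUPAC_NT = set("ACGTURYMKWSBDHVN")


def check_submission(name, gene, codon_pos, cutoff, seq, gene_list):
    """Return a list of problems; an empty list means the inputs look acceptable.

    gene_list is a hypothetical stand-in for the gene names offered by the server.
    """
    problems = []
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    if not seq:
        problems.append("sequence is empty")
    bad = sorted(set(seq) - IUPAC_NT)
    if bad:
        problems.append("non-IUPAC symbols: " + "".join(bad))
    if len(seq) >= 10000:
        problems.append("sequence must be shorter than 10000 nucleotides")
    if codon_pos not in (1, 2, 3):
        problems.append("codon position must be 1, 2, or 3")
    if not name or len(name) >= 30:
        problems.append("name must be 1 to 29 characters long")
    if gene not in gene_list:
        problems.append("gene name is not in the list")
    if not 0 <= cutoff <= 1:  # assumption: bounds are inclusive
        problems.append("cutoff must be between 0 and 1")
    return problems


def first_codon_position(reading_frame):
    """Codon position of the first nucleotide for reading frame 1, 2, or 3."""
    # frame 1 -> position 1, frame 2 -> position 3, frame 3 -> position 2
    return (1 - reading_frame) % 3 + 1


print(check_submission("Zea_mays_cox1", "cox1", 1, 0.2, "ATGGCN", ["cox1"]))
print(first_codon_position(2))
```

The second function encodes the reading-frame note: a sequence in the second reading frame begins at the third codon position.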
PREP-Mt and PREP-Cp will return information regarding the position, effect, and score of
each predicted edit site. Downloadable files containing the prediction results and the
predicted sequences after editing are also provided in the "Download Data Files" section
of the output. A brief description of the output is given below.
- Nt Pos - The location of the nucleotide predicted to be edited in
the input DNA sequence.
- AA Pos - The location of the amino acid predicted to be edited in
the translation of the input DNA sequence.
- Align Col - The column in the AA alignment where the edit site is located.
- Effect - Shows the codon and the encoded amino acid before and after editing.
- Format: unedited codon (unedited AA) => edited codon (edited AA)
- Score - The prediction score is a value ranging from 0 to 1.
- It is a rough indicator of the confidence of prediction.
- A higher value indicates more confidence.
- Download data files
- The prediction results table can be downloaded as a tab-delimited text file.
- The alignment used for edit site prediction can be downloaded as an
alignment file in clustal format.
- The predicted input sequence after editing can be downloaded in FASTA format.
- The protein translation of the edited input sequence can also be
downloaded in FASTA format.
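To illustrate how a nucleotide position relates to an amino acid position and to the codon change, here is a rough Python sketch. It is not PREP's own code and rests on several assumptions: positions are 1-based, the edit is a C-to-U change (written as T in DNA notation), the standard genetic code applies, and a partial first codon counts as codon 1. The function name `describe_edit` is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def describe_edit(seq, nt_pos, first_pos=1):
    """Return (AA position, effect string) for a C-to-U edit at 1-based nt_pos.

    first_pos is the codon position (1, 2, or 3) of the first nucleotide.
    """
    seq = seq.upper().replace("U", "T")
    if seq[nt_pos - 1] != "C":
        raise ValueError("expected C at the edit site")
    offset = first_pos - 1             # codon positions missing before nucleotide 1
    aa_pos = (nt_pos - 1 + offset) // 3 + 1
    start = (aa_pos - 1) * 3 - offset  # 0-based start of the affected codon
    if start < 0 or start + 3 > len(seq):
        raise ValueError("edit site lies in an incomplete codon")
    codon = seq[start:start + 3]
    within = nt_pos - 1 - start        # 0-based position inside the codon
    edited = codon[:within] + "T" + codon[within + 1:]
    # Codons containing ambiguity symbols are reported as X.
    before = CODON_TABLE.get(codon, "X")
    after = CODON_TABLE.get(edited, "X")
    return aa_pos, f"{codon} ({before}) => {edited} ({after})"


print(describe_edit("ATGTCATTT", 5))
```

For the example sequence, the C at nucleotide 5 sits in the second codon, so the edit changes TCA (serine) to TTA (leucine).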
Batch submissions to PREP-Mt or PREP-Cp
The batch submission option for PREP-Mt and PREP-Cp allows the user to obtain predictions for
multiple sequences at once. You will be required to submit a tab-delimited text file containing
all the necessary information (format described below). Output is similar to the single-sequence mode,
except that the downloadable data files for all sequences are combined and stored as compressed archives.
Tab-delimited text file
The input file for batch mode is a tab-delimited text file that contains the following five
parameters separated by tabs, with each sequence to be tested on a separate line:
- A sequence name (must be unique and less than 30 characters).
- The gene name (must be identical to one of the gene names listed in single-sequence mode).
- The codon position of the first nucleotide (must be 1, 2, or 3).
- The cutoff value (must be between 0 and 1).
- The DNA sequence (must be IUPAC symbols only and less than 10 kb in length).
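A batch file of this kind can be assembled with a short script. The sketch below is only an illustration: it assumes that the columns appear in the order listed above and that no header line is expected. The function names and the example records are hypothetical.

```python
def batch_line(name, gene, codon_pos, cutoff, seq):
    """Build one tab-delimited line: name, gene, codon position, cutoff, sequence."""
    # Remove line breaks and spaces so the sequence stays on a single line.
    fields = [name, gene, str(codon_pos), str(cutoff), "".join(seq.split())]
    if any("\t" in field for field in fields):
        raise ValueError("fields must not contain tab characters")
    return "\t".join(fields)


def write_batch(records, path):
    """Write (name, gene, codon_pos, cutoff, seq) tuples, one sequence per line."""
    names = [record[0] for record in records]
    if len(set(names)) != len(names):
        raise ValueError("sequence names must be unique")
    with open(path, "w", newline="\n") as handle:
        for record in records:
            handle.write(batch_line(*record) + "\n")


# Hypothetical example records.
records = [
    ("Zea_cox1", "cox1", 1, 0.2, "ATGGCC"),
    ("Zea_atp9", "atp9", 1, 0.2, "ATGTTA"),
]
for record in records:
    print(batch_line(*record))
```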
Downloadable data archives
The downloadable data files for all examined sequences are compressed and archived together in two
formats: a gzipped tar archive (.tar.gz) and a zipped archive (.zip). Choose the format that is best
for your operating system.
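Either archive format can also be unpacked from a script. The following minimal sketch uses Python's standard `shutil.unpack_archive`, which infers the format from the file extension. The archive file name in the usage comment and the default output directory are hypothetical.

```python
import shutil
from pathlib import Path


def unpack_results(archive, dest="prep_results"):
    """Unpack a downloaded .tar.gz or .zip archive and list the extracted files."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive), str(dest))
    return sorted(p for p in dest.rglob("*") if p.is_file())


# Usage (hypothetical file name):
# for path in unpack_results("prep_batch_results.tar.gz"):
#     print(path)
```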
Alignment submissions to PREP-Aln
PREP-Aln allows the user to use the PREP methodology to predict sites of RNA editing in a custom
alignment containing a mix of DNA and RNA sequences. The RNA sequences will be translated to form
the alignment that guides prediction in the DNA sequences. Output is similar to the other methods.
This feature is provided to allow the user to predict edit sites in any alignment of interest.
Using PREP-Aln, the user could take advantage of new or unpublished editing data or to optimize
To use PREP-Aln, the user must provide a codon-based nucleotide alignment in FASTA format,
consisting of protein-coding sequences aligned by codon. Gaps must be placed between codons
and gap lengths must be in multiples of three. To ensure proper formatting, placement of gaps
in the codon alignment should be guided by an alignment of the translated protein sequences.
Online resources, such as PAL2NAL and RevTrans, are available to help generate codon-based alignments.
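The gap rules can be checked before upload. The sketch below is a simple illustration rather than PREP-Aln's own validation: it assumes that gaps are written as "-", that the alignment is in frame from column 1, and that all sequences share the same aligned length. The function names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (definition line, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def codon_alignment_problems(records):
    """Report sequences whose gaps are not whole, in-frame codons."""
    problems = []
    if len({len(seq) for _, seq in records}) > 1:
        problems.append("sequences differ in aligned length")
    for name, seq in records:
        if len(seq) % 3:
            problems.append(f"{name}: length is not a multiple of three")
        # Each in-frame triplet must be either all gaps or free of gaps.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if "-" in codon and codon != "---":
                problems.append(f"{name}: gap splits the codon at columns {i + 1}-{i + 3}")
    return problems


example = ">sp1\nATG---GCC\n>sp2_RNA\nATGAAAGCC\n"
print(codon_alignment_problems(read_fasta(example)))
```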
The alignment submitted to PREP-Aln must contain at least one RNA sequence and any number
of DNA sequences. More RNA sequences will likely improve performance. To ensure recognition by PREP-Aln,
all RNA sequences in the alignment must be flagged by adding "_RNA" to the end of their definition
lines. Please use this sample alignment as a guide for
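The "_RNA" flag can be verified with a few lines of Python. This is a minimal sketch with a hypothetical function name; it assumes that the flag is case-sensitive and must be the very last text on the definition line.

```python
def split_by_type(definition_lines):
    """Separate definition lines flagged with a trailing "_RNA" from the rest."""
    rna = [d for d in definition_lines if d.rstrip().endswith("_RNA")]
    dna = [d for d in definition_lines if not d.rstrip().endswith("_RNA")]
    if not rna:
        raise ValueError("alignment needs at least one sequence flagged with _RNA")
    return rna, dna


# Hypothetical definition lines.
rna, dna = split_by_type(["Zea_cox1", "Oryza_cox1_RNA", "Beta_cox1_RNA"])
print(len(rna), "RNA sequence(s),", len(dna), "DNA sequence(s)")
```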
References
Mower, J. P. (2005). PREP-Mt: predictive RNA editor for plant mitochondrial genes. BMC Bioinformatics, 6:96.
Mower, J. P. (2009). The PREP Suite: predictive RNA editors for plant mitochondrial genes, chloroplast genes and user-defined alignments. Nucleic Acids Research, 37:W253-W259.