Overview of NeuroPred Outputs
Cleavage Prediction Diagram
Predicted Cleavage Results
Model Accuracy Statistics Output
Results for Individual Sequences
Results Across All Sequences
Model Accuracy Statistics
Area Under the ROC curve (AUC)
Obtain Mass of Predicted Peptides Output
A. Overview of NeuroPred
This document describes the outputs of NeuroPred. Southey et al. (2006b) described an earlier interface to NeuroPred (2006 NAR), with example usage, that differs slightly from this version. A newer version of NeuroPred, Test 2009, provides similar output but can use the artificial neural network models described in Tegge et al. 2008 and Southey et al. 2008.
There are six major components to the output; however, not all of them are always provided, because specific outputs depend on both the Output Selection Task and the interface used. Error messages may be displayed for certain errors, such as sequence format errors or invalid values for the settings. Whenever possible, NeuroPred will continue to perform the requested tasks using the default values.
- Navigation links are provided at the top of the output page to facilitate
access to the various components of the output page depending
on the Output Selection Task selected.
- Cleavage Prediction Diagram is provided in both interfaces for every Output Selection Task except the Print Probabilities of Basic Sites only task.
- Predicted Cleavage Results is optionally selected in the
Simplified Options Interface and always provided in the
Advanced Options Interface. This output is provided for every Output Selection Task except for the Print Probabilities of Basic Sites only task.
- Model Accuracy Statistics Output is provided only for the Model Accuracy Statistics task, in both interfaces.
- Mass Prediction Output is provided only for the Obtain Mass of Predicted Peptides task, in both interfaces.
B. Cleavage Prediction Diagram
The Cleavage Prediction Diagram is provided for all Output Selection Tasks except the Print Probabilities of Basic Sites only task. Each sequence entered is automatically converted to upper case and split into groups of at most 50 amino acids. Each group is presented in sequence order as follows:
The first line of each group, labeled Sequence in the first column, shows the sequence in up to five blocks of at most ten amino acids each. Immediately below it is one line for each selected model, and the final line is a consensus report across all models. This layout is repeated until the entire sequence has been shown. On the model and consensus lines, a series of "s" characters marks the signal sequence, whose length is determined by either the global default value or sequence-specific values. For each model, sites where that model's cleavage probability exceeded the threshold probability are marked with the letter "C" below the site, and non-cleaved sites are marked with a period ".". The consensus line shows "C" at a site if at least one model predicted cleavage and "." if no model predicted cleavage. By default, the rules of Amare et al. 2006 and Southey et al. 2008 are applied, and the resulting redundant sites are marked with an 'r'. This symbol will not appear when the Ignore processing rules option in the Advanced Options Interface is set to Yes.
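The layout described above can be illustrated with a short Python sketch. It is only an approximation of the format, not NeuroPred's own code: the function name `render_diagram`, the default threshold of 0.5, and the input shape (one probability per residue for each model) are assumptions made for illustration, and the marking of redundant sites with 'r' is omitted.

```python
from typing import Dict, List


def _blocks(text: str, size: int = 10) -> str:
    """Split a string into space-separated blocks of at most `size` characters."""
    return " ".join(text[i:i + size] for i in range(0, len(text), size))


def render_diagram(sequence: str,
                   model_probs: Dict[str, List[float]],
                   threshold: float = 0.5,   # assumed default, for illustration only
                   signal_len: int = 0) -> str:
    """Render sequence, per-model, and consensus lines in groups of 50 residues."""
    seq = sequence.upper()  # input is converted to upper case
    rows = {"Sequence": seq}
    for name, probs in model_probs.items():
        if len(probs) != len(seq):
            raise ValueError(f"model {name!r} needs one probability per residue")
        # 's' marks the signal sequence, 'C' a predicted cleavage, '.' no cleavage
        rows[name] = "".join(
            "s" if i < signal_len else ("C" if p > threshold else ".")
            for i, p in enumerate(probs)
        )
    model_rows = [rows[name] for name in model_probs]
    # Consensus: 'C' if at least one model predicted cleavage at the site
    rows["Consensus"] = "".join(
        "s" if i < signal_len
        else ("C" if any(row[i] == "C" for row in model_rows) else ".")
        for i in range(len(seq))
    )
    width = max(len(label) for label in rows)
    lines = []
    for start in range(0, len(seq), 50):      # groups of at most 50 residues
        for label, text in rows.items():
            group = _blocks(text[start:start + 50])  # blocks of at most 10
            lines.append(f"{label:<{width}}  {group}")
        lines.append("")                      # blank line between groups
    return "\n".join(lines).rstrip()


if __name__ == "__main__":
    # Toy input with made-up probabilities, not real NeuroPred output.
    print(render_diagram("akrgg", {"M1": [0, 0, 0.9, 0, 0],
                                   "M2": [0, 0.8, 0, 0, 0]}, signal_len=1))
```

The sketch builds every line for the whole sequence first and only then slices it into groups, so the model and consensus symbols stay aligned with the residues above them.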
The Cleavage Prediction Diagram below, and all other examples
provided herein, are generated using the Human Proglucagon
Sequence and the default NeuroPred settings with
the Known Motif and Mammalian models selected.
C. Predicted Cleavage Results
If Display Cleavage Probabilities is set to 'Yes' in the Simplified Options Interface, or if the Advanced Options Interface is used, the Predicted Cleavage Results table is provided for all valid sequences submitted. This table reports the results from all selected models for every site where at least one model predicted cleavage. Under the Model Accuracy Statistics task, with valid input, the cleavage information is provided for all known cleavages reported by the user. The Predicted Cleavage Results table for the Human Proglucagon sequence, using the default NeuroPred settings with the Known Motif and Mammalian models selected, is:
D. Model Accuracy Statistics Output
The Model Accuracy Statistics task
provides a series of outputs for each sequence and a
summary across all sequences. By default, the model accuracy
statistics are calculated only for basic amino acids, following the processing rules of Amare
et al. 2006 and Southey
et al. 2008. Consequently, different results will occur
under the Advanced Options Interface when changing these two
options from the default values:
- Ignore processing rules option: Under the Advanced Options Interface, if Yes is selected, all of the 'redundant' sites in the sequence will be used to compute the different statistics.
- Use basic sites for accuracy statistics option: Under the Advanced Options Interface, if No is selected, the complete sequence, including all non-basic sites that are usually considered uncleaved, will be used to compute the different statistics.
1. Results for Individual Sequences
2. Results Across All Sequences
Three tables of accuracy statistics are calculated using the
information from all sequences:
Model Accuracy Statistics
The following statistics are calculated for each selected
model at threshold probabilities incremented from 0.1 to 0.9
across all the submitted sequences.
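As a rough illustration of evaluating one model at threshold probabilities from 0.1 to 0.9, the Python sketch below tabulates confusion-matrix counts at each threshold. The individual statistics are not listed in this section, so the choice of counts plus sensitivity and specificity, the step size of 0.1, and the function name `accuracy_by_threshold` are all assumptions rather than NeuroPred's actual output.

```python
from typing import Dict, List, Optional, Sequence


def accuracy_by_threshold(probs: Sequence[float],
                          known: Sequence[bool]) -> List[Dict[str, Optional[float]]]:
    """Tabulate confusion counts at thresholds 0.1, 0.2, ..., 0.9 (assumed step)."""
    rows = []
    for k in range(1, 10):
        t = k / 10
        pairs = list(zip(probs, known))
        # A site is predicted cleaved when its probability exceeds the threshold.
        tp = sum(1 for p, y in pairs if p > t and y)
        fp = sum(1 for p, y in pairs if p > t and not y)
        fn = sum(1 for p, y in pairs if p <= t and y)
        tn = sum(1 for p, y in pairs if p <= t and not y)
        rows.append({
            "threshold": t,
            "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            # Ratios are undefined when there are no sites of the relevant class.
            "sensitivity": tp / (tp + fn) if tp + fn else None,
            "specificity": tn / (tn + fp) if tn + fp else None,
        })
    return rows


if __name__ == "__main__":
    # Made-up probabilities and known cleavage labels for four sites.
    for row in accuracy_by_threshold([0.95, 0.65, 0.35, 0.05],
                                     [True, True, False, False]):
        print(row)
```

Pooling the sites from all submitted sequences into `probs` and `known` before calling the function would correspond to the "across all the submitted sequences" summary.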
c. Area under the ROC curve
The area under the receiver operating characteristic (ROC) curve (AUC) is reported as a summary over all user-selected models. This area indicates the proportion of correct decisions: values greater than 0.8 indicate excellent performance and values under 0.7 indicate poor performance.
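One common way to compute the area under the ROC curve is the pairwise-ranking form: the fraction of (cleaved, uncleaved) site pairs in which the cleaved site receives the higher probability, with ties counted as one half. The Python sketch below uses that form; it is not necessarily how NeuroPred computes the value, and the function name `roc_auc` is hypothetical.

```python
from typing import Sequence


def roc_auc(probs: Sequence[float], known: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison of cleaved and uncleaved sites."""
    pos = [p for p, y in zip(probs, known) if y]        # known cleaved sites
    neg = [p for p, y in zip(probs, known) if not y]    # known uncleaved sites
    if not pos or not neg:
        raise ValueError("AUC needs at least one cleaved and one uncleaved site")
    # Each correctly ordered pair scores 1, each tie 0.5, each misordered pair 0.
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0
                for a in pos for b in neg)
    return score / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up probabilities and known cleavage labels for four sites.
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [True, True, False, False]))
```

The pairwise form is quadratic in the number of sites, which is acceptable for a handful of prohormone sequences; a rank-based formulation would scale better for large inputs.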
E. Obtain Mass of Predicted Peptides Output
When the Obtain Mass of Predicted Peptides task is selected, the original sequence is cleaved based on the predicted values for each model. The resulting peptides are extended by an order of 2 by default, or by the Degree of peptide extension value selected in the Advanced Options Interface (described in the Degree of peptide extension option), before the selected post-translational modifications (PTMs) are applied. Only peptides to which the selected PTMs have been successfully applied are reported, with each modification shown in small blue brackets (e.g. DFPEEVAIVEEL[Amide] denotes amidation of the peptide DFPEEVAIVEELG). The average and monoisotopic masses of the predicted peptides are calculated at all stages, so that masses are available for every combination of PTMs in case some PTMs are absent. Note that the standard mass or molecular weight, not the MH+ or M+H mass, is calculated. The results are presented in the Mass of Predicted Peptides table:
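The mass calculation can be sketched in Python as follows. This is a minimal illustration, not NeuroPred's implementation: the function names are hypothetical, the residue masses are commonly tabulated values that should be checked against an authoritative table before use, and amidation is modeled as removal of a C-terminal glycine followed by replacement of the terminal OH with NH2, as the DFPEEVAIVEELG example suggests. The masses returned are neutral masses, not MH+.

```python
from typing import Optional, Tuple

# Commonly tabulated residue masses in Da (verify before relying on them).
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
        "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
        "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
        "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
        "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931}
AVG = {"G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
       "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
       "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
       "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
       "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132}
WATER_MONO, WATER_AVG = 18.010565, 18.01528
# Mass change when the C-terminal OH is replaced by NH2.
AMIDE_MONO, AMIDE_AVG = -0.984016, -0.98476


def peptide_masses(peptide: str) -> Tuple[float, float]:
    """Return (monoisotopic, average) neutral mass of an unmodified peptide."""
    mono = sum(MONO[aa] for aa in peptide) + WATER_MONO
    avg = sum(AVG[aa] for aa in peptide) + WATER_AVG
    return mono, avg


def amidate(peptide: str) -> Optional[Tuple[str, float, float]]:
    """Amidate a peptide ending in G; return (label, mono, avg) or None if not applicable."""
    if len(peptide) < 2 or not peptide.endswith("G"):
        return None  # assumed rule: amidation requires a C-terminal glycine
    core = peptide[:-1]
    mono, avg = peptide_masses(core)
    return core + "[Amide]", mono + AMIDE_MONO, avg + AMIDE_AVG


if __name__ == "__main__":
    print(peptide_masses("DFPEEVAIVEELG"))  # unmodified peptide
    print(amidate("DFPEEVAIVEELG"))         # amidated form
```

Reporting both the unmodified and the amidated entry mirrors the idea that masses are available whether or not a given PTM is actually present.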
Amare, A., Hummon, A.B., Southey, B.R., Zimmerman, T.A., Rodriguez-Zas, S.L., Sweedler, J.V., Bridging neuropeptidomics and genomics with bioinformatics: prediction of mammalian neuropeptide prohormone processing. J. Proteome Res., 2006, 5, 1162-1167.
Southey, B.R., Amare, A., Zimmerman, T.A., Rodriguez-Zas, S.L., Sweedler, J.V., NeuroPred: a tool to predict cleavage sites in neuropeptide precursors and provide the masses of the resulting peptides. Nucleic Acids Res., 2006b, 34 (Web Server issue), W267-272.
Tegge, A.N., Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., Comparative Analysis of Neuropeptide Cleavage Sites in Human, Mouse, Rat, and Cattle. Mamm. Genome, 2008, 19(2), 106-120.
Southey, B.R., Hummon, A.B., Richmond, T.A., Sweedler, J.V., Rodriguez-Zas, S.L., Prediction of neuropeptide cleavage sites in insects. Bioinformatics, 2008, 24, 815-825.