Neuropeptide prediction from genetic information
Objective
As the number of sequenced genomes continues to grow, so has the need for effective, accurate methods to predict neuropeptides from this ever-increasing genomic information. While neuropeptide precursor mRNA sequences can be identified from genetic information, producing the final peptides requires an extensive, complicated series of enzymatic processing steps, such as prohormone cleavages and other post-translational modifications, making neuropeptide prediction both difficult and time-consuming. For example, only a small subset of the monobasic cleavage sites is used in most prohormones.
Can we predict which sites are cleaved and which ones are not?
We have successfully demonstrated that cleavage sites in several prohormones from a wide range of organisms, including various molluscan, insect and mammalian species, can be accurately predicted based on sequence alone. The NeuroPred application suite is the outcome of this research.
The overarching goal of the NeuroPred application suite is to predict the likely cleavage sites in a prohormone, and to use this information to provide the most likely set of neuropeptides resulting from the enzymatic processing of the prohormone. Although not all predictions will be correct, optimizing the prediction of putative peptides from novel prohormones guides research efforts toward the most likely peptide candidates and enhances mass spectrometric identification of peptides.
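The step from predicted cleavage sites to a peptide list can be sketched in a few lines of Python. This is a minimal illustration, not NeuroPred's own code: the function name, the 0-based "cleave after this residue" convention, and the example sequence are all assumptions made for the sketch, and trimming C-terminal K/R residues is a simplification of the basic-residue removal that usually follows cleavage.

```python
def peptides_from_cleavages(sequence, cleavage_sites, strip_basic=True):
    """Split a prohormone sequence at the given cleavage sites.

    cleavage_sites holds 0-based indices; cleavage is assumed to occur
    immediately after the residue at each index (a convention chosen
    for this sketch only).
    """
    peptides = []
    start = 0
    for site in sorted(set(cleavage_sites)):
        fragment = sequence[start:site + 1]
        if strip_basic:
            # Simplification: drop C-terminal basic residues (K/R) that
            # are normally removed after cleavage.
            fragment = fragment.rstrip("KR")
        if fragment:
            peptides.append(fragment)
        start = site + 1
    # Whatever follows the last cleavage site is kept unchanged.
    tail = sequence[start:]
    if tail:
        peptides.append(tail)
    return peptides


# Hypothetical toy sequence with two dibasic (KR) sites.
toy_prohormone = "AAGKRFLRFGKRSSE"
print(peptides_from_cleavages(toy_prohormone, [4, 11]))
# ['AAG', 'FLRFG', 'SSE']
```

Other post-translational modifications mentioned above (for example amidation) are deliberately left out of this sketch.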
The accuracy of a model's cleavage site predictions depends directly on the quality of the data used to train that model. We are therefore interested in expanding the data that our cleavage prediction models can draw upon, particularly by adding experimentally verified cleavage site data. If you are interested in contributing known data to our project, please email Jonathan Sweedler.
Neuropeptide Prediction at the UIUC Center for Neuroproteomics on Cell-Cell Signaling
How do we predict neuropeptide cleavages?
Hummon et al. (2003) first predicted cleavage sites in Aplysia californica precursors using a logistic regression model and demonstrated its application to several prohormones from a range of other organisms. In subsequent work, the UIUC Neuroproteomics Center extended this approach to mammalian precursors (Amare et al., 2006) and to precursors identified from the Apis mellifera and Drosophila melanogaster genomes (Southey et al., 2008). Southey et al. (2006a) also introduced an empirically based Known Motif model that provides a high positive prediction rate. Tegge et al. (2007, 2008) successfully applied this approach to bovine precursors and found improved performance by including amino acid properties. Although the different models give slightly different outputs, testing a novel prohormone with a variety of models gives insight into the most likely cleavage sites.
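To make the logistic regression idea concrete, the sketch below scores each basic residue (K or R) in a sequence as a candidate cleavage site. It is only an illustration of the general technique: the intercept, the weights, the choice of neighbouring positions, and the function names are invented for this example and are not the published model estimates.

```python
import math

# Hypothetical coefficients, invented for illustration only.
# They are NOT the estimates from any published NeuroPred model.
INTERCEPT = -1.0
WEIGHTS = {
    # (offset from the candidate residue, amino acid): weight
    (-1, "K"): 1.5,   # basic residue just before the candidate
    (-1, "R"): 1.5,
    (-3, "R"): 0.8,   # basic residue three positions upstream
    (1, "P"): -2.0,   # proline just after the candidate
}


def cleavage_probability(sequence, site):
    """Logistic-model probability of cleavage after sequence[site]."""
    score = INTERCEPT
    for (offset, residue), weight in WEIGHTS.items():
        pos = site + offset
        if 0 <= pos < len(sequence) and sequence[pos] == residue:
            score += weight
    return 1.0 / (1.0 + math.exp(-score))


def predict_sites(sequence, threshold=0.5):
    """Return (index, probability) for K/R residues scoring >= threshold."""
    results = []
    for i, aa in enumerate(sequence):
        if aa in "KR":
            p = cleavage_probability(sequence, i)
            if p >= threshold:
                results.append((i, p))
    return results


# Hypothetical toy sequence; prints index and rounded probability.
for index, prob in predict_sites("AAGKRFLRFGKRSSE"):
    print(index, round(prob, 3))
```

With these made-up weights, only the second residue of each KR pair in the toy sequence passes the 0.5 threshold, while the isolated (monobasic) candidates fall below it. A fitted model would estimate such weights from prohormones with experimentally verified cleavage sites.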
In addition to the neuropeptide prediction work being done at the UIUC Neuroproteomics Center, Duckert et al. (2004) developed the ProP application (http://www.cbs.dtu.dk/services/ProP) to predict cleavage sites using an artificial neural network trained on a wide range of published eukaryotic and viral sequences. Other researchers have also derived various empirical approaches based on observed cleavage sites, but these approaches have been found to be ineffective in predicting cleavage sites.
The NeuroPred Application Suite
NeuroPred (http://stagbeetle.animal.uiuc.edu/cgi-bin/neuropred.py) has been developed as a web-based tool to predict cleavage using the different logistic regression models and the Known Motif model (Southey et al., 2006b). If known cleavage information is provided, NeuroPred can also report various indicators of model accuracy described by Southey et al. (2006a,b). Southey et al. (2006b) provide an overview of the application and an example. For additional details on usage, see the Input and Output Documentation links on the NeuroPred application page.
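When known cleavage sites are available, predictions can be compared against them with standard classification measures. The sketch below computes a few common ones; it is an assumption that these resemble the indicators NeuroPred reports (see Southey et al., 2006a,b for the actual definitions), and the function name and example site lists are hypothetical.

```python
def accuracy_indicators(predicted, known, candidates):
    """Compare predicted and known cleavage sites over all candidate sites.

    All three arguments are collections of site indices; predicted and
    known are assumed to be subsets of candidates.
    """
    predicted, known, candidates = set(predicted), set(known), set(candidates)
    tp = len(predicted & known)               # cleaved and predicted
    fp = len(predicted - known)               # predicted but not cleaved
    fn = len(known - predicted)               # cleaved but missed
    tn = len(candidates - predicted - known)  # correctly left uncleaved

    def ratio(numerator, denominator):
        # None marks an undefined ratio (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "positive_predictive_value": ratio(tp, tp + fp),
        "correct_classification_rate": ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical example: five candidate sites, two truly cleaved,
# three predicted (one of them a false positive).
print(accuracy_indicators(predicted=[4, 7, 11], known=[4, 11],
                          candidates=[3, 4, 7, 10, 11]))
```

In this toy case every true site is found (sensitivity of 1.0), but one of the three predictions is wrong, which lowers the positive predictive value and the specificity to two thirds.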
By clicking on the links to use our site, the user agrees to acknowledge the use of NeuroPred provided by the UIUC Center for Neuroproteomics, and the appropriate modeling papers, in any publications that may result from the use of this information.
NeuroPred Models
Due to the range of species that we have studied, a variety of prediction models are available in NeuroPred. These models have been reported in the following publications: Hummon et al. (2003), Amare et al. (2006), Southey et al. (2006a, 2008) and Tegge et al. (2008). Which model(s) should you use? We recommend selecting a prediction model derived from the species most closely related to the one you are studying, as this will most likely provide the best prediction results. For example, the Mammalian model should be used for mammalian sequences, and the Drosophila or Apis models for insects. When obtaining true-positive results is more important than avoiding false-positive results, the Known Motif model is recommended; it provides more cleavage predictions than the other models.
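When several models are run on the same prohormone, one simple way to compare their output is to count how many models agree on each site. The following is a minimal sketch of that idea; the function name and the site lists are invented for illustration and are not real NeuroPred results.

```python
from collections import Counter


def tally_model_agreement(predictions_by_model):
    """Count how many models predict cleavage at each site.

    predictions_by_model maps a model label to the site indices that
    model predicts. Returns (site, count) pairs, best supported first.
    """
    counts = Counter()
    for sites in predictions_by_model.values():
        counts.update(set(sites))  # each model votes once per site
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical predictions for one prohormone from three models.
example = {
    "Mammalian": [4, 11],
    "Known Motif": [4, 7, 11],
    "Aplysia": [4],
}
print(tally_model_agreement(example))
# [(4, 3), (11, 2), (7, 1)]
```

Sites supported by every model are natural first candidates for experimental follow-up, while sites predicted only by a permissive model deserve more caution.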
Neuropeptide Sequences
A user can input one or more prohormone sequences following the formats described in the Input Documentation. In addition, the neuropeptide sequences used to train the models for our published studies are available. For these prohormones, the sequence and cleavage information is provided in a format that can be directly used in the NeuroPred application.
References
Amare, A., Hummon, A.B., Southey, B.R., Zimmerman, T.A., Rodriguez-Zas, S.L., Sweedler, J.V., Bridging neuropeptidomics and genomics with bioinformatics: prediction of mammalian neuropeptide prohormone processing. J. Proteome Res., 2006, 5, 1162-1167.
Duckert, P., Brunak, S., Blom, N., Prediction of proprotein convertase cleavage sites. Protein Eng. Des. Sel., 2004, 17, 107-112.
Hummon, A.B., Hummon, N.P., Corbin, R.W., Li, L.J., Vilim, F.S., Weiss, K.R., Sweedler, J.V., From precursor to final peptides: a statistical sequence-based approach to predicting prohormone processing. J. Proteome Res., 2003, 2, 650-656.
Southey, B.R., Rodriguez-Zas, S.L., Sweedler, J.V., Prediction of neuropeptide prohormone cleavages with application to RFamides. Peptides, 2006a, 27, 1087-1098.
Southey, B.R., Amare, A., Zimmerman, T.A., Rodriguez-Zas, S.L., Sweedler, J.V., NeuroPred: a tool to predict cleavage sites in neuropeptide precursors and provide the masses of the resulting peptides. Nucleic Acids Res., 2006b, 34 (Web Server issue), W267-272.
Tegge, A.N., Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., Enhanced prediction of cleavage in bovine precursor sequences. International Symposium on Bioinformatics Research and Applications, 2007.
Amare, A., Sweedler, J.V., Neuropeptide precursors in Tribolium castaneum. Peptides, 2007, 28(6), 1282-1291.
Southey, B.R., Hummon, A.B., Richmond, T.A., Sweedler, J.V., Rodriguez-Zas, S.L., Prediction of neuropeptide cleavage sites in insects. Bioinformatics, 2008, 24(6), 815-825.
Tegge, A.N., Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., Comparative Analysis of Neuropeptide Cleavage Sites in Human, Mouse, Rat, and Cattle. Mamm. Genome, 2008, 19(2), 106-120.
Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., A Python analytical pipeline to identify prohormone precursors and predict prohormone cleavage sites. Front. Neuroinform., 2008, 2, 7.
Southey, B.R., Rodriguez-Zas, S.L., Sweedler, J.V., Characterization of the prohormone complement in cattle using genomic libraries and cleavage prediction approaches. BMC Genomics, 2009, 10, 228.
Delfino, K.R., Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., Genome-wide census and expression profiling of chicken neuropeptide and prohormone convertase genes. Neuropeptides, 2010, 44, 31-44.
Xie, F., London, S.E., Southey, B.R., Annangudi, S.P., Amare, A., Rodriguez-Zas, S.L., Clayton, D.F., Sweedler, J.V., The zebra finch neuropeptidome: prediction, detection and expression. BMC Biol., 2010, 8, 28.
Porter, K.I., Southey, B.R., Sweedler, J.V., Rodriguez-Zas, S.L., First survey and functional annotation of prohormone and convertase genes in the pig. BMC Genom., 2012. Accepted as a companion paper to the swine genome.