Accurate Assignment of Significance to Neuropeptide Identifications using Monte Carlo k-Permuted Decoy Databases
A novel application of Monte Carlo permutation testing that improves the calculation of neuropeptide match significance
in database search programs is demonstrated. Monte Carlo permutation tests were performed using a range of peptide
match indicators in OMSSA, Crux, and X! Tandem. The approach was evaluated on manually annotated neuropeptide
tandem mass spectra: 23 with and 80 without post-translational modifications. The tandem spectra were
searched against a target database of 618 neuropeptides obtained from the PepShop database. The neuropeptides in PepShop were assembled from the 95 known mouse prohormones using information from SwePep,
UniProt and NeuroPred predictions. Significance p-values were computed as the relative frequency of the match
indicator values from the permutations of complete target peptide sequences in the
four k-permuted decoy databases (where k denotes the number of permutations) that were better
than or equal to the true spectra values. The k-permuted decoy databases identified
up to 100% of the neuropeptides relative to the approaches already implemented in the three database search programs at p-value < 0.00001.
The permutation test p-values using the hyperscore (X! Tandem), E-value (OMSSA) and Sp score (Crux)
match indicators outperformed the other match indicators. Overall, the intuitive indicator, the number
of matched ions, provided p-values comparable to the best match indicators. The k-permuted databases with 100,000 permutations per spectrum are recommended for
accurately assessing neuropeptide detection significance levels and for increasing search speed. The neuropeptides and prohormones, tandem mass spectra, and the source code used to generate k-permuted decoy
databases are available:
• List of Prohormones and Neuropeptides (view list)
• Source code and usage manual (download)
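
The permutation p-value described above (the relative frequency of decoy match-indicator values that are better than or equal to the target value) can be illustrated with a minimal sketch. This is not the distributed source code: the function names `permuted_decoys` and `permutation_p_value`, the `higher_is_better` flag, and the fixed random seed are assumptions made for illustration, and scoring of each decoy against a spectrum is assumed to be done by the search program (OMSSA, Crux, or X! Tandem).

```python
import random


def permuted_decoys(peptide, k, seed=0):
    """Return k random permutations of a complete peptide sequence.

    Hypothetical helper: each decoy keeps the amino acid composition
    (and hence the unmodified mass) of the target peptide.
    """
    rng = random.Random(seed)
    residues = list(peptide)
    decoys = []
    for _ in range(k):
        rng.shuffle(residues)          # in-place random permutation
        decoys.append("".join(residues))
    return decoys


def permutation_p_value(target_score, decoy_scores, higher_is_better=True):
    """Relative frequency of decoy scores better than or equal to the target.

    Set higher_is_better=False for indicators such as an E-value,
    where smaller values indicate a better match.
    """
    if not decoy_scores:
        raise ValueError("at least one decoy score is required")
    if higher_is_better:
        hits = sum(1 for s in decoy_scores if s >= target_score)
    else:
        hits = sum(1 for s in decoy_scores if s <= target_score)
    return hits / len(decoy_scores)
```

Under this definition the smallest non-zero p-value attainable with k permutations is 1/k, so k = 100,000 gives a resolution of 0.00001.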