Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik
Permanent URI for this collection: https://hdl.handle.net/11147/9
2 results
Search Results
Letter. Citation - Scopus: 9
A Call for Benchmark Data in Mass Spectrometry-Based Proteomics
(Proteomass Scientific Society, 2012) Allmer, Jens
Proteomics is a quickly developing field. New and better mass spectrometers, the platform of choice in proteomics, are being introduced frequently. New algorithms for the analysis of mass spectrometric data and the assignment of amino acid sequences to tandem mass spectra are also presented on a frequent basis. Unfortunately, the best application area for these algorithms cannot be established at the moment. Furthermore, even the accuracy of the algorithms and their relative performance cannot be established. This is due to the lack of proper benchmark data. This letter first introduces the field of mass spectrometry-based proteomics and then defines the expectations of a well-designed benchmark dataset. Thereafter, the current situation is compared to this ideal. A call for the creation of a proper benchmark dataset is then placed, and it is explained how measurements should be performed. Finally, the benefits for the research community are highlighted. © 2012, Proteomass Scientific Society. All rights reserved.

Article. Citation - Scopus: 1
Determining the C-Terminal Amino Acid of a Peptide From MS/MS Data
(Proteomass Scientific Society, 2013) Allmer, Jens
Proteomics is currently based chiefly on mass spectrometry (MS), which is the tool of choice to investigate proteins. Two computational approaches to derive the sequence of a tandem mass spectrum's precursor are widely employed. Database search essentially retrieves the sequence by matching the spectrum to all entries in a database, whereas de novo sequencing does not depend on a sequence database. Both approaches benefit from knowledge about the enzyme used to generate the peptides. Most algorithms default to trypsin because it is so widely used. Trypsin cuts after arginine and lysine, so the C-terminal amino acid is not known precisely but is usually one of the two. Furthermore, 90% of protein terminal peptides may not end with either arginine or lysine and may thus contain any of the other amino acids. Here an algorithm, named RKDecider, is presented which sorts the C-terminal amino acid into one of three groups (arginine, lysine, and other). Although around 90% accuracy was achieved while mining spectra for rules that determine the C-terminal amino acid, the accuracy of the implementation (RKDecider) is somewhat lower, at about 80%. This is because the decision trees were implemented as a rule-based system for speed. The implementation is freely available at: http://bioinformatics.iyte.edu.tr/RKDecider.
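The cleavage rule mentioned in the second abstract (trypsin cuts after arginine and lysine) can be illustrated with a minimal Python sketch. This is not RKDecider, which predicts the group from MS/MS spectra; it only shows, for a known sequence, where the three labels (arginine, lysine, other) come from. The function names and the example sequence are hypothetical, and the rule is deliberately simplified: missed cleavages and the commonly cited exception for a following proline are ignored.

```python
def tryptic_digest(protein: str) -> list[str]:
    """Cut after every R or K (simplified rule; assumption: no missed
    cleavages and no proline exception)."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "RK":
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Whatever remains is the protein's C-terminal peptide, which need
    # not end in R or K.
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def c_terminal_group(peptide: str) -> str:
    """Assign a peptide to one of the three groups named in the abstract."""
    last = peptide[-1]
    if last == "R":
        return "arginine"
    if last == "K":
        return "lysine"
    return "other"


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper.
    for peptide in tryptic_digest("MKTAYRGLSW"):
        print(peptide, c_terminal_group(peptide))
```

In this toy case the last peptide falls into the "other" group because it is the C-terminal peptide of the protein, which is the situation the abstract describes for protein terminal peptides.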
