Molecular Biology and Genetics / Moleküler Biyoloji ve Genetik

Permanent URI for this collection: https://hdl.handle.net/11147/9

Search Results

Now showing 1 - 6 of 6
  • Conference Object
    Preparing Sequence Databases for Application in Proteogenomics
    (Springer, 2016) Has, Canan; Mungan, Mehmet Direnç; Çiftçi, Cansu; Allmer, Jens
    Proteomics involves the identification of proteins from complex mixtures, which is performed using mass spectrometry (MS) followed by computational data analysis. MS/MS spectra can be either sequenced de novo, if no sequence is available for the proteins in the mixture, or identified using database search algorithms such as OMSSA, X!Tandem, and MSGF+.
  • Conference Object
    Database Normalization Is Crucial for Reliable Protein Identification in Mass Spectrometry-Based Proteomics
    (Springer, 2016) Has, Canan; Mungan, Mehmet Direnç; Çiftçi, Cansu; Allmer, Jens
    Research in proteomics is driven by mass spectrometry, especially the identification of proteins from complex samples. Computational analysis of the resulting data determines the peptide sequences of the recorded spectra and integrates identifications into proteins. For this, database search algorithms can be employed, but they need a list of amino acid sequences that are expected to exist in the sample. Many algorithms have been proposed and consensus scoring has been performed. While the comparison and integration of results from different algorithms is important, there has been no attempt to integrate the results from searching multiple databases. This is important, however, because simply concatenating all databases needed for a study poses technical problems. Unfortunately, it has been shown that databases of different sizes influence scoring and prohibit the direct comparison of results.
  • Article
    Citation - WoS: 4
    Citation - Scopus: 9
    PGMiner Reloaded, Fully Automated Proteogenomic Annotation Tool Linking Genomes to Proteomes
    (Informationsmanagement in der Biotechnologie e.V. (IMBio e.V.), 2016) Has, Canan; Lashin, Sergey A.; Kochetov, Alexey; Allmer, Jens
    Improvements in genome sequencing technology increased the availability of full genomes and transcriptomes of many organisms. However, the major benefit of massively parallel sequencing is to better understand the organization and function of genes, which then leads to an understanding of phenotypes. Several tools are currently available for interpreting genomic data through automated gene annotation. Even though the accuracy of computational gene annotation is increasing, multiple lines of experimental evidence should be gathered. Mass spectrometry allows the identification and sequencing of proteins as major gene products, and it is only these proteins that conclusively show whether or not a part of a genome is a coding region that results in phenotypes. Therefore, in the field of proteogenomics, computational methods are validated by exploiting mass spectrometric data. As a result, identification of novel protein coding regions, validation of current gene models, and determination of upstream and downstream regions of genes can be achieved. In this paper, we present new functionality for our proteogenomic tool, PGMiner, which performs all proteogenomic steps such as acquisition of mass spectrometric data, peptide identification against preprocessed sequence databases, assignment of statistical confidence to identified peptides, mapping of confident peptides to gene models, and result visualization. The extensions cover determining proteotypic peptides and thus unambiguous protein identification. Furthermore, peptides conflicting with gene models can now be automatically assessed within the context of predicted alternative open reading frames.
  • Article
    Citation - WoS: 14
    Citation - Scopus: 16
    Transcriptomic Analysis of Boron Hyperaccumulation Mechanisms in Puccinellia Distans
    (Elsevier Ltd., 2018) Öztürk, Saniye Elvan; Göktay, Mehmet; Has, Canan; Babaoğlu, Mehmet; Allmer, Jens; Doğanlar, Sami; Frary, Anne
    Puccinellia distans, common alkali grass, is found throughout the world and can survive in soils with boron concentrations that are lethal for other plant species. Indeed, P. distans accumulates very high levels of this element. Despite these interesting features, very little research has been performed to elucidate the boron tolerance mechanism in this species. In this study, P. distans samples were treated for three weeks with normal (0.5 mg L−1) and elevated (500 mg L−1) boron levels in hydroponic solution. Expressed sequence tags (ESTs) derived from shoot tissue were analyzed by RNA sequencing to identify genes up- and down-regulated under boron stress. In this way, 3312 differentially expressed transcripts were detected, 67.7% of which were up-regulated and 32.3% of which were down-regulated in boron-treated plants. To partially confirm the RNA sequencing results, 32 randomly selected transcripts were analyzed for their expression levels in boron-treated plants. The results agreed with the expected direction of change (up- or down-regulation). A total of 1652 transcripts had homologs in A. thaliana and/or O. sativa and mapped to 1107 different proteins. Functional annotation of these proteins indicated that the boron tolerance and hyperaccumulation mechanisms of P. distans involve many transcriptomic changes, including alterations in the malate pathway, changes in cell wall components that may allow sequestration of excess boron without toxic effects, and increased expression of at least one putative boron transporter and two putative aquaporins. Elucidation of the boron accumulation mechanism is important in developing approaches for bioremediation of boron-contaminated soils.
  • Article
    Citation - WoS: 2
    Citation - Scopus: 5
    PGMiner: Complete Proteogenomics Workflow; From Data Acquisition to Result Visualization
    (Elsevier Ltd., 2017) Has, Canan; Allmer, Jens
    In parallel with the development of nucleotide sequencing, an equally important interest arose in further describing the sequence in terms of function, and the latter represents the current bottleneck in the overall research question. Sequencing the transcriptome allows determination of expressed nucleotide sequences, and mass spectrometry allows sequencing on the protein level. Both approaches can only sequence a subset of the existing transcripts. Moreover, some events, for example post-translational modifications, can only be determined on the proteomics level. Therefore, it is essential to combine proteomics and genomics. For that purpose, proteogenomics data analysis pipelines have been described. Here, we describe a novel proteogenomics workflow which encompasses everything from the acquisition of data to result visualization in the Konstanz Information Miner (KNIME), a state-of-the-art workflow management and data analytics platform. We amended KNIME with a number of processes such as peptide consensus prediction, peptide mapping, and database equalizing, as well as result visualization. This enabled construction of our new workflow, entitled PGMiner, which not only includes all data analysis steps but is also highly customizable, whereas customization is rather cumbersome for most existing pipelines. Furthermore, no burdensome installation processes have to be performed, making PGMiner the most user-friendly tool available.
  • Conference Object
    Citation - Scopus: 1
    Ranking Tandem Mass Spectra: and the Impact of Database Size and Scoring Function on Peptide Spectrum Matches
    (Institute of Electrical and Electronics Engineers Inc., 2013) Has, Canan; Kundakçı, Cemal Ulaş; Altay, Aybuge; Allmer, Jens
    Proteomics is currently driven by mass spectrometry. For the analysis of tandem mass spectra, many computational algorithms have been proposed. There are two approaches: one that assigns a peptide sequence to a tandem mass spectrum directly and one that employs a sequence database for looking up possible solutions. The former method needs high quality spectra while the latter can tolerate lower quality spectra. Since both methods are computationally expensive, it is sensible to establish spectral quality using an independent fast algorithm. In this study, we first establish proper settings for database search algorithms for the analysis of spectra in our gold benchmark dataset and then analyze the performance of ScanRanker, an algorithm for quality assessment of tandem MS spectra, on this ground truth data. We found that OMSSA and MSGFDB have limitations in their scoring functions but were able to form a proper consensus prediction using majority vote for our benchmark data. Unfortunately, ScanRanker's results do not correlate well with the consensus, and ScanRanker is also too slow to be used in its intended capacity. © 2013 IEEE
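
The consensus prediction by majority vote mentioned in the last abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the spectrum identifiers, the peptide sequences, and the third search engine (`engine_c`) are hypothetical, and the strict-majority rule is one assumed interpretation of "majority vote".

```python
from collections import Counter
from typing import Dict, Optional

# One peptide assignment (or None) per search engine for a single spectrum.
EngineCalls = Dict[str, Optional[str]]


def majority_vote(calls: EngineCalls) -> Optional[str]:
    """Return the peptide reported by a strict majority of engines, else None.

    Engines that report nothing (None) still count toward the total,
    so a single lone identification never forms a consensus.
    """
    votes = Counter(p for p in calls.values() if p is not None)
    if not votes:
        return None
    peptide, count = votes.most_common(1)[0]
    return peptide if count > len(calls) / 2 else None


def consensus(per_spectrum: Dict[str, EngineCalls]) -> Dict[str, Optional[str]]:
    """Apply the majority vote to every spectrum independently."""
    return {sid: majority_vote(calls) for sid, calls in per_spectrum.items()}


# Hypothetical example data; sequences and scan names are made up.
example = {
    "scan_1": {"OMSSA": "PEPTIDEK", "MSGFDB": "PEPTIDEK", "engine_c": "PEPTLDEK"},
    "scan_2": {"OMSSA": "AAAK", "MSGFDB": "CCCK", "engine_c": "DDDK"},
    "scan_3": {"OMSSA": None, "MSGFDB": "ELVISK", "engine_c": None},
}
result = consensus(example)
```

Under these assumptions, only `scan_1` receives a consensus peptide; the other two spectra stay unassigned because no sequence is supported by more than half of the engines.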