Importance of Database Normalization for Reliable Protein Identification in Mass Spectrometry-Based Proteomics

Mungan, Mehmet Direnç

dc.contributor.advisor	Allmer, Jens
dc.contributor.advisor	Yalçın, Talat
dc.contributor.author	Mungan, Mehmet Direnç
dc.date.accessioned	2017-06-07T06:56:35Z
dc.date.available	2017-06-07T06:56:35Z
dc.date.issued	2016
dc.description	Full text release delayed at author's request until 2018.01.27	en_US
dc.description	Includes bibliographical references (leaves: 30-37)	en_US
dc.description	Text in English; Abstract: Turkish and English	en_US
dc.description	xi, 37 leaves	en_US
dc.description	Thesis (Master)--Izmir Institute of Technology, Biotechnology, Izmir, 2016	en_US
dc.description.abstract	One of the revolutionary steps in proteomics was the introduction of mass spectrometry to protein inference analysis. Its speed and accuracy in identifying and quantifying proteins have made it the first choice for obtaining high-throughput data. With the development of a variety of fragmentation techniques, mass spectrometry-based analysis has even made it possible to acquire knowledge about single polymorphisms and amino acid modifications in a peptide. Although this technology provides enormous amounts of data, protein identification remains a hard challenge because of the shortcomings of computational methods. Herein, a novel methodology is offered to better analyze mass spectrometry data and to overcome the deficiencies of protein identification algorithms in terms of speed and accuracy. When spectral data are acquired from an organism by mass spectrometry, database search algorithms are used for protein identification if the protein sequences of the organism are known. These algorithms compare the experimental data from mass spectrometry analysis with theoretical data derived from known databases of the organism and try to find the best match by ranking the peptide-spectrum matches (PSMs) via scoring functions. Because the databases can be too large to search, and because multiple databases of different sizes can contain the peptides present in the experimental data, database search algorithms may fail to produce fair, fast, or complete results. In this work, a methodology is presented to overcome the unfair scoring of peptides across databases of different sizes and to enable database search algorithms to use relatively large entries such as six-frame translations of human chromosomes. In terms of speed and accuracy, the method is found to be better than some of the existing methods.	en_US
dc.description.abstract	The use of mass spectrometry in protein identification studies has been one of the revolutionary steps in the field of proteomics. With features such as accuracy and speed in the qualitative and quantitative determination of proteins, it has become the first choice for high-throughput data acquisition. With the development of different fragmentation methods, mass spectrometry-based analyses have even made it possible to obtain information about single polymorphisms in a peptide and modifications of amino acids. Although this technology produces enormous amounts of data, protein identification remains a goal that is difficult to reach because of the shortcomings of computational methods. In this study, an original algorithm is proposed to overcome the shortcomings of protein identification algorithms and to better analyze mass spectrometry data in terms of speed and accuracy. When spectral data are obtained from an organism by mass spectrometry, database search algorithms are used for protein identification if the protein sequences of the organism are known. These algorithms compare the experimental data from mass spectrometry analyses with theoretical data obtained from the organism's databases in order to find the best peptide-spectrum match according to scoring functions.	en_US
dc.identifier.citation	Mungan, M. D. (2016). Importance of database normalization for reliable protein identification in mass spectrometry-based proteomics. Unpublished master's thesis, İzmir Institute of Technology, İzmir, Turkey	en_US
dc.identifier.uri	https://hdl.handle.net/11147/5710
dc.language.iso	en	en_US
dc.publisher	Izmir Institute of Technology	en_US
dc.relation	Alternatif Açık Okuma Çerçeveleri’ne Ait Yeni İnsan Proteinlerinin Tespiti ve Doğrulanması	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Protein identification	en_US
dc.subject	Mass spectrometry	en_US
dc.subject	Database search algorithms	en_US
dc.subject	Proteomics	en_US
dc.title	Importance of Database Normalization for Reliable Protein Identification in Mass Spectrometry-Based Proteomics	en_US
dc.title.alternative	Kütle Spektrometri Tabanlı Proteomik Çalışmalarındaki Güvenilir Protein Tanımlanmasında Veritabanı Normalizasyonunun Önemi	en_US
dc.type	Master Thesis	en_US
dspace.entity.type	Publication
gdc.author.institutional	Mungan, Mehmet Direnç
gdc.coar.access	open access
gdc.coar.type	text::thesis::master thesis
gdc.description.department	Thesis (Master)--İzmir Institute of Technology, Bioengineering	en_US
gdc.description.publicationcategory	Tez	en_US
gdc.description.scopusquality	N/A
gdc.description.wosquality	N/A
relation.isAuthorOfPublication.latestForDiscovery	bf9f97a4-6d62-49cd-a7c8-1bc8463d14d2
relation.isOrgUnitOfPublication.latestForDiscovery	9af2b05f-28ac-4011-8abe-a4dfe192da5e

Files

Original bundle

Name: T001562.pdf
Size: 1.78 MB
Format: Adobe Portable Document Format
Description: MasterThesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

Master Degree / Yüksek Lisans Tezleri