Master Degree / Yüksek Lisans Tezleri
Permanent URI for this collection: https://hdl.handle.net/11147/3008
Master Thesis
Quasi-Supervised Strategies for Compound-Protein Interaction Prediction
(01. Izmir Institute of Technology, 2021) Çakı, Onur; Karaçalı, Bilge
In-silico prediction of compound-protein interactions using computational methods remains important in various pharmacology applications because wet-lab experiments are time-consuming, laborious, and costly. Most machine learning methods proposed to that end approach the problem with supervised learning strategies in which known interactions are labeled as positive and the rest are labeled as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions waiting to be identified as such. In this study, we propose to address this problem using the Quasi-Supervised Learning algorithm. In this framework, potential interactions are predicted by estimating the overlap between two datasets: a true positive dataset, which consists of compound-protein pairs with known interactions, and an unknown dataset, which consists of all the remaining compound-protein pairs. The potential interactions are then identified as those in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure between interacting pairs. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations.

Master Thesis
Ray: a Profile-Based Approach for Homology Matching of Tandem-MS Spectra To Sequence Databases
(Izmir Institute of Technology, 2012) Yılmaz, Şule; Allmer, Jens; Karaçalı, Bilge
Mass spectrometry is a tool that is commonly used in proteomics to identify and quantify proteins. Thousands of spectra can be obtained in just a few hours. Computational methods enable the analysis of high-throughput studies.
There are two main strategies: database search and de novo sequencing. Most researchers prefer database search as a first choice, but even slight changes to a protein can prevent identification. In such cases, de novo sequencing can be used. However, this approach depends heavily on spectral quality, and it is difficult to obtain predictions that cover the full-length sequence. Peptide sequence tags (PSTs) allow some flexibility in database searches. A PST is a short amino acid sequence with certain mass information, but obtaining accurate PSTs is still arduous. If a sequence is missing from the database, homology searches can be useful. There are some homology search algorithms, such as MS-BLAST, MS-Shotgun, and FASTS, but they are altered versions of existing algorithms; for example, BLAST was modified for mass spectrometric data and became MS-BLAST. In addition, they are usually coupled with de novo sequencing, which still has limitations. Therefore, there is a need for novel algorithms in order to increase the scope of homology searches. For this purpose, a novel approach based on sequence profiles has been implemented. A sequence profile is like a table that contains the frequencies of all possible amino acids for a given MS/MS spectrum. The profiles are then aligned to sequences in the database. Profiles are more specific than PSTs, and the requirement for precursor mass restrictions or enzyme information can be removed.
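The idea of a sequence profile as a table of amino acid frequencies, aligned against database sequences, can be illustrated with a small sketch. This is not the thesis's implementation: the column-per-position layout, the ungapped sliding-window scoring, and the names `Profile`, `score_window`, `best_match`, and `toy_profile` are all assumptions made for illustration, and the frequencies are made-up values rather than ones derived from a real MS/MS spectrum.

```python
from typing import Dict, List, Tuple

# Assumed layout: one {amino acid: frequency} column per profile position.
Profile = List[Dict[str, float]]


def score_window(profile: Profile, window: str) -> float:
    """Sum the profile frequency of each residue in an equally long window."""
    return sum(column.get(aa, 0.0) for column, aa in zip(profile, window))


def best_match(profile: Profile, sequence: str) -> Tuple[int, float]:
    """Slide the profile along the sequence without gaps.

    Returns the (offset, score) of the highest-scoring window.
    """
    width = len(profile)
    if width == 0 or len(sequence) < width:
        raise ValueError("sequence must be at least as long as a non-empty profile")
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(sequence) - width + 1):
        score = score_window(profile, sequence[offset:offset + width])
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Hypothetical four-position profile with illustrative frequencies.
toy_profile: Profile = [
    {"P": 0.7, "A": 0.3},
    {"E": 0.6, "Q": 0.4},
    {"P": 0.5, "L": 0.5},
    {"T": 0.9, "S": 0.1},
]

offset, score = best_match(toy_profile, "MKAPEPTIDEK")
print(offset, round(score, 2))
```

In this toy case the best window is `PEPT`, which starts at offset 3 of the example sequence. A realistic homology search would likely need substitution scores and gap handling, which this sketch deliberately leaves out.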
