Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Search Results

  • Master Thesis
    Quasi-Supervised Strategies for Compound-Protein Interaction Prediction
    (Izmir Institute of Technology, 2021) Çakı, Onur; Karaçalı, Bilge
    In silico prediction of compound-protein interactions remains important in many pharmacology applications because wet-lab experiments are time-consuming, laborious and costly. Most machine learning methods proposed to that end approach this problem with supervised learning strategies in which known interactions are labeled as positive and the rest are labeled as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions waiting to be identified as such. In this study, we propose to address this problem using the Quasi-Supervised Learning algorithm. In this framework, potential interactions are predicted by estimating the overlap between two datasets: a true positive dataset, which consists of compound-protein pairs with known interactions, and an unknown dataset, which consists of all the remaining compound-protein pairs. The potential interactions are then identified as those in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure between interacting pairs. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations.
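    The overlap idea described in this abstract can be illustrated with a small Monte Carlo sketch. This is a simplification written for this listing, not the thesis's algorithm: the function names, the squared Euclidean distance, the reference-set size `n_ref`, and the toy feature vectors are all assumptions. Each unknown sample is scored by how often its nearest neighbor in a random reference set comes from the known-positive set.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def overlap_scores(positives, unknowns, n_ref=3, n_rounds=200, seed=0):
    """For each unknown sample, estimate how often its nearest reference
    neighbor is a known positive (hypothetical simplification)."""
    rng = random.Random(seed)
    hits = [0] * len(unknowns)
    for _ in range(n_rounds):
        # Draw an equally sized random reference set from each dataset.
        ref_pos = rng.sample(positives, n_ref)
        ref_unk_idx = rng.sample(range(len(unknowns)), n_ref)
        for i, u in enumerate(unknowns):
            best_pos = min(sq_dist(u, p) for p in ref_pos)
            # A sample must not serve as its own reference neighbor.
            others = [sq_dist(u, unknowns[j]) for j in ref_unk_idx if j != i]
            best_unk = min(others) if others else float("inf")
            if best_pos < best_unk:
                hits[i] += 1
    return [h / n_rounds for h in hits]

# Toy data (invented): positives cluster near the origin; one unknown
# pair lies inside that cluster, the others lie far away.
positives = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
unknowns = [(0.05, 0.05), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
scores = overlap_scores(positives, unknowns)
```

    Unknown pairs with a score close to 1 sit in the region occupied by known interactions and would be the candidates flagged as potential interactions under this simplified reading.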
  • Master Thesis
    A Comparative Analysis of Coherence Measures for Electroencephalography
    (Izmir Institute of Technology, 2018) Çağdaş, Serhat; Karaçalı, Bilge
    Functional connectivity is often used as a feature extraction method in brain-computer interface studies and in other fields of neuroscience. In functional connectivity analysis using electroencephalography (EEG), connectivity patterns are extracted from a dependency matrix showing the coherence between electrode pairs. A variety of dependence measures can be used to calculate this matrix. In this study, a total of 15 coherence measures were analyzed comparatively in terms of computation time, accuracy and statistical significance in discriminating motor/motor imagery activities. As dependence measures, in addition to methods used in the literature for brain connectivity, five other methods used as contrast functions in independent component analysis and two novel mutual information calculators proposed in this study were evaluated. Furthermore, a novel hierarchical clustering based statistical test procedure was also proposed for motor/motor imagery activity comparison, along with a similar statistical significance test applied on data from 103 subjects on four different activity types. In experiments on a real data set, the significance results of the dependence measures differed according to the type of activity and the time window duration of the activity signals. Considering both computation time and accuracy on synthetic data, a number of methods with high statistical significance and different dependence characteristics were identified as feasible for a connectivity-based brain-computer interface.
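    As a rough illustration of the dependency matrix mentioned in this abstract, the sketch below fills a symmetric matrix with one pairwise dependence value per electrode pair. The absolute Pearson correlation is used only as a stand-in: the abstract does not list the 15 measures that were compared, and the function names and the toy signals are assumptions.

```python
import math

def abs_correlation(x, y):
    """Absolute Pearson correlation of two equally long signals.
    Assumes neither signal is constant (variance must be non-zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / math.sqrt(vx * vy)

def dependency_matrix(channels, measure=abs_correlation):
    """Symmetric matrix of pairwise dependence values between channels.
    The diagonal is set to 1.0, which suits correlation-like measures."""
    n = len(channels)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = measure(channels[i], channels[j])
    return matrix

# Toy signals (invented): channel 1 is an affine copy of channel 0,
# channel 2 is a phase-shifted oscillation.
ch0 = [math.sin(0.2 * k) for k in range(100)]
ch1 = [2.0 * v + 1.0 for v in ch0]
ch2 = [math.cos(0.2 * k) for k in range(100)]
matrix = dependency_matrix([ch0, ch1, ch2])
```

    Passing a different function as `measure` swaps in another dependence measure without changing how the matrix is built, which is the kind of comparison the abstract describes.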
  • Master Thesis
    Constructing Reference Datasets for Evaluating Automated Compensation Algorithms in Multicolor Flow Cytometry
    (Izmir Institute of Technology, 2017) Arslan, Nurhan; Karaçalı, Bilge
    In this thesis, we develop a numerical framework to simulate flow cytometry readings on a BD FACSCanto flow cytometer by constructing cell profiles with specific target biomarker concentrations and modelling various physical phenomena involved in a flow cytometer. The principal aim of this thesis is to provide realistic datasets over which prospective automated compensation algorithms can be evaluated. First, we constructed model cell profiles based on human lymphocytes stained with fluorescent dyes. Second, we focused on determining the number of photons emitted through fluorescence, following excitation, from each fluorochrome-conjugated target protein in a cell. Third, we simulated the optic channel of the BD FACSCanto flow cytometer and implemented a stochastic photon counting method to determine the fluorescence intensity received at the different detectors. Then, we simulated a pre-amplifier circuit to calculate the detector responses as voltage pulses from each cell in response to the received photons. Using the completed platform, we generated a two-colour flow cytometry dataset including + +, + -, - +, and - - cell groups using FITC and PE fluorochromes. We demonstrated the usefulness of the generated reference datasets by applying two different linear compensation methods and comparing the resulting compensated datasets in both linear and logarithmic scales. These results suggest that the developed platform can be used to generate realistic multi-colour flow cytometry datasets for validating compensation algorithms.
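    For readers unfamiliar with linear compensation, the sketch below shows the standard two-colour case: each detector reading is modelled as the true signal plus a fixed fraction of the other fluorochrome's signal, and compensation inverts that 2x2 mixing. The abstract does not say which two linear methods were applied, so this is a generic illustration; the function names and the spillover fractions are invented.

```python
def apply_spillover(fitc_true, pe_true, fitc_into_pe, pe_into_fitc):
    """Detector readings when each dye leaks a fixed fraction of its
    signal into the other detector (simple linear mixing model)."""
    fitc_det = fitc_true + pe_into_fitc * pe_true
    pe_det = fitc_into_pe * fitc_true + pe_true
    return fitc_det, pe_det

def compensate(fitc_det, pe_det, fitc_into_pe, pe_into_fitc):
    """Invert the 2x2 mixing to recover the per-dye signals."""
    det = 1.0 - fitc_into_pe * pe_into_fitc
    fitc_true = (fitc_det - pe_into_fitc * pe_det) / det
    pe_true = (pe_det - fitc_into_pe * fitc_det) / det
    return fitc_true, pe_true

# Hypothetical spillover fractions, chosen only for illustration.
FITC_INTO_PE = 0.15
PE_INTO_FITC = 0.02

# A "+ -" cell: FITC positive, PE negative.
observed = apply_spillover(1000.0, 0.0, FITC_INTO_PE, PE_INTO_FITC)
recovered = compensate(*observed, FITC_INTO_PE, PE_INTO_FITC)
```

    In this toy model the uncompensated PE reading of a FITC-only cell is 150, and compensation brings it back to approximately zero. A reference dataset with known true intensities makes this kind of check possible for real algorithms.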
  • Master Thesis
    Ray: A Profile-Based Approach for Homology Matching of Tandem-MS Spectra to Sequence Databases
    (Izmir Institute of Technology, 2012) Yılmaz, Şule; Allmer, Jens; Karaçalı, Bilge
    Mass spectrometry is a tool that is commonly used in proteomics to identify and quantify proteins. Thousands of spectra can be obtained in just a few hours, and computational methods enable the analysis of such high-throughput studies. There are two main strategies: database search and de novo sequencing. Most researchers prefer database search as a first choice, but even slight changes in a protein can prevent identification. In such cases, de novo sequencing can be used. However, this approach depends heavily on spectral quality, and it is difficult to achieve predictions of the full-length sequence. Peptide sequence tags (PSTs) allow some flexibility in database searches. A PST is a short amino acid sequence with certain mass information, but obtaining accurate PSTs is still arduous. When a sequence is missing from the database, homology searches can be useful. There are some homology search algorithms, such as MS-BLAST, MS-Shotgun and FASTS, but they are altered versions of existing algorithms; for example, BLAST was modified for mass spectrometric data and became MS-BLAST. Moreover, they are usually coupled with de novo sequencing, which still has limitations. Therefore, there is a need for novel algorithms that increase the scope of homology searches. For this purpose, a novel approach based on sequence profiles has been implemented. A sequence profile is like a table that contains the frequencies of all possible amino acids for a given MS/MS spectrum. Profiles are then aligned to the sequences in the database. Profiles are more specific than PSTs, and the requirement for precursor mass restrictions or enzyme information can be removed.
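    To make the notion of a sequence profile more concrete, the sketch below represents a profile as one frequency table per position and slides it along a database sequence without gaps, summing the frequency of each observed residue. The abstract does not describe the actual scoring or alignment scheme, so the ungapped sliding window, the function name, and the example profile and sequence are all assumptions.

```python
def best_match(profile, sequence):
    """Return (score, start) of the best ungapped placement of the
    profile on the sequence. Residues absent from a column score 0.
    Returns (-inf, -1) if the sequence is shorter than the profile."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        score = sum(
            column.get(sequence[start + k], 0.0)
            for k, column in enumerate(profile)
        )
        if score > best[0]:
            best = (score, start)
    return best

# Hypothetical 4-position profile: each column maps candidate amino
# acids to their estimated frequency at that position.
profile = [
    {"P": 0.7, "A": 0.3},
    {"E": 0.9, "D": 0.1},
    {"P": 0.5, "T": 0.5},
    {"T": 0.6, "S": 0.4},
]
score, start = best_match(profile, "MKPEPTIDE")
```

    Here the best placement starts at index 2, covering the residues "PEPT". Because every column keeps several candidate residues, a database sequence that differs from the top-ranked residue at some position can still receive a non-zero score.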
  • Master Thesis
    Separation of Stimulus-Specific Patterns in Electroencephalography Data Using Quasi-Supervised Learning
    (Izmir Institute of Technology, 2011) Köktürk, Başak Esin; Karaçalı, Bilge
    In this study, the separation of electroencephalography (EEG) data recorded under different visual stimuli is investigated using the quasi-supervised learning algorithm. The quasi-supervised learning algorithm estimates the posterior probabilities associated with the different stimuli, thus identifying the EEG data samples that are exclusively specific to their respective stimuli directly and automatically from the data. The data used in this study consist of 32-channel EEG recordings under six different visual stimuli presented in random successive order. We first constructed EEG profiles to represent instantaneous brain activity from the EEG data using various combinations of independent component analysis and the wavelet transform following data preprocessing. Then, we applied binary and M-ary quasi-supervised learning to identify condition-specific EEG profiles in different comparison scenarios. The results reveal that the quasi-supervised learning algorithm is successful in capturing the distinction between the samples. In addition, feature extraction using independent component analysis increased the performance of quasi-supervised learning, and the wavelet decomposition revealed the different frequency bands of the features, making the separation of the samples more explicit. The best results were obtained by combining the wavelet decomposition and independent component analysis before the quasi-supervised learning algorithm.
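    The wavelet decomposition step mentioned in this abstract splits a signal into coarse (low-frequency) and detail (high-frequency) bands. The sketch below uses the Haar wavelet purely because it is the simplest to write out; the abstract does not state which wavelet or how many levels were used, and the function names and sample values are assumptions.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform.
    Assumes the signal has an even number of samples."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_levels, final_approx].
    Assumes len(signal) is divisible by 2 ** levels."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)  # finer bands first
    bands.append(current)     # remaining low-frequency content
    return bands

# Invented 8-sample segment standing in for one EEG channel.
segment = [3.0, 1.0, 0.0, 4.0, 8.0, 6.0, 9.0, 9.0]
bands = haar_decompose(segment, levels=2)
```

    Each band can then be treated as a separate set of features. Since the transform is orthonormal, the total energy of the coefficients equals that of the original segment.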