Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Now showing 1 - 10 of 170

Privacy-Preserving Rare Disease Analysis With Fully Homomorphic Encryption
(01. Izmir Institute of Technology, 2023) Akkaya, Güliz; Erdoğmuş, Nesli; Akgün, Mete
Rare diseases severely affect many people across the world. Researchers study the causes of rare diseases, and this research leads to the development of diagnosis and treatment methods. Rare disease analysis identifies the disease-causing variants in the genome data of patients. Researchers need access to as much genome data as possible to find the variants that cause rare diseases. On the other hand, the genome data of patients should be protected because it can be used to identify individuals. Researchers cannot share the genome data of patients easily because of regulations such as the General Data Protection Regulation (GDPR). For this reason, rare disease analysis should be performed in a secure way that protects the privacy of patients while enabling the collaboration of multiple medical institutions. In this context, a privacy-preserving collaborative system for rare disease analysis should be provided. This thesis focuses on the use of fully homomorphic encryption, a method that enables an unlimited number of operations to be performed on encrypted data, for privacy-preserving collaborative rare disease analysis. Two methods, the boolean circuit method and the integer arithmetic method, are implemented to perform rare disease analysis on encrypted genome data to find disease-causing variants, and various experiments are performed to assess the efficiency of the proposed methods.
Multi-Frame Super-Resolution Without Priors
(01. Izmir Institute of Technology, 2023) Gülmez, Veli; Özuysal, Mustafa
There are two main types of super-resolution methods: traditional methods and deep learning methods. While traditional methods define closed-form expressions with assumptions, deep learning methods rely on priors learned from data sets. However, both have disadvantages, such as being too simple or trusting priors too strongly. We focus on how to generate a high-resolution image from low-resolution images without priors by utilizing spatial hash encoding. We propose a grid-based super-resolution model that uses spatial hash encoding to map coordinate information into a higher-dimensional space. Our aim is to eliminate long training times and to avoid relying on priors from data sets that cannot cover all real-world scenarios. Therefore, our proposed model performs task-specific super-resolution without priors and eliminates potential hallucination effects caused by wrong priors.
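As a rough illustration of what a spatial hash encoding of coordinates can look like, the sketch below hashes the four integer grid corners around a 2D point into a small feature table and blends the corner features bilinearly. It is an assumption-laden toy: the function names, the single resolution level, the XOR-with-large-prime hash, and the random initial features are illustrative choices, not details taken from the thesis.

```python
import random

# Per-dimension multipliers for the hash; the second is a large prime often
# used for spatial hashing. Both are illustrative assumptions.
PRIMES = (1, 2654435761)


def hash_index(ix, iy, table_size):
    """Map integer grid coordinates to a slot in the feature table."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1])) % table_size


def make_table(table_size, dim, seed=0):
    """Create a feature table with small random values (trainable in practice)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1e-4, 1e-4) for _ in range(dim)]
            for _ in range(table_size)]


def encode(x, y, table, resolution):
    """Encode a continuous point (x, y) in [0, 1]^2 as a feature vector.

    The point is scaled to the grid, the four surrounding corners are hashed
    into the table, and their features are bilinearly interpolated.
    """
    gx, gy = x * resolution, y * resolution
    x0, y0 = int(gx), int(gy)
    tx, ty = gx - x0, gy - y0
    dim = len(table[0])
    out = [0.0] * dim
    for dx, wx in ((0, 1.0 - tx), (1, tx)):
        for dy, wy in ((0, 1.0 - ty), (1, ty)):
            feat = table[hash_index(x0 + dx, y0 + dy, len(table))]
            for k in range(dim):
                out[k] += wx * wy * feat[k]
    return out
```

In a model of this kind, the encoded vector would typically be passed to a small network that predicts the pixel value at that coordinate, with the table entries learned per image rather than from a data set.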
Enrichment of Turkish Question Answering Systems Using Knowledge Graphs
(01. Izmir Institute of Technology, 2023) Çiftçi, Okan; Tekir, Selma; Soygazi, Fatih
In the era of digital communication, the ability to effectively process and interpret human language has become a key research area. Natural Language Processing (NLP) has emerged as a field that enables machines to better understand and analyze human language. One of the most important applications of NLP is the development of question answering systems, which are essential in various domains such as customer service, search engines, and chatbots. To answer incoming queries, question answering systems rely on knowledge graphs as a reliable source. This thesis proposes a Turkish Question Answering (TRQA) system that utilizes a knowledge graph. The research focuses on the automatic construction of a knowledge graph specific to the film industry, as well as the creation of a multi-hop question-answering dataset that can be queried from this graph. Building upon these constructions, we develop a deep learning based method for answering questions using the constructed knowledge graph. The constructed knowledge graph is compared with various knowledge graphs presented in the literature using the DistMult, ComplEx, and SimplE methods for the link prediction task. Additionally, the proposed question answering system is compared with the baseline study and with a generative large language model through quantitative and qualitative analyses.
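For readers unfamiliar with the link prediction methods named above, the following minimal sketch shows the standard scoring functions of DistMult and ComplEx on plain Python lists. The function names and the tiny example vectors are illustrative assumptions; training, negative sampling, and SimplE are omitted, and nothing here reflects the thesis's actual implementation.

```python
def distmult_score(head, relation, tail):
    """DistMult: sum of element-wise products of real-valued embeddings.

    The score is symmetric in head and tail, so DistMult cannot distinguish
    the direction of a relation.
    """
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def complex_score(head, relation, tail):
    """ComplEx: real part of the trilinear product with the tail conjugated.

    Using complex-valued embeddings and the conjugate makes the score
    asymmetric, which lets the model represent directed relations.
    """
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real
```

A higher score means the model considers the triple (head, relation, tail) more plausible; link prediction ranks candidate entities by this score.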
Reproducibility Assessment of Research Code Repositories
(01. Izmir Institute of Technology, 2023) Akdeniz, Eyüp Kaan; Tekir, Selma
The growth in machine learning research has not been accompanied by a corresponding improvement in the reproducibility of the results. This thesis presents a novel, fully automated end-to-end system that evaluates the reproducibility of machine learning studies based on the content of the associated GitHub project's Readme file. This evaluation relies on a Readme template derived from an analysis of popular repositories. The template suggests a structure that promotes reproducibility. Our system generates a reproducibility score for each Readme file assessed, and it employs two distinct models, one based on section classification and the other on hierarchical transformers. The experimental outcomes indicate that the system based on section similarity outperforms the hierarchical transformer model. Furthermore, it has an edge in explainability, as it allows the scores to be related directly to the respective sections of the Readme files. The proposed framework provides an important tool for improving the quality of code sharing and ultimately helps to increase reproducibility in machine learning research.
Row Following and Altitude Estimation With UAV Images for Agricultural Fields
(01. Izmir Institute of Technology, 2023) Yörük, Burak; Baştanlar, Yalın
Traditional methods in agriculture involve the use of tractors; however, more than 10% of the planted fields suffer from harvest losses due to these vehicles. Moreover, tractors cannot enter all agricultural lands, which reduces the area available for planting. After heavy rainfall, mud and other effects prevent these vehicles from accessing arable fields, and processes such as crop spraying take significantly longer. In the past, aerial spraying methods using high-altitude aircraft were attempted to overcome these problems; however, this method was banned in many areas due to the insufficient altitude and the harmful effects of chemical dispersion outside the fields. Nowadays, UAVs present a better alternative, and aerial spraying methods are regaining popularity. However, these vehicles can still make errors when flown by a human operator, and their flight times are limited by inadequate battery capacity. Therefore, the development of UAVs capable of autonomous flight reduces operator costs. However, during flight, changes in the liquid in the pesticide tanks hinder the UAV's ability to spray pesticides autonomously at a fixed altitude and to prevent unwanted pesticide dispersion in undesired rows. This thesis addresses following plant rows in UAV images and estimating altitude from camera images. In this way, it ensures that UAVs in agricultural areas can stay at a fixed altitude for appropriate spraying and irrigation, and it prevents the spread of pesticides to unwanted rows.
Recognition of Counterfactual Statements in Turkish
(01. Izmir Institute of Technology, 2023) Acar, Ali; Tekir, Selma
Counterfactual statements describe an event that did not happen or cannot happen, and optionally the consequence of this event if it were to happen. Counterfactual statements are the building blocks of human thought processes, as people constantly reflect upon past happenings and consider their future implications. Counterfactual reasoning is essential for machine intelligence and explainable artificial intelligence studies. Detecting counterfactuals automatically with machine learning algorithms is crucial for these areas. This thesis presents the development of the first Turkish counterfactual detection dataset. It presents a comprehensive classification baseline and expands the scope of counterfactual detection to include the Turkish language.
A Mutation-Based Approach To Alleviate the Class Imbalance Problem in Software Defect Prediction
(01. Izmir Institute of Technology, 2023) Güner, Dinçer; Demirörs, Onur; Giray, Görkem
Highly imbalanced training datasets considerably degrade the performance of software defect predictors. Class imbalance is a general problem of Software Defect Prediction (SDP) datasets. Therefore, a variety of methods have been developed to alleviate the Class Imbalance Problem (CIP). However, classical methods such as data sampling balance datasets without any connection to SDP. Over-sampling techniques generate synthetic minority class instances, which generalize a small number of minority class instances and result in less diverse instances, whereas under-sampling techniques eliminate majority class instances, resulting in significant information loss. In this study, we present an approach that uses software mutations to balance software repositories. The Mutation-based Approach (MBA) injects mutants into defect-free instances, causing them to transform into defective instances. In this way, MBA balances datasets with diverse data produced by mutation operators, and no instances are lost as in under-sampling. For recall scores, almost all rebalancing methods outperformed Baseline in the Inter-release Defect Prediction (IRDP) scenario, but only MBA significantly outperformed Baseline in the Cross-project Defect Prediction (CPDP) scenario. The increase in recall came with more false alarms. We cannot generalize that MBA outperforms Baseline and the five over-sampling strategies in terms of AUC scores. In terms of recall values, MBA performed better in CPDP than in IRDP. For both IRDP and CPDP scenarios, there were significant and positive correlations between SMC (the change percentage of software measures) and recall, and between SMC and false alarm, but there was no significant correlation between SMC and AUC.
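The idea of balancing a defect dataset with mutants can be sketched in a few lines. The toy example below treats each instance as a source string with a label, applies a single relational-operator mutation to defect-free instances, and labels the result defective. Everything here is a hypothetical simplification: the `SWAPS` table, the function names, and the choice to add the mutant alongside the original instance are assumptions, since the abstract does not describe the operators used or how software measures are recomputed after mutation.

```python
import random

# Toy relational-operator replacements; real mutation tools apply many more
# operators. This table is an illustrative assumption.
SWAPS = {"<=": "<", ">=": ">", "==": "!="}


def mutate(source):
    """Return a copy of source with one operator replaced, or None if none applies."""
    for old, new in SWAPS.items():
        if old in source:
            return source.replace(old, new, 1)
    return None


def balance_with_mutants(dataset, seed=0):
    """Add mutated, defective copies of defect-free instances until balanced.

    dataset is a list of (source, label) pairs, where label 1 means defective
    and label 0 means defect-free. Originals are kept, so no instance is lost.
    """
    rng = random.Random(seed)
    clean = [src for src, label in dataset if label == 0]
    needed = len(clean) - (len(dataset) - len(clean))
    rng.shuffle(clean)
    balanced = list(dataset)
    for src in clean:
        if needed <= 0:
            break
        mutant = mutate(src)
        if mutant is not None:
            balanced.append((mutant, 1))
            needed -= 1
    return balanced
```

Unlike interpolation-based over-sampling, each added instance here corresponds to a concrete change in code, which is the property the abstract credits for the diversity of the generated data.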
Evaluating Impacts of Micro-Architectural Metrics on Error Resilience and Performance of General Purpose GPU Applications
(01. Izmir Institute of Technology, 2023) Topçu, Burak; Öz, Işıl
Rapidly growing data processing tasks require powerful and energy-efficient heterogeneous computing systems, and GPUs play a significant role in those systems by accelerating heavy workloads through the concurrent execution of multiple parallel tasks. The increasing architectural complexity and widening use of GPUs raise error resiliency concerns for safety-critical applications. Furthermore, approaches that enhance performance and reduce energy dissipation handle error resiliency on GPUs through approximate computing solutions. Evaluating error resiliency, whether by identifying the error proneness of a system or by investigating approximations that do not disturb the output much, necessitates robust knowledge about the execution of a program on a device. In this thesis, we develop a runtime performance and power monitoring tool that visualizes the execution with detailed micro-architectural metrics. Using the tool, we gain several fundamental insights into runtime performance bottlenecks and how perturbations affect output quality. Afterward, we propose a framework that predicts fault vulnerability for error-resilient GPU applications. The framework can accurately estimate error tolerance and avoids the significant effort required to analyze fault occurrence probability. Based on the performance bottlenecks observed with the tool and the error propagation observed during prediction experiments, we introduce a hardware-based approximate computing approach that aims to improve the performance and power of GPU programs, especially memory-bound ones. The approximation method, which resolves memory utilization bottlenecks at runtime, enhances performance by 1.49× (up to 2.1×) and reduces energy consumption by 28.4% (up to 52.6%) while maintaining output accuracy above 98%.
The Realization of a Blockchain-Based E-Voting Solution With a New Consensus Algorithm
(01. Izmir Institute of Technology, 2022) Karaçay, Mustafa; Şahin, Serap
Security and transparency issues in the paper-based voting system, together with technological advances, have popularized e-voting systems. Many academic studies and industrial solutions have recently been proposed, designed, and implemented with a Homomorphic Cryptography Scheme or HTTPS. However, there is a new popular player in the game: blockchain technology. This study analyzes the requirements of a well-designed e-voting system and the technology behind the blockchain, and proposes an e-voting system with a novel consensus algorithm. Different strategies are designed and implemented to satisfy all requirements. First, RSA and the Paillier Homomorphic Cryptosystem are applied to meet requirements such as individual verifiability and secrecy, so that no one can modify a vote, yet any voter can verify his/her vote during the whole voting period. Second, different blockchains are used to meet requirements such as eligibility, privacy, and authentication, so that the system detects whether data comes from an eligible or a non-eligible voter. The system ensures that votes and voters cannot be correlated if the voter is eligible, so the privacy of eligible voters is always protected. Third, our blockchains ensure consensus throughout the voting process. Fully replicated, distributed, transparent, and secure blockchains ensure that everything is under control. Fourth, internal control mechanisms are applied to meet requirements such as non-reusability and coercion-resistance, so that eligible voters can cast just one vote within the specified period. The system keeps all sensitive data encrypted so that no one can manipulate the results before the vote ends.
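The abstract does not say how Paillier ciphertexts are combined in the proposed system. A common use of the cryptosystem in e-voting is additive tallying: multiplying ciphertexts yields an encryption of the sum of the plaintexts, so votes can be counted without decrypting individual ballots. The sketch below is a textbook Paillier implementation with the generator fixed to g = n + 1. The function names, the toy primes in the usage note, and the 0/1 ballot encoding are assumptions, and the key size is far too small for real use.

```python
import math
import random


def keygen(p, q):
    """Build a Paillier key pair from two primes (toy sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Encrypt an integer m with 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:  # the blinding factor must be invertible mod n
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Return a ciphertext of the sum of the two underlying plaintexts."""
    return (c1 * c2) % (n * n)


def tally(n, private_key, ballots):
    """Combine encrypted ballots and decrypt only the final total."""
    total = ballots[0]
    for c in ballots[1:]:
        total = add_encrypted(n, total, c)
    return decrypt(n, private_key, total)
```

For example, with `n, priv = keygen(1789, 1861)` and ballots produced by `encrypt(n, v, random.Random(42))` for each vote `v` in `(1, 0, 1, 1, 0)`, `tally(n, priv, ballots)` returns 3. A production system would use primes of at least 1024 bits each and a cryptographically secure random source instead of `random.Random`.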
Automatic Quote Detection From Literary Work
(01. Izmir Institute of Technology, 2022) Güzel Altıntaş, Aybüke; Tekir, Selma
Literature inspires readers, and readers tend to share quotes from literary works. A reader underlines quotes in a book and shares them on social media or on an online platform used by book readers. A quote is defined as a span in a written text that is interesting to many readers and that readers can use in different contexts. In this study, a novel task in the field of Natural Language Processing is proposed: the Quote Detection Task. In addition, an original dataset was built from the Goodreads and Gutenberg websites with web scraping. The quotes come from Goodreads data sourced from Kaggle, and only quotes voted for by 10 or more users are selected. These quotes have been validated against the books on the Project Gutenberg website. The final dataset consists of 4554 rows. The dataset contains quotes with their book spans. The span of a quote consists of the 10 sentences preceding the quote, the quote itself, and the 10 sentences following it. Conditional Random Field (CRF) and Extractive Summarization as Text Matching (MatchSum) were run as two baselines for quote detection. The Quote Detection Task is a span detection task that can be modeled with sequence labeling solutions and neural extractive summarization systems from the literature. For this sequence tagging problem, the statistics-based CRF was run as the first baseline. MatchSum was chosen as the second baseline for the experiments. ROUGE-1 scores of 27.24% and 40.54%, respectively, were obtained from these baselines.
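The span construction and the evaluation metric described above can be sketched as follows. The function names, whitespace tokenization, and lowercasing are simplifying assumptions; the abstract does not state which ROUGE-1 variant (precision, recall, or F1) the reported figures use, so the sketch returns all three.

```python
from collections import Counter


def build_span(sentences, quote_index, window=10):
    """Return the quote with up to `window` sentences before and after it.

    The span is clipped at the start and end of the book, so quotes near a
    boundary get fewer context sentences.
    """
    start = max(0, quote_index - window)
    end = min(len(sentences), quote_index + window + 1)
    return sentences[start:end]


def rouge1(candidate, reference):
    """Compute unigram-overlap precision, recall, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    precision = overlap / max(1, sum(cand.values()))
    recall = overlap / max(1, sum(ref.values()))
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In this setup, a detector receives the 21-sentence span as input, predicts which part is the quote, and the prediction is scored against the gold quote with `rouge1`.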
