Master Degree / Yüksek Lisans Tezleri
Permanent URI for this collection: https://hdl.handle.net/11147/3008
15 results
Search Results
Master Thesis
Hardware Acceleration With FPGA Based Electronic Boards for Machine Learning (01. Izmir Institute of Technology, 2024)
Akkuş, Batuhan; Gümüş, Abdurrahman; Apaydın, Mehmet Serkan
Advances in machine learning algorithms in recent years have also increased their use on edge devices (Merenda et al., 2020). Machine learning algorithms are generally run on GPU-based computers, which are unsuitable for edge devices because of their high energy consumption (Desislavov et al., 2021), heavy hardware resource requirements, and large physical size (Liu et al., 2022). This thesis investigates the implementation and inference of machine learning algorithms, especially deep neural networks, on FPGA platforms used as hardware accelerators, with the aim of achieving low power consumption, efficient hardware utilization, and high inference performance. To increase flexibility and efficiency when adapting these systems to edge devices, CNV light, a lighter variation of the CNV network (Umuroglu et al., 2017b), was developed. This network was quantized to 1, 2, 4, and 8 bits using quantization-aware training with Brevitas (Pappalardo et al., 2019), a PyTorch-based tool. The CNV light network was trained with Brevitas on the CIFAR-10, SVHN, GTSRB, and MNIST datasets. The models were synthesized for the FPGA using the FINN framework (Umuroglu et al., 2017a) and configured for maximum, minimum, and fixed-FPS levels of hardware utilization. A Xilinx XC7Z020-1CLG400C FPGA was used to evaluate and report the models' metrics. On the GTSRB dataset, the binary (W1A1) quantized CNV light network achieved 95.12% accuracy at every hardware utilization level, with 12,191 FPS and 3.20 W power consumption at maximum hardware utilization, and 6 FPS and 1.62 W at minimum hardware utilization. The results show that FPGAs can serve as efficient and scalable platforms for machine learning models on edge devices.

Master Thesis
Estrus Detection in Cows With Deep Learning Techniques (01. Izmir Institute of Technology, 2024)
Arıkan, İbrahim; Ayav, Tolga; Soygazi, Fatih
Accurately predicting the estrus period is essential for improving the efficiency and lowering the cost of artificial insemination in livestock, a crucial sector for global food production. Precise identification of the estrus period is critical to avoid economic losses such as decreased milk production, delayed calf births, and loss of eligibility for government subsidies. Mounting is the most obvious movement to detect during the fertilization period, but detecting it manually is difficult and costly, so automated methods are needed. Deep learning-based methods are therefore considered for detecting the moment of mounting. The proposed method detects the estrus period using deep learning and XAI (Explainable Artificial Intelligence) techniques. Deep learning-based mounting detection is performed using CNN, ResNet, VGG-19, and YOLO-v5 models. The ResNet model in this study detects mounting movement with 99% accuracy. Explainability describes the features that drive a deep learning model's decision when detecting mounting motion. Grad-CAM and Gradient Inputs, which are XAI techniques, are used to open the black box behind the proposed models. The developed deep learning models focus on the udder and back area of the cows during the decision-making phase.
In addition, the quality of the explanations produced by Grad-CAM and Gradient Inputs, the XAI methods applied to the deep learning models trained in this study, was measured with the 'faithfulness', 'maximum sensitivity', and 'complexity' metrics.

Master Thesis
Development of Visual Analysis Interfaces for Large Biological Data and Characterization of Immunomodulatory Noncoding RNA Networks Cancer (01. Izmir Institute of Technology, 2023)
Kuş, Muhammet Emre; Ekiz, Hüseyin Atakan
Data are now collected in ever higher dimensions, processed, and used to build tools with strong descriptive and predictive power. In cancer research in particular, processing data collected from patients has substantial potential for discovering new biomarkers, developing personalized treatment methods, and finding better prognosticators. However, utilizing and analyzing high-dimensional data is difficult, and good coding skills are required to bring the data together and apply different analysis methods. The visual interfaces created in this study let users examine and analyze the high-dimensional data of thousands of cancer patients, made publicly available through The Cancer Genome Atlas initiative, and are aimed especially at bench scientists who have no prior coding expertise. The Cancer Genome Explorer, TCGEx for short, is a robust bioinformatic tool that we developed to facilitate high-throughput cancer data analysis through several sophisticated algorithms. With special features such as subset-specific analysis and comparative analysis across multiple cancer datasets, TCGEx can accelerate studies, especially in hypothesis-driven research. This study also describes a use-case scenario that demonstrates how hypothesis-driven research can be performed with TCGEx for melanoma.
In melanoma, elucidating the interactions between the tumor and the immune system at the miRNA level is crucial for developing new therapeutics. In this study, we characterize potential therapeutic targets that act on tumor and immune cells, which we identified with the TCGEx portal using statistical analysis methods including machine learning, dimensionality reduction, and survival modeling.

Master Thesis
Reproducibility Assessment of Research Code Repositories (01. Izmir Institute of Technology, 2023)
Akdeniz, Eyüp Kaan; Tekir, Selma
The growth in machine learning research has not been accompanied by a corresponding improvement in the reproducibility of the results. This thesis presents a novel, fully automated end-to-end system that evaluates the reproducibility of machine learning studies based on the content of the associated GitHub project's Readme file. The evaluation relies on a Readme template derived from an analysis of popular repositories; the template suggests a structure that promotes reproducibility. Our system generates a reproducibility score for each Readme file assessed, and it employs two distinct models, one based on section classification and the other on hierarchical transformers. The experimental outcomes indicate that the system based on section similarity outperforms the hierarchical transformer model. It is also more explainable, because its scores can be traced directly to the respective sections of the Readme files. The proposed framework provides an important tool for improving the quality of code sharing and ultimately helps to increase reproducibility in machine learning research.

Master Thesis
Machine Learning Based Resource Allocation for Massive MIMO Systems (01. Izmir Institute of Technology, 2023)
Sevgi, Hüseyin Can; Özbek, Berna
Cell-free massive MIMO is a promising technology that serves multiple users simultaneously through access points (APs) deployed throughout the coverage area, instead of the usual cellular systems with a centralized base station (BS). By exploiting the large number of antennas and adopting advanced signal processing techniques, cell-free massive MIMO can mitigate inter-user interference and enhance overall system performance. Optimal power allocation plays a crucial role in maximizing the spectral and energy efficiency of wireless networks: by intelligently allocating transmit power to different users, a balance can be achieved between maximizing system throughput and minimizing total energy consumption. User-centric clustering (UCC) is also a key technique for improving the performance of cell-free massive MIMO systems. It aims to pair user equipments (UEs) with appropriate APs to facilitate efficient resource allocation and interference management. In this thesis, a cell-free mMIMO communication system is investigated through user-centric clustering and power allocation. The power allocation optimization problem is formulated to maximize the energy efficiency of cell-free mMIMO systems and is solved with an interior-point algorithm. A user-centric clustering algorithm is proposed that disables the non-master APs serving only one user. This additional feature aims to reduce the total power consumption of the system without sacrificing the advantages of cell-free mMIMO communication. Additionally, we propose a machine learning (ML) approach to reduce the computation time required for power allocation optimization. Through extensive simulations, we demonstrate that the proposed algorithms achieve significant gains in spectral and energy efficiency in cell-free massive MIMO systems.
The results highlight the importance of optimal power allocation and user-centric clustering in designing efficient cell-free mMIMO systems with a machine learning approach.

Master Thesis
Recognition of Counterfactual Statements in Turkish (01. Izmir Institute of Technology, 2023)
Acar, Ali; Tekir, Selma
Counterfactual statements describe an event that did not happen or cannot happen and, optionally, the consequence of this event had it happened. Counterfactual statements are building blocks of human thought processes, as people constantly reflect upon past happenings and consider their future implications. Counterfactual reasoning is essential for machine intelligence and explainable artificial intelligence studies, so detecting counterfactuals automatically with machine learning algorithms is crucial for these areas. This thesis presents the development of the first Turkish counterfactual detection dataset. It presents a comprehensive classification baseline and expands the scope of counterfactual detection to include the Turkish language.

Master Thesis
Quasi-Supervised Strategies for Compound-Protein Interaction Prediction (01. Izmir Institute of Technology, 2021)
Çakı, Onur; Karaçalı, Bilge
In-silico prediction of compound-protein interactions remains important in various pharmacology applications because wet-lab experiments are time-consuming, laborious, and costly. Most machine learning methods proposed to that end approach the problem with supervised learning strategies in which known interactions are labeled as positive and the rest are labeled as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions waiting to be identified as such. In this study, we propose to address this problem using the Quasi-Supervised Learning algorithm.
In this framework, potential interactions are predicted by estimating the overlap between two datasets: a true positive dataset, which consists of compound-protein pairs with known interactions, and an unknown dataset, which consists of all the remaining compound-protein pairs. The potential interactions are then identified as those pairs in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations.

Master Thesis
Classification of Contradictory Opinions in Text Using Deep Learning Methods (01. Izmir Institute of Technology, 2020)
Oğul, İskender Ülgen; Tekir, Selma
Natural language inference (NLI) aims to ensure the consistency as well as the accuracy of propositions while making sense of natural language. It classifies the relationship between two given sentences as contradiction, entailment, or neutrality. To accomplish the classification task, sentences or words must be translated into mathematical representations called vectors or embeddings. The vectorization of a sentence is as important as the complexity of the classification model. In this study, both pre-trained (GloVe, fastText, Word2Vec) and contextual (BERT) word embedding methods were compared to obtain the best result. NLI, one of the natural language processing tasks, is highly complex, and conventional machine learning methods are insufficient for it, so more advanced solutions are required. This study used deep learning methods to perform the classification task. Unlike conventional machine learning approaches, deep learning approaches reduce errors and increase accuracy by iterating over the data many times.
Opinion sentences have complex grammatical structures that are difficult to classify. This study used the Decomposable Attention and Enhanced LSTM models to perform the NLI classification task. Using the Enhanced LSTM method with BERT contextual vectors on the SNLI dataset, an accuracy of 88.0% was obtained, close to the state-of-the-art result of 92.1%. To show the usability of the developed solution in different NLI tasks, the model was also evaluated on the MNLI dataset, where it reached an accuracy of 80.02%.

Master Thesis
Predictive Maintenance for Smart Industry (01. Izmir Institute of Technology, 2020)
Asadzade, Asad; Ayav, Tolga
As the Internet of Things (IoT) developed rapidly, it came into use in many industrial areas. IoT makes it possible to collect data that reflect the health of equipment and other important systems. When these data are processed correctly, they yield important information about the production process. For example, the data are used to build machine learning systems that predict when various components will fail. Maintenance can then be carried out before a component breaks down, and the component can be replaced if necessary. This strategy, called predictive maintenance, gives industries advantages such as maximizing the life of components, reducing extra costs, and saving time. In this study, we applied the ARF method, which is based on stream learning, to the Turbofan Engine Degradation Simulation datasets provided by NASA to estimate the remaining useful lifetime of jet engines.
We discuss the advantages of stream learning over batch learning and compare our results with other studies that apply batch learning to the same datasets.

Master Thesis
Estimation of Low Sucrose Concentrations and Classification of Bacteria Concentrations With Machine Learning on Spectroscopic Data (Izmir Institute of Technology, 2019)
Mezgil, Bahadır; Baştanlar, Yalın
Spectroscopy can be used to identify elements. Similarly, recent studies use optical spectroscopy to measure material concentrations in chemical solutions. In this study, we apply machine learning techniques to collected ultraviolet-visible (UV-Vis) spectra to estimate the level of sucrose concentration in solutions and to classify bacteria concentrations. Some metal nanoparticles are very sensitive to refractive index changes in their environment, which helps to detect small refractive index changes in a solution. We use gold nanoparticles and exploit this property to estimate sucrose concentrations. The low-concentration sucrose samples are prepared by mixing sucrose, weighed on precision scales, with pure water, and the UV-Vis spectrum of each sample is then measured. For the bacteria solutions, spectra are captured for six different bacteria concentrations. Spectra of the same solutions are also captured before the bacteria are added. For each of these solutions, four sets are prepared: one in which gold nanoparticles are not grown (minute 0) and three in which they are grown for 4, 10, and 12 minutes. After dataset preparation, the spectrum measurements are transferred into the MATLAB environment as a sucrose concentration dataset and a bacteria solution dataset. The necessary preprocessing steps are then performed to extract the most informative and distinguishing features from these datasets.
The raw and preprocessed spectrum measurements are used to train shallow Artificial Neural Networks (ANN) with the MATLAB Deep Learning Toolbox and Support Vector Machines (SVM) with the MATLAB Statistics and Machine Learning Toolbox. The machine learning experiments show a promising success rate for estimating sucrose concentrations and a very high success rate for classifying bacteria concentrations in pure water solution.
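
To make the classification step in the last abstract above more concrete, here is a minimal sketch of training a linear SVM on spectra. It is not the thesis's method: the thesis uses MATLAB toolboxes, whereas this sketch uses plain Python with stochastic subgradient descent on the hinge loss. The five-point spectra in `TOY_SPECTRA`, the labels, the mean-centering step, and all function names are hypothetical and chosen only for illustration. The bias term is omitted because the features are mean-centered.

```python
# Hypothetical toy "spectra": five absorbance readings per sample.
# Label -1 = lower concentration (peak at index 2),
# label +1 = higher concentration (peak shifted to index 3).
TOY_SPECTRA = [
    [0.1, 0.5, 0.9, 0.5, 0.1],
    [0.1, 0.6, 1.0, 0.4, 0.1],
    [0.2, 0.5, 0.8, 0.5, 0.1],
    [0.1, 0.2, 0.5, 0.9, 0.5],
    [0.1, 0.3, 0.6, 1.0, 0.4],
    [0.0, 0.2, 0.5, 0.8, 0.5],
]
TOY_LABELS = [-1, -1, -1, 1, 1, 1]


def center(spectra, mean=None):
    """Subtract the per-wavelength mean (computed from the data unless given)."""
    if mean is None:
        n = len(spectra)
        mean = [sum(column) / n for column in zip(*spectra)]
    centered = [[v - m for v, m in zip(s, mean)] for s in spectra]
    return centered, mean


def train_linear_svm(samples, labels, epochs=200, eta=0.1, lam=0.01):
    """Minimize lam/2 * |w|^2 + hinge loss with per-sample subgradient steps."""
    w = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Gradient step on the L2 regularizer (weight shrinkage).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if score >= 0.0 else -1


train_x, mean = center(TOY_SPECTRA)
weights = train_linear_svm(train_x, TOY_LABELS)

# Classify a new, hypothetical spectrum after centering it with the training mean.
new_x, _ = center([[0.1, 0.2, 0.5, 1.0, 0.5]], mean)
print(predict(weights, new_x[0]))
```

A new spectrum has to be centered with the mean of the training spectra, as in the last lines, so that it is expressed in the same coordinates as the data the weights were learned from.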
