Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Reproducibility Assessment of Research Code Repositories
(01. Izmir Institute of Technology, 2023) Akdeniz, Eyüp Kaan; Tekir, Selma
The growth in machine learning research has not been accompanied by a corresponding improvement in the reproducibility of its results. This thesis presents a novel, fully automated, end-to-end system that evaluates the reproducibility of machine learning studies based on the content of the associated GitHub project's Readme file. The evaluation relies on a Readme template derived from an analysis of popular repositories. The template suggests a structure that promotes reproducibility. Our system generates a reproducibility score for each Readme file assessed, and it employs two distinct models, one based on section classification and the other on hierarchical transformers. The experimental outcomes indicate that the system based on section similarity outperforms the hierarchical transformer model. Furthermore, it has an edge in explainability, as its scores can be traced directly to the respective sections of the Readme files. The proposed framework provides an important tool for improving the quality of code sharing and ultimately helps to increase reproducibility in machine learning research.
Machine Learning Based Resource Allocation for Massive MIMO Systems
(01. Izmir Institute of Technology, 2023) Sevgi, Hüseyin Can; Özbek, Berna
Cell-free massive MIMO (mMIMO) is a promising technology that uses access points (APs) deployed throughout the coverage area, instead of conventional cellular systems with a centralized base station (BS), to serve multiple users simultaneously. By exploiting the large number of antennas and adopting advanced signal processing techniques, cell-free massive MIMO can mitigate inter-user interference and enhance overall system performance. Optimal power allocation plays a crucial role in maximizing the spectral and energy efficiency of wireless networks. By intelligently allocating transmit power to different users, a balance between maximizing system throughput and minimizing total energy consumption can be achieved. In addition, user-centric clustering (UCC) is a key technique for improving the performance of cell-free massive MIMO systems. This technique aims to pair user equipments (UEs) with appropriate APs to facilitate efficient resource allocation and interference management. In this thesis, a cell-free mMIMO communication system is investigated through user-centric clustering and power allocation. The power allocation optimization problem is formulated to maximize the energy efficiency of cell-free mMIMO systems and is solved using an interior-point algorithm. A user-centric clustering algorithm is proposed that disables the non-master APs serving only one user. This additional feature aims to reduce the total power consumption of the system without sacrificing the advantages of cell-free mMIMO communication. Additionally, we propose a machine learning (ML) approach to reduce the computation time required for power allocation optimization. Through extensive simulations, we demonstrate the effectiveness of the proposed algorithms in achieving significant gains in spectral and energy efficiency in cell-free massive MIMO systems. The results highlight the importance of optimal power allocation and user-centric clustering in designing efficient cell-free mMIMO systems with a machine learning approach.
Recognition of Counterfactual Statements in Turkish
(01. Izmir Institute of Technology, 2023) Acar, Ali; Tekir, Selma
Counterfactual statements describe an event that did not happen or cannot happen and, optionally, the consequence of this event if it were to happen. Counterfactual statements are the building blocks of human thought processes, as people constantly reflect upon past happenings and consider their future implications. Counterfactual reasoning is essential for machine intelligence and explainable artificial intelligence studies. Detecting counterfactuals automatically with machine learning algorithms is crucial for these areas. This thesis presents the development of the first Turkish counterfactual detection dataset. It provides a comprehensive classification baseline and expands the scope of counterfactual detection to include the Turkish language.
Quasi-Supervised Strategies for Compound-Protein Interaction Prediction
(01. Izmir Institute of Technology, 2021) Çakı, Onur; Karaçalı, Bilge
In-silico prediction of compound-protein interaction using computational methods remains important in various pharmacology applications because wet-lab experiments are time-consuming, laborious and costly. Most machine learning methods proposed to that end approach this problem with supervised learning strategies in which known interactions are labeled as positive and the rest are labeled as negative. However, treating all unknown interactions as negative instances may lead to inaccuracies in practice, since some of the unknown interactions are bound to be positive interactions waiting to be identified as such. In this study, we propose to address this problem using the Quasi-Supervised Learning algorithm. In this framework, potential interactions are predicted by estimating the overlap between two datasets: a true positive dataset, which consists of compound-protein pairs with known interactions, and an unknown dataset, which consists of all the remaining compound-protein pairs. The potential interactions are then identified as those in the unknown dataset that overlap with the interacting pairs in the true positive dataset in terms of the associated similarity structure between interacting pairs. Experimental results on GPCR and Nuclear Receptor datasets show that the proposed method can identify actual interactions from all possible combinations.
Classification of Contradictory Opinions in Text Using Deep Learning Methods
(01. Izmir Institute of Technology, 2020) Oğul, İskender Ülgen; Tekir, Selma
The natural language inference (NLI) problem aims to ensure the consistency as well as the accuracy of propositions while making sense of natural language. NLI classifies the relationship between two given sentences as contradiction, entailment or neutral. To accomplish the classification task, sentences or words must be translated into mathematical representations called vectors or embeddings. The vectorization of a sentence is as important as the complexity of the classification model. In this study, both pre-trained (GloVe, fastText, Word2Vec) and contextual (BERT) word embedding methods were compared to obtain the best result. NLI, one of the natural language processing tasks, is highly complex. Conventional machine learning methods are insufficient for it, so more advanced solutions are required. This study used deep learning methods to perform the classification task. Unlike conventional machine learning approaches, deep learning approaches reduce errors and increase accuracy by iterating over the data many times. Opinion sentences have complex grammatical structures that are difficult to classify. This study used the Decomposable Attention and Enhanced LSTM models to perform the NLI classification task. Using the Enhanced LSTM method with BERT contextual vectors on the SNLI dataset, an accuracy of 88.0% was obtained, close to the state-of-the-art result of 92.1%. To show the usability of the developed solution in different NLI tasks, experiments were also performed on the MNLI dataset, where an accuracy of 80.02% was obtained.
Predictive Maintenance for Smart Industry
(01. Izmir Institute of Technology, 2020) Asadzade, Asad; Ayav, Tolga
With the rapid development of the Internet of Things (IoT), it has come to be used in many industrial areas. Thanks to IoT, data that affect the health of equipment or other important systems are collected. When these data are processed correctly, important information about the production process is obtained. For example, these data are used to build machine learning based systems that predict when various components will fail. Thus, maintenance operations are carried out before a component breaks down, and replacement operations are performed if necessary. This strategy, called predictive maintenance, provides industries with advantages such as maximizing the life of components, reducing extra costs, and saving time. In this study, we applied the ARF method, which is based on stream learning, to the Turbofan Engine Degradation Simulation datasets provided by NASA to estimate the remaining useful lifetime of jet engines. We discuss the advantages of stream learning over batch learning and compare our results with other batch learning based studies applied to the same datasets.
Estimation of Low Sucrose Concentrations and Classification of Bacteria Concentrations With Machine Learning on Spectroscopic Data
(Izmir Institute of Technology, 2019) Mezgil, Bahadır; Baştanlar, Yalın
Spectroscopy can be used to identify elements. Similarly, recent studies use optical spectroscopy to measure material concentrations in chemical solutions. In this study, we employ machine learning techniques on collected ultraviolet-visible (UV-Vis) spectra to estimate the level of sucrose concentration in solutions and to classify bacteria concentrations. Some metal nanoparticles are very sensitive to refractive index changes in the environment, which helps to detect small refractive index changes in the solution. In our study, gold nanoparticles are used, and we benefit from this property to estimate sucrose concentrations. The samples with different low sucrose concentrations are obtained by mixing sucrose, measured with precision scales, with pure water, and then the UV-Vis spectrum of each sample is measured. For the bacteria solutions, spectra for six different bacteria concentrations are captured. Spectra of the same solutions are also captured before adding the bacteria. For each of these solutions, four sets are prepared in which gold nanoparticles are not grown (minute 0) or are grown for 4 minutes, 10 minutes and 12 minutes. After the dataset preparation, these spectrum measurements are transferred into the MATLAB environment as a sucrose concentration dataset and a bacteria solution dataset. The necessary preprocessing steps are then performed to extract the most informative and distinguishing features from these datasets. Shallow Artificial Neural Networks (ANN) in the MATLAB Deep Learning Toolbox and Support Vector Machines (SVM) in the MATLAB Statistics and Machine Learning Toolbox are trained on both the raw measurement values and the processed spectrum measurements. The results of the machine learning experiments show that the success rate is promising for the estimation of sucrose concentrations and very high for the classification of bacteria concentrations in pure water solution.
A Learning-Based Demand Classification Service With Using XGBoost in Institutional Area
(Izmir Institute of Technology, 2019) Gürakın, Çağrı; Ayav, Tolga
This study aims to explain the development stages and methodology of a data classification service that has a text-based, adaptable programming interface. XGBoost, one of the successful classification algorithms, was preferred in the study. The dataset used in the study was obtained from the 'Digital Business Tracking Application' of an anonymized company. The dataset was tested with different classification algorithms, and a detailed performance evaluation was conducted. As a result, the highest accuracy rate was obtained with the 'Data Classification Service', which was developed using the XGBoost algorithm.
Digital Font Generation Using Long Short-Term Memory Networks
(Izmir Institute of Technology, 2019) Temizkan, Onur; Özuysal, Mustafa
Long Short-Term Memory (LSTM) networks are powerful models for solving sequential problems in machine learning. Apart from their use in sequence classification, LSTMs are also used for sequence prediction. The predictive features of LSTMs have been used extensively to generate handwriting, music and several other types of sequences. Configuring and training LSTM networks is more arduous than for non-sequential models, especially when the input data is complex. In this research, the aim is to train LSTM networks and their variations, and to use their generative features on a relatively obscure and complex type of sequence in machine learning: digital fonts. Controlled experiments have been performed to find the effects of different model parameters, input encodings and network architectures on learning font-based sequences. In this document, the procedure for creating a dataset from digital fonts is provided, training strategies are demonstrated and the generative results are discussed.
Detection and Localization of Motorway Overhead Directional Signs by Convolutional Neural Networks Trained With Synthetic Images
(Izmir Institute of Technology, 2019) Hekimgil, Hakan; Baştanlar, Yalın
Image classification, object detection and recognition have come a long way in the last decade. The competitions, starting with ImageNet, have shown that various improving implementations of Artificial Neural Networks are the best machine learning techniques at the time for such tasks. However, machine learning methods require much training data, and such data for image-related tasks come at a cost in terms of time and effort, if they can be obtained at all. When training data is scarce or not representative of the whole target set, synthetic data and data augmentation methods are used to increase the training data using what is already available. This thesis shows that when the target classification images have a structure, even a loose one, it is still possible to use machine learning methods, deep learning in this case, without any real data to begin with and still produce a good detection model. In this work, a Convolutional Neural Network model is trained to detect and localize informative motorway lane direction signs. Starting with no real samples of the target images, a large computer-generated training set is created to train the model. The resulting detector can detect the required sign types with high accuracy, localizing them with bounding boxes and categorizing them.
