Master Degree / Yüksek Lisans Tezleri
Permanent URI for this collection: https://hdl.handle.net/11147/3008
6 results
Search Results
Master Thesis
Estimation of Low Sucrose Concentrations and Classification of Bacteria Concentrations With Machine Learning on Spectroscopic Data
(Izmir Institute of Technology, 2019) Mezgil, Bahadır; Baştanlar, Yalın
Spectroscopy can be used to identify elements. Similarly, recent studies use optical spectroscopy to measure material concentrations in chemical solutions. In this study, we apply machine learning techniques to collected ultraviolet-visible (UV-Vis) spectra to estimate the level of sucrose concentration in solutions and to classify bacteria concentrations. Some metal nanoparticles are very sensitive to refractive index changes in their environment, which helps to detect small refractive index changes in the solution. In our study, gold nanoparticles are used and we exploit this property to estimate sucrose concentrations. Samples at different low sucrose concentrations are prepared by mixing sucrose, weighed on precision scales, with pure water; the UV-Vis spectrum of each sample is then measured. For the bacteria solutions, spectra for six different bacteria concentrations are captured. Spectra of the same solutions are also captured before adding the bacteria. For each of these solutions, four sets are prepared: one in which gold nanoparticles are not grown (minute 0) and three in which they are grown for 4, 10 and 12 minutes. After dataset preparation, the spectrum measurements are imported into MATLAB as a sucrose concentration dataset and a bacteria solution dataset. The necessary preprocessing steps are then performed to extract the most informative and distinguishing information from these datasets. Shallow Artificial Neural Networks (ANN), using the MATLAB Deep Learning Toolbox, and Support Vector Machines (SVM), using the MATLAB Statistics and Machine Learning Toolbox, are trained on both the raw measurement values and the processed spectrum measurements.
The results of the machine learning experiments show a promising success rate for the estimation of sucrose concentrations and a very high success rate for the classification of bacteria concentrations in pure water solution.

Master Thesis
A Learning-Based Demand Classification Service With Using Xgboost in Institutional Area
(Izmir Institute of Technology, 2019) Gürakın, Çağrı; Ayav, Tolga
This study aims to explain the development stages and methodology of a data classification service that has a text-based, adaptable programming interface. XGBoost, one of the successful classification algorithms, was chosen for the study. The dataset used in the study was obtained from the 'Digital Business Tracking Application' of an anonymized company. The dataset was tested with different classification algorithms and a detailed performance evaluation was conducted. The highest accuracy was obtained with the 'Data Classification Service', which was developed using the XGBoost algorithm.

Master Thesis
Digital font generation using long short-term memory networks
(Izmir Institute of Technology, 2019) Temizkan, Onur; Özuysal, Mustafa
Long Short-Term Memory (LSTM) networks are powerful models for solving sequential problems in machine learning. Apart from sequence classification, LSTMs are also used for sequence prediction. Their predictive capabilities have been used extensively to generate handwriting, music and several other types of sequences. Configuring and training LSTM networks is more arduous than for non-sequential models, especially when the input data is complex. The aim of this research is to train LSTM networks and their variants and to use their generative capabilities on a type of sequence that is relatively obscure and complex in machine learning: digital fonts. Controlled experiments were performed to find the effects of different model parameters, input encodings and network architectures on learning font-based sequences.
In summary, this document presents the procedure for creating a dataset from digital fonts, demonstrates training strategies and discusses the generative results.

Master Thesis
Detection and Localization of Motorway Overhead Directional Signs by Convolutional Neural Networks Trained With Synthetic Images
(Izmir Institute of Technology, 2019) Hekimgil, Hakan; Baştanlar, Yalın
Image classification, object detection and recognition have come a long way in the last decade. The competitions, starting with ImageNet, have shown that successively improved implementations of Artificial Neural Networks are currently the best machine learning techniques for such tasks. However, machine learning methods require large amounts of training data, and such data for image-related tasks comes at a cost in time and effort, if it can be obtained at all. When training data is scarce or not representative of the whole target set, synthetic data and data augmentation methods are used to enlarge the training set from what is already available. This thesis shows that when the target images have a structure, even a loose one, it is possible to use machine learning methods, deep learning in this case, without any real data to begin with and still produce a good detection model. In this work, a Convolutional Neural Network model is trained to detect and localize informative motorway lane direction signs. Starting with no real samples of the target images, a large computer-generated training set is created to train the model. The resulting detector detects the required sign types with high accuracy, localizes them with bounding boxes and categorizes them.

Master Thesis
Tag-Based Dynamic Ranking System for Organization Related News
(Izmir Institute of Technology, 2018) Özkan, Mustafa Tunahan; Tuğlular, Tuğkan
In information systems, tags are keywords or terms that represent a piece of information.
They describe an item and help it to be found again through searching or browsing. Tags have gained popularity due to the growth of social sharing, social bookmarking, organization network and social network websites. In addition, tags are used to express prominent events and noticeable topics in the news. In this thesis, we propose a tag-based statistical learning approach to predict the shareability of news in an organization network. We represent features with tags using different methods and apply several classifiers to predict the shareability of news. We model this as a binary classification problem, where shareable news items form the positive class and non-shareable news items form the negative class. The experimental results indicate that there is no single best classifier for shareability prediction of organization-related news, but that a suitable classifier can be chosen depending on the dataset and the feature representation.

Master Thesis
Cost and Benefit Analysis of Features Used in Machine Learning Based Pre-Mirna Detection
(Izmir Institute of Technology, 2016) Suluyayla, Rabia; Allmaer, Jens
MicroRNAs (miRNAs) are short RNA molecules that play important roles in the post-transcriptional regulation of gene expression. Their transcription is followed by two RNase III endonuclease processing steps leading to mature miRNA formation. They are then incorporated into the RISC complex, which mediates mRNA targeting. Experimental miRNA detection is difficult since it depends on many factors; therefore, computational methods have become indispensable. Machine learning methods rely on features describing precursor miRNAs (pre-miRNAs) to differentiate them from other hairpins in a genome. To facilitate accurate miRNA detection, it is important to define feature groups that are informative, not highly correlated, and do not incur a large computational cost.
In this study, the computational cost and benefit of more than 800 pre-miRNA features were analyzed. From these analyses, five features (assl, lsr(%bp), lscm, asal and hpmfe rf I3), four structural and one structural-thermodynamic, stand out as informative, uncorrelated and computationally inexpensive. Analyses were performed on human hairpins and pseudo data, together with a case study using the measles virus and the measles KEGG pathway genes. The overall calculation for human hairpins and the measles virus cost approximately 2 USD (United States Dollar) on Amazon Web Services. Supervised learning with random forests was applied for miRNA prediction to two genes (TAB2 and BCC3) within the measles KEGG pathway, and three hairpins were predicted. They were found to have human mature miRNA sequences embedded in them, and their already annotated targets helped enlarge the KEGG measles pathway.
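
The three criteria named in the last abstract (informative, not highly correlated, computationally cheap) can be illustrated with a small greedy selection sketch. This is not the method used in the thesis, which the abstract does not describe in detail. The feature names, benefit scores, costs and the 0.8 correlation threshold below are hypothetical values chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(columns, benefit, cost, max_corr=0.8, k=5):
    """Greedy sketch: rank features by benefit per unit cost, then keep a
    feature only if it is not strongly correlated with one already chosen.

    columns: feature name -> list of values (one per sample)
    benefit: feature name -> informativeness score (hypothetical)
    cost:    feature name -> positive computation cost (hypothetical)
    """
    ranked = sorted(columns, key=lambda f: benefit[f] / cost[f], reverse=True)
    chosen = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[c])) <= max_corr for c in chosen):
            chosen.append(name)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical toy data: f_b is nearly a copy of f_a, f_c is cheap to skip but independent.
columns = {
    "f_a": [1, 2, 3, 4, 5],
    "f_b": [2, 4, 6, 8, 10],
    "f_c": [5, 1, 4, 2, 3],
}
benefit = {"f_a": 0.9, "f_b": 0.85, "f_c": 0.6}
cost = {"f_a": 1.0, "f_b": 1.0, "f_c": 2.0}

print(select_features(columns, benefit, cost))
```

In this toy run, f_b is dropped because it is perfectly correlated with the already selected f_a, while f_c is kept despite its lower benefit-to-cost ratio because it adds independent information.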
