Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Multi-Frame Super-Resolution Without Priors
(01. Izmir Institute of Technology, 2023) Gülmez, Veli; Özuysal, Mustafa
Super-resolution methods fall into two main types: traditional methods and deep learning methods. Traditional methods define closed-form expressions under simplifying assumptions, while deep learning methods rely on priors learned from data sets. Both have disadvantages: the former can be too simple, and the latter place strong trust in priors. We focus on how to generate a high-resolution image from low-resolution images without priors by utilizing spatial hash encoding. We propose a grid-based super-resolution model that uses spatial hash encoding to map coordinate information into a higher-dimensional space. Our aim is to eliminate long training times and to avoid relying on priors from data sets that cannot cover all real-world scenarios. Our proposed model is therefore able to perform task-specific super-resolution without priors and to eliminate potential hallucination effects caused by wrong priors.
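The spatial hash encoding mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis implementation: the hashing primes, the table layout, the bilinear interpolation, and the function names `hash_index`, `encode`, and `encode_multires` are all assumptions chosen for illustration. The idea shown is that a continuous 2D coordinate is mapped to a feature vector by hashing the four surrounding grid vertices into a feature table and interpolating between them.

```python
import math

# Hashing constants: an assumption for this sketch, not taken from the thesis.
PRIMES = (1, 2654435761)


def hash_index(ix, iy, table_size):
    """Map an integer 2D grid vertex to a slot in the feature table."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1])) % table_size


def encode(x, y, table, resolution):
    """Bilinearly interpolate hashed features at a point (x, y) in [0, 1]^2."""
    gx, gy = x * resolution, y * resolution
    x0, y0 = math.floor(gx), math.floor(gy)
    tx, ty = gx - x0, gy - y0
    dim = len(table[0])
    out = [0.0] * dim
    # Visit the four corners of the grid cell that contains the point.
    for dx, wx in ((0, 1.0 - tx), (1, tx)):
        for dy, wy in ((0, 1.0 - ty), (1, ty)):
            feat = table[hash_index(x0 + dx, y0 + dy, len(table))]
            for k in range(dim):
                out[k] += wx * wy * feat[k]
    return out


def encode_multires(x, y, tables, resolutions):
    """Concatenate the encodings from several grid resolutions."""
    out = []
    for table, res in zip(tables, resolutions):
        out.extend(encode(x, y, table, res))
    return out
```

In a full model the table entries would be trainable parameters optimized against the low-resolution input frames, and the concatenated encoding would feed a small network that predicts pixel values; both of those steps are outside this sketch.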
Deep Learning Based Real-Time Sequential Facial Expression Analysis Using Geometric Features
(01. Izmir Institute of Technology, 2023) Köksal, Talha Enes; Gümüş, Abdurrahman
In this thesis, macro and micro facial expression sequences from various datasets are used to train neural networks that classify each sequence as one of the basic emotions. In the macro expression experiments, facial landmarks are extracted from each frame of the sequences using the MediaPipe FaceMesh solution, and geometric features that combine spatial and temporal information are computed from these landmarks. To classify the features, ConvLSTM2D layers followed by multilayer perceptron blocks are used. In order to achieve real-time classification performance, all algorithms are implemented to run on the GPU. The proposed method for macro expressions is tested with the CK+, Oulu-CASIA VIS, Oulu-CASIA NIR and MMI datasets. In the micro expression experiments, blendshape features provided by MediaPipe are used in addition to the geometric features. In order to improve classification performance, the Phase-Based Video Motion Processing technique is used to magnify the subtle facial movements of micro expressions. Experiments are conducted separately on the same classification layers, which consist of ConvLSTM1D layers followed by multilayer perceptron blocks. The proposed method for micro expressions is tested with the SAMM and CASME II datasets. The datasets utilized in this study were accessed upon signing the corresponding license agreements. Each dataset is specifically designated for academic purposes and is made available under these agreements. Only data from subjects who consented to the use of their information in publications was included in the thesis. The license agreements for each dataset can be found in the appendices.
Touch Gestures Classification by Deep Learning Methods
(Izmir Institute of Technology, 2022) Ege, Irmak; Altun, Kerem
In this study, we carried out social touch gesture classification on two publicly available datasets, Corpus of Social Touch (CoST) and Human-Animal Affective Robot Touch (HAART), and on our own demo dataset. To classify the touch gesture datasets, four different models are proposed: a 3-dimensional convolutional neural network (3D-CNN), a 3-dimensional convolutional long short-term memory neural network (3D-CNN-LSTM), a 3-dimensional convolutional bidirectional long short-term memory neural network (3D-CNN-BiLSTM), and a 3-dimensional convolutional transformer network (3D-CNN-Transformer). The fundamental layer of the proposed deep neural network architectures is the 3-dimensional convolution layer, which extracts the spatio-temporal features of touch gestures. Building on these spatio-temporal features, the generalization performance of the four proposed models has been improved using ensemble learning and data augmentation by random shifts and rotations. We also found that the Stochastic Gradient Descent (SGD) optimization algorithm has better generalization performance than Adaptive Moment Estimation (ADAM), which is used more frequently in deep learning. The classification accuracy on the three datasets is investigated for each proposed model. The results show that the proposed methods, especially the ensemble classifier and the ensemble classifier with data augmentation, are beneficial for obtaining more generalizable learning algorithms. The scripts of the deep neural network architectures are available upon request.
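The two generalization techniques named in the abstract above, shift augmentation and ensemble learning, can be sketched as follows. The abstract does not state how the ensemble combines its members or how shifted frames are padded, so averaging class probabilities (soft voting) and zero padding are assumptions here, as are the function names `shift_frame` and `ensemble_predict`.

```python
def shift_frame(frame, dr, dc):
    """Shift a 2D pressure frame by (dr, dc) cells, padding with zeros.

    Zero padding is an assumption; the abstract only mentions random shifts.
    """
    rows, cols = len(frame), len(frame[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = r - dr, c - dc  # source cell that lands on (r, c)
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = frame[sr][sc]
    return out


def ensemble_predict(prob_lists):
    """Average per-class probabilities from several models (soft voting).

    Returns the winning class index and the averaged probabilities.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg
```

For augmentation, the same randomly drawn `(dr, dc)` would typically be applied to every frame of a gesture sequence so that the temporal structure is preserved.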
Classification of Contradictory Opinions in Text Using Deep Learning Methods
(01. Izmir Institute of Technology, 2020) Oğul, İskender Ülgen; Tekir, Selma
The natural language inference (NLI) problem aims to ensure the consistency as well as the accuracy of propositions while making sense of natural language. NLI classifies the relationship between two given sentences as contradiction, entailment or neutrality. To accomplish the classification task, sentences or words must be translated into mathematical representations called vectors or embeddings. The vectorization of a sentence is as important as the complexity of the classification model. In this study, both pre-trained (GloVe, fastText, Word2Vec) and contextual (BERT) word embedding methods were compared to obtain the best result. NLI, one of the natural language processing tasks, is highly complex. Conventional machine learning methods are insufficient for it, so more advanced solutions are required. This study used deep learning methods to perform the classification task. Unlike conventional machine learning approaches, deep learning approaches reduce errors while increasing accuracy by iterating over the data many times. Opinion sentences have complex grammatical structures that are difficult to classify. This study used Decomposable Attention and Enhanced LSTM for natural language inference to perform the NLI classification task. Using the Enhanced LSTM method with BERT contextual vectors on the SNLI dataset, an accuracy of 88.0% was obtained, which is close to the state-of-the-art result of 92.1%. To show the usability of the developed solution in different NLI tasks, the method was also evaluated on the MNLI data set, where an accuracy of 80.02% was obtained.
Synthetic Generation of Fingerprints
(Izmir Institute of Technology, 2020) İrtem, Emre; Erdoğmuş, Nesli
Fingerprints are unique to each person and have been widely used and accepted by society for identification purposes. Fingerprints can be captured by using ink and paper to get a print and then digitizing it, or more recently by using specialized sensors. In both cases, supervision by a trained specialist is usually needed. Moreover, since fingerprints are personal information, they are protected by the laws on personal data protection. Therefore, collecting or sharing real fingerprints is difficult, and it is illegal without the consent of their owner. On the other hand, deep learning systems, which have proven very successful in many machine learning tasks, usually depend on very large training sets to achieve high accuracy. In this study, to overcome the data hunger problem in training deep neural networks, synthetic fingerprints are generated by using model-based methods. For this purpose, master fingerprint images are generated first, and then many impressions are derived from them by applying real-world degradations. The realism and the usability of the synthetic fingerprints are validated using a fingerprint classification system, for which deep neural networks are trained with and without the synthetically generated data. The experiments show that the generated fingerprint images are realistic enough to positively affect the classification results and that using synthetically generated fingerprints in training deep systems is promising.
Vehicle Type Classification With Deep Learning
(Izmir Institute of Technology, 2020) Yaraş, Neriman; Özuysal, Mustafa
In this thesis, we studied the vehicle type classification problem from several perspectives. We apply a deep learning technique with different parameters, such as image size and the number of images in the data sets, to classify an image as one of nine vehicle types. After choosing the most appropriate of the trained models, we convert the problem into a hierarchical tree classification problem so that it can be analyzed in three different tree hierarchies. Experiments are performed using three computational methods for calculating the probabilities of each of the nine classes, which correspond to the leaves of the hierarchical trees. These studies lead to the conclusion that an average accuracy of 0.762812 is obtained when the traditional arithmetic mean computation is applied to the level-2 hierarchical tree, using the Stanford Dataset with an image size of 224 on the ResNet34 architecture.
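One way to read the arithmetic mean computation on a two-level tree is sketched below. The abstract does not give the formula, so this is an assumption: each leaf class is scored by the arithmetic mean of its own probability and the probability of its parent group. The function names and the vehicle labels in the usage note are hypothetical.

```python
def leaf_scores(group_probs, leaf_probs, parent_of):
    """Score each leaf as the mean of its probability and its parent's.

    group_probs: probability of each internal (group) node.
    leaf_probs:  probability of each leaf class.
    parent_of:   maps each leaf class to its group node.
    """
    return {
        leaf: (p + group_probs[parent_of[leaf]]) / 2.0
        for leaf, p in leaf_probs.items()
    }


def predict(group_probs, leaf_probs, parent_of):
    """Return the leaf class with the highest hierarchical score."""
    scores = leaf_scores(group_probs, leaf_probs, parent_of)
    return max(scores, key=scores.get)
```

With hypothetical groups `{"large": 0.7, "small": 0.3}` and leaves `{"bus": 0.2, "truck": 0.3, "sedan": 0.5}`, where `bus` and `truck` belong to `large`, the flat prediction would be `sedan`, while the hierarchical score favors `truck` because its parent group is more probable.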
Container Damage Detection and Classification Using Container Images
(Izmir Institute of Technology, 2019) İmamoğlu, Zeynep; Tuğlular, Tuğkan; Baştanlar, Yalın
In the logistics sector, digital transformation is of great importance for competitiveness. At present, container warehouse entry and exit operations, including container damage detection, are carried out manually by the logistics personnel. During the entry and exit process, damaged containers are detected by the personnel, and several minutes are required to upload the result to the system. The aim of this thesis is to automate the detection of damaged containers. In this way, the mistakes made by the personnel at this stage will be eliminated and the process will be accelerated. We propose a machine learning method that uses container images to estimate whether a container is damaged or undamaged. We modeled the problem as a binary classification problem, which labels a container as damaged or undamaged. The results show that there is no single best method for visual classification. The thesis shows how the dataset was created and how the parameters of the layered structure affect the creation of the most suitable model for this study.
