Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Search Results

Now showing 1 - 10 of 182
  • Master Thesis
    Estrus Detection in Cows With Deep Learning Techniques
    (01. Izmir Institute of Technology, 2024) Arıkan, İbrahim; Ayav, Tolga; Soygazi, Fatih
    Accurately predicting the estrus period is essential for enhancing the efficiency and lowering the costs of artificial insemination in livestock, a crucial sector for global food production. Precisely identifying the estrus period is critical to avoid economic losses such as decreased milk production, delayed calf births, and loss of eligibility for government subsidies. The most telling movement to detect during the estrus period is mounting; because manual detection of this movement is difficult and costly, automated methods are needed, and deep learning-based methods can be applied to detect the mounting moment. The proposed method detects the estrus period using deep learning and Explainable Artificial Intelligence (XAI) techniques. Deep learning-based mounting detection is performed using CNN, ResNet, VGG-19, and YOLO-v5 models; the ResNet model in this study detects mounting movement with 99% accuracy. To open up the black box behind the proposed models, two XAI techniques, Grad-CAM and Gradient Inputs, are used to describe the features that aid decision-making in detecting mounting motion. The explanations reveal that the developed deep learning models focus on the udder and back area of the cows during the decision-making phase. In addition, how successfully Grad-CAM and Gradient Inputs perform the explanation process itself was measured by calculating the 'faithfulness', 'maximum sensitivity', and 'complexity' metrics.
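The faithfulness metric mentioned in the abstract is commonly computed as the correlation between attribution scores and the drop in model output when the attributed regions are removed. A minimal pure-Python sketch of that idea, using hypothetical saliency scores rather than the thesis's data or code:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def faithfulness(attributions, output_drops):
    """Faithfulness: how well attribution scores track the drop in the
    model's output when the attributed features are occluded.  A score
    near 1 means the explanation identifies truly influential regions."""
    return pearson(attributions, output_drops)

# Hypothetical saliency scores for four image regions, and the measured
# drop in model confidence when each region is occluded.
attr = [0.9, 0.1, 0.6, 0.2]
drop = [0.8, 0.05, 0.5, 0.15]
score = faithfulness(attr, drop)
```

Here a high score would indicate that the XAI method's highlighted regions (e.g. udder and back) genuinely drive the model's mounting decision.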
  • Master Thesis
    Combining Persona and Argument in Dialogue
    (2024) Güzel, Şükrü; Tekir, Selma
    Personalized dialogue systems have gained momentum as people's desire for human-like interaction grows. This thesis aims to increase persona-consistent responses in personalized dialogue systems. A data augmentation method was used to enhance the persona consistency of dialogue systems; the technique exploits the few-shot learning capabilities of Large Language Models to add counterfactual sentences to the dialogue. GPT 3.5 and Llama 2 models were used to generate counterfactual sentences via few-shot prompting. The augmentation method was applied to every dialogue in the PersonaChat dataset that did not originally contain a counterfactual sentence. Evaluation with a state-of-the-art personalized dialogue generation model showed that the dataset augmented with GPT 3.5 yielded better persona-consistency metrics.
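Few-shot prompting for counterfactual generation, as described above, amounts to assembling a prompt from demonstration pairs before the target persona sentence. A sketch of the general prompt shape; the example pairs and wording are assumptions, not the thesis's actual prompts:

```python
# Hypothetical demonstration pairs (persona sentence, contradicting sentence);
# the thesis sent prompts of this general shape to GPT 3.5 / Llama 2.
FEW_SHOT = [
    ("I love hiking in the mountains.",
     "I have never enjoyed outdoor activities."),
    ("I am a vegetarian.",
     "I eat steak every single day."),
]

def build_counterfactual_prompt(persona_sentence):
    """Assemble a few-shot prompt asking an LLM to produce a sentence
    that contradicts the given persona sentence."""
    lines = ["Write a sentence that contradicts the persona sentence.", ""]
    for persona, counter in FEW_SHOT:
        lines.append(f"Persona: {persona}")
        lines.append(f"Counterfactual: {counter}")
        lines.append("")
    lines.append(f"Persona: {persona_sentence}")
    lines.append("Counterfactual:")
    return "\n".join(lines)

prompt = build_counterfactual_prompt("I work as a night-shift nurse.")
```

The model's completion after the final "Counterfactual:" line would then be appended to the dialogue as the augmentation sentence.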
  • Master Thesis
    Predicting Software Size From Requirements Written in Natural Language: A Generative AI Approach
    (01. Izmir Institute of Technology, 2024) Kennouche, Dhia Eddine; Demirörs, Onur
    In project management, software size measurement represents a critical process aimed at quantifying a project's scope. This quantification is pursued independently of the specific technologies or technical decisions adopted during the project's development phase. Among the various methodologies employed for this purpose, the COSMIC Functional Size Measurement (FSM) and Event Points are used to facilitate such assessments. These methodologies offer a standardized approach for measuring software size, yet they inherently demand a considerable amount of manual effort. Furthermore, they require the manual extraction of Objects of Interest and Event Names, adding to the labor-intensive nature of the process. In response to these challenges, this thesis implements a suite of Artificial Intelligence (AI)-based methods that transform the measurement process. These approaches encompass a Regression Model that predicts software sizes with remarkable accuracy, a Summarization Model that automates the extraction of Event Names, and a fine-tuned Large Language Model (LLM) that generates Objects of Interest with significant precision. The adoption of these AI-driven techniques has proven highly successful, substantially minimizing the manual effort traditionally required in software size measurement and thereby greatly enhancing both the efficiency and the reliability of estimation practices. Together, these AI-based methodologies represent a significant advancement in software size measurement, offering a more streamlined and efficient approach. By reducing the reliance on manual processes, these methods not only enhance the accuracy and reliability of measurements but also contribute to a more agile project management environment.
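COSMIC FSM, referenced above, assigns one COSMIC Function Point (CFP) to each identified data movement of type Entry, Exit, Read, or Write. A toy sketch of the counting step, with a hand-extracted movement list for a hypothetical "create order" process (the movement names are illustrative, not from the thesis):

```python
from collections import Counter

# COSMIC assigns one CFP to each data movement:
# Entry (E), Exit (X), Read (R), Write (W).
movements = [
    ("order data", "Entry"),            # user submits the order
    ("product price", "Read"),          # price looked up from storage
    ("order record", "Write"),          # order persisted
    ("confirmation message", "Exit"),   # result shown to the user
]

def cosmic_size(data_movements):
    """Functional size in CFP: one point per identified data movement."""
    return len(data_movements)

def movement_breakdown(data_movements):
    """Count movements per type, useful for sanity-checking a measurement."""
    return Counter(kind for _, kind in data_movements)

size = cosmic_size(movements)        # 4 CFP for this process
kinds = movement_breakdown(movements)
```

The thesis's contribution is automating the manual step above: extracting the Objects of Interest and Event Names from natural-language requirements with AI models.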
  • Master Thesis
    Analysis of Test Smell Impact on Test Code Quality
    (01. Izmir Institute of Technology, 2024) Cebeci, İsmail; Tuğlular, Tuğkan
    Test smells are patterns in test code that, while not necessarily wrong, suggest poor design choices that can hinder the maintainability and effectiveness of the test code. In software development, test smells derive from the concept of code smells, which point to deeper problems in programming; similarly, test smells indicate issues in automated test scripts that can jeopardize the reliability and clarity of the software testing process. In this thesis, 500 projects accessed via GitHub were examined using the two best-known tools (JNose and TestSmellDetector). Care was taken that all 500 projects used the Java language. All test files found in the examined projects were used as input to both tools. By comparing the tools' outputs, the total number of test smells found, which tool detects which test smells better, which test smells most affect test files, the relationships between test smells, and their co-occurrence frequencies were investigated. As a result, 'Assertion Roulette', 'Magic Number Test', and 'Lazy Test' emerged as the most common test smells for both tools. In addition, using the JNose tool, the highest co-occurrence rates were observed between 'Conditional Test Logic' and 'Eager Test', and between 'Exception Catching Throwing' and 'Unknown Test'. On the other hand, when the TestSmellDetector tool was used, the highest co-occurrence rates were observed between 'Unknown Test' and 'Eager Test', and between 'Resource Optimism' and 'Mystery Guest'. Using these results, the kind of refactoring work needed on test files can easily be determined.
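To make the 'Assertion Roulette' smell concrete: it occurs when a test method holds several assertions without explanation messages, so a failure cannot be traced to a specific check. A simplified heuristic detector in the spirit of tools like JNose and TestSmellDetector (not their actual implementation):

```python
import re

# Matches JUnit-style assertion calls; group(1) captures a leading string
# literal, which JUnit treats as the failure message.
ASSERT_CALL = re.compile(r'\bassert\w+\s*\(\s*(")?')

def has_assertion_roulette(java_method_body):
    """Flag a method with two or more assertions whose first argument is
    not a string literal (i.e. no failure message)."""
    messageless = sum(
        1 for m in ASSERT_CALL.finditer(java_method_body) if m.group(1) is None
    )
    return messageless >= 2

smelly = """
    assertEquals(4, cart.size());
    assertTrue(cart.contains(item));
"""
clean = """
    assertEquals("cart should hold four items", 4, cart.size());
"""
```

Real detectors parse the Java AST rather than using regexes, but the decision rule is of this shape.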
  • Master Thesis
    Modeling Microservice Based Applications: Model Lives Inside Code Approach
    (01. Izmir Institute of Technology, 2024) Ersoy, Eyüp Fatih; Demirörs, Onur
    In today's software development, maintaining consistent documentation is crucial for sharing and preserving team knowledge. As projects grow more complex, developers need to quickly understand and maintain code. However, keeping documentation aligned with business logic without unnecessary technical details is challenging. Traditional visualization tools, such as UML sequence and activity diagrams, focus on object-oriented approaches and often require manual updates, making them less suitable for event-based systems like microservices. To address these issues, the tool Docupyt was developed using eEPC (extended Event-driven Process Chains) as the main modeling approach. Docupyt is designed with three key principles: ease of use, simplicity (including only necessary logic), and reactivity (representing event-based systems). eEPC notation helps analyze problems and represent changing logic during development, accommodating fast-changing requirements. It supports both high- and low-level process definitions and focuses on business logic without extraneous technical details. Generated directly from code through simple commenting, this approach simplifies updating documentation as the code changes, reducing maintenance costs. Using the design science research method, Docupyt was validated in a case study, demonstrating that it is user-friendly and provides adequate detail without being overly technical. Its main advantage is keeping documentation in sync with code logic, easing updates.
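The "generated directly from code through simple commenting" step can be sketched as scanning annotated comments and emitting the event/function sequence an eEPC renderer would draw. The `@event:`/`@function:` markers below are assumptions; the abstract does not show Docupyt's real annotation syntax:

```python
import re

# Hypothetical comment markers standing in for Docupyt's annotation syntax.
MARKER = re.compile(r"#\s*@(event|function):\s*(.+)")

def extract_epc_chain(source):
    """Scan inline comments and return the alternating event/function
    sequence that an eEPC diagram generator could render."""
    chain = []
    for line in source.splitlines():
        m = MARKER.search(line)
        if m:
            chain.append((m.group(1), m.group(2).strip()))
    return chain

code = """
# @event: order received
def handle_order(msg):
    # @function: validate payload
    validate(msg)
    # @event: order validated
    # @function: persist order
    save(msg)
"""
chain = extract_epc_chain(code)
```

Because the model lives next to the code it describes, editing the comment in the same commit as the logic keeps the diagram in sync, which is the approach's core claim.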
  • Master Thesis
    Learning Citation-Aware Representations for Scientific Papers
    (01. Izmir Institute of Technology, 2024) Çelik, Ege Yiğit; Tekir, Selma
    In the field of Natural Language Processing (NLP), the tasks of understanding and generating scientific documents are highly challenging and have been extensively studied. Comprehending scientific papers can facilitate the generation of their contents. Similarly, understanding the relationships between scientific papers and their citations can be instrumental in generating and predicting citations within the text of scientific works. Moreover, language models equipped with citation-aware representations can be particularly robust for downstream tasks involving scientific literature. This thesis aims to enhance the accuracy of citation predictions within scientific texts. To achieve this, we hide citations within the context of scientific papers using mask tokens and subsequently pre-train the RoBERTa-base language model to predict citations for these masked tokens. We ensure that each citation is treated as a single token to be predicted by the mask-filling language model. Consequently, our models function as language models with citation-aware representations. Furthermore, we propose two alternative techniques for our approach. Our base technique predicts citations using only the contexts from scientific papers, while our global technique incorporates the titles and abstracts of papers alongside the contexts to improve performance. Experimental results demonstrate that our models significantly surpass the state-of-the-art results on two out of four benchmark datasets. However, for the remaining two datasets, our models yield suboptimal results, indicating potential for further improvement. Additionally, we conducted experiments on sampled datasets to examine the effects of inherent factors on the datasets and to identify correlations between these factors and our results.
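The masking step described above, where each citation is hidden behind a single mask token, can be sketched with a regex pass over the context. Bracketed markers like "[12]" stand in for the real citation format, which the abstract does not specify:

```python
import re

# Assumed citation format: bracketed numeric markers such as "[12]".
CITATION = re.compile(r"\[\d+\]")
MASK = "<mask>"  # RoBERTa-style mask token

def mask_citations(context):
    """Replace every citation with a single mask token and return the
    masked text together with the hidden citation labels, so a
    mask-filling model can be trained to predict each citation."""
    labels = CITATION.findall(context)
    masked = CITATION.sub(MASK, context)
    return masked, labels

text = "Transformers [3] improve on RNN encoders [7] for long contexts."
masked, labels = mask_citations(text)
```

In the thesis's global variant, the titles and abstracts of candidate papers would be concatenated to this masked context before pre-training.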
  • Master Thesis
    Transformers Using Local Attention Mappings for Long Text Document Classification
    (2023) Haman, Bekir Ufuk; Tekir, Selma
    Transformer models are powerful and flexible encoder-decoder structures that have proven their success in many fields, including natural language processing. Although they are especially successful in working with textual input, classifying texts, answering questions, and producing text, they have difficulty processing long texts. Current leading transformer models such as BERT limit input lengths to 512 tokens. The most prominent reason for this limitation is that the self-attention operation, which forms the backbone of the transformer structure, requires high processing power. This processing power requirement, which increases quadratically with the input length, makes it impractical for transformers to process long texts. However, new transformer structures that use various local attention mapping methods have been proposed to overcome the text length challenge. This study first proposes two alternative local attention mapping methods to make transformer models capable of processing long texts. In addition, it presents the 'Refined Patents' dataset consisting of 200,000 patent documents, specifically prepared for the long text document classification task. The proposed attention mapping methods, based on Term Frequency-Inverse Document Frequency (TF-IDF) and Pointwise Mutual Information (PMI), create a sparse version of the self-attention matrix from the occurrence statistics of words and word pairs. These methods were implemented on top of the Longformer and Big Bird models and tested on the Refined Patents dataset. Test results show that both proposed approaches are acceptable local attention mapping alternatives and can be used to enable long text processing in transformers.
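The PMI variant can be illustrated by building a boolean attention mask that lets two positions attend to each other only when their word types co-occur more often than chance. A simplified single-document sketch, not the thesis implementation (which builds on Longformer and Big Bird):

```python
import math
from collections import Counter

def pmi_attention_mask(tokens, threshold=0.0, window=2):
    """Sparse self-attention mask: position pairs may attend only when
    the PMI of their word types, estimated from co-occurrence within a
    small window, exceeds the threshold.  The diagonal is always kept."""
    n = len(tokens)
    unigrams = Counter(tokens)
    pairs = Counter()
    for i in range(n):
        for j in range(i + 1, min(i + window + 1, n)):
            pairs[frozenset((tokens[i], tokens[j]))] += 1
    total = sum(pairs.values())

    def pmi(a, b):
        joint = pairs[frozenset((a, b))]
        if joint == 0:
            return float("-inf")  # never co-occur: no attention edge
        pa, pb = unigrams[a] / n, unigrams[b] / n
        return math.log((joint / total) / (pa * pb))

    return [[i == j or pmi(tokens[i], tokens[j]) > threshold
             for j in range(n)] for i in range(n)]

tokens = ["deep", "learning", "deep", "learning", "model"]
mask = pmi_attention_mask(tokens)
```

Positions whose words form strong collocations (here "deep"/"learning") keep their attention edges, while weakly associated pairs are pruned, yielding the sparse matrix the study describes.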
  • Master Thesis
    Testing Microservice Applications
    (2023) Öztürk, Özgür; Ayav, Tolga; Demirörs, Onur
    This thesis contributes to the testing processes of microservice architecture. Microservices provide a scalable, reliable, cloud-based environment that is frequently preferred in today's technology applications, consisting of small, loosely coupled, isolated applications that work in harmony. In this study, a microservice application is modeled using timed automata, and model checker-based testing methods are exploited to generate test cases automatically. To this end, the UPPAAL model checker is utilized. The model of the microservice application is mutated with respect to a set of fault hypotheses, and these mutant models are verified against certain properties defined by system or application specifications. The counterexamples returned by the model checker are used to constitute the test cases. The entire process is automated and experimentally run for an example application. The generated test cases are also shown to detect errors efficiently. The proposed testing methodology offers benefits such as a faster test generation process and test cases with better fault detection capability.
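The mutate-check-extract loop above can be sketched on a tiny untimed state machine: mutate one transition, find the shortest input sequence on which the mutant diverges from the original, and keep that sequence as a test case. The states and inputs below are hypothetical stand-ins; the thesis works with timed automata and UPPAAL, not this toy model:

```python
from itertools import product

STATES = {"idle", "processing", "done"}
TRANSITIONS = {
    ("idle", "request"): "processing",
    ("processing", "finish"): "done",
    ("done", "request"): "processing",
}

def run(transitions, inputs, start="idle"):
    """Execute an input sequence; undefined inputs leave the state unchanged."""
    state = start
    for sym in inputs:
        state = transitions.get((state, sym), state)
    return state

def mutants(transitions):
    """Fault hypothesis: a single transition targets the wrong state."""
    for key, target in transitions.items():
        for wrong in STATES - {target}:
            m = dict(transitions)
            m[key] = wrong
            yield m

def generate_tests(transitions, max_len=3):
    """Shortest input sequences on which some mutant disagrees with the
    original model become the test cases (the 'counterexamples')."""
    tests = []
    for m in mutants(transitions):
        for length in range(1, max_len + 1):
            killer = next((seq for seq in product(["request", "finish"],
                                                  repeat=length)
                           if run(transitions, seq) != run(m, seq)), None)
            if killer:
                tests.append(killer)
                break
    return tests

tests = generate_tests(TRANSITIONS)
```

In the thesis, UPPAAL's verifier plays the role of the exhaustive search here, returning counterexample traces against temporal properties instead of brute-forced input sequences.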
  • Master Thesis
    Community Detection on GPU: A Comprehensive Analysis, Unified Memory Enhancement, and Memory Access Optimization
    (2023) Dinçer, Emre; Öz, Işıl
    Recent years have seen a slowdown in the development of traditional systems that use only the Central Processing Unit (CPU). However, significant progress has been made in the development of heterogeneous systems utilizing not only the CPU but also the Graphics Processing Unit (GPU). NVIDIA, one of the GPU manufacturers, has increased the interest of many researchers in heterogeneous systems through its CUDA platform by providing a means to program GPUs more easily. The ease of application development provided by the CUDA platform and the performance gains offered by these heterogeneous systems have encouraged many researchers to develop algorithms and applications for them. One algorithm frequently used in data analysis is community detection. Although there are applications that implement this algorithm on GPUs, and while they work efficiently for many datasets, they either fail to work or suffer significant performance loss for large datasets that exceed the GPU's memory capacity. In this thesis, we analyzed Rundemanen, one of the community detection applications running on GPU. We also made enhancements that enable Rundemanen to process datasets larger than the GPU's memory capacity by utilizing CUDA's Unified Memory. Lastly, we tested various optimization methods to use Unified Memory more efficiently. By using our memory-access advice, in comparison to the naive version, we obtained performance gains of up to 62x for artificial oversubscription scenarios and up to 8x for datasets that already do not fit into GPU memory.
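Community detection quality is typically judged by modularity, the objective that Louvain-style algorithms (such as the one Rundemanen accelerates) maximize. A pure-Python sketch of the modularity computation on a toy graph of two triangles joined by a bridge edge; this illustrates the metric, not Rundemanen's CUDA code:

```python
# Toy undirected graph as an edge list: two triangles joined by edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

def modularity(edges, community):
    """Newman modularity:
    Q = (1/2m) * sum_ij (A_ij - k_i * k_j / 2m) * delta(c_i, c_j)."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    q = 0.0
    nodes = sorted(degree)
    for i in nodes:
        for j in nodes:
            if community[i] != community[j]:
                continue
            a = sum(1 for u, v in edges if {u, v} == {i, j})  # adjacency A_ij
            q += a - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)

# Grouping each triangle into its own community scores well.
partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(EDGES, partition)
```

GPU implementations parallelize the per-node modularity-gain evaluation; the thesis's Unified Memory work lets the graph's adjacency data exceed device memory while this computation runs.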
  • Master Thesis
    Performance-Reliability Tradeoff Analysis for Safety-Critical Systems With GPUs
    (2023) Sezgin, Yağızcan; Öz, Işıl
    GPUs were mostly used for image processing purposes when they were first introduced. Those applications can be considered non-critical, so reliability was not given sufficient importance. As GPUs have evolved, offering highly parallelized architectures and extremely powerful computation, they have become one of the most crucial parts of systems running complex applications in safety-critical domains such as automotive and space, fulfilling the high computational demand. In this thesis, we evaluate the performance-reliability tradeoff in the safety-critical domain. We propose software-based redundancy schemes with different spheres of replication on the GPU4S benchmark. Our proposal includes profiling the baseline application without any redundancy, applying fault injection using NVBitFI, manually changing the implementation according to the proposed redundancy schemes, and measuring performance metrics such as execution time, memory copy operations, and power consumption on real hardware widely used in the target domain, rather than on well-known GPU simulators, to observe actual performance. We reveal that, for single-kernel benchmarks, our proposed redundancy schemes manage to eliminate all soft errors when full redundancy is applied, at the cost of a performance degradation that depends on the application. We show that most soft errors can be eliminated using partial redundancy for complex applications, with a small performance impact.
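Software-based redundancy of the kind proposed above can be illustrated as duplication with comparison: run the same computation twice and flag a mismatch as a detected soft error. A pure-Python stand-in for the scheme, with a toy SAXPY "kernel" rather than the thesis's GPU code:

```python
def saxpy(a, xs, ys):
    """Toy stand-in for a GPU kernel: y = a*x + y, elementwise."""
    return [a * x + y for x, y in zip(xs, ys)]

def redundant_run(kernel, *args):
    """Full redundancy (duplication with comparison): execute the kernel
    twice and compare.  A mismatch signals a soft error that would
    otherwise silently corrupt the output; agreement returns the result."""
    first = kernel(*args)
    second = kernel(*args)
    if first != second:
        raise RuntimeError("soft error detected: redundant copies disagree")
    return first

result = redundant_run(saxpy, 2.0, [1.0, 2.0], [10.0, 20.0])
```

The sphere of replication determines how much of the pipeline sits inside `redundant_run`; full redundancy doubles the kernel work (the performance cost the thesis measures), while partial redundancy replicates only the most vulnerable kernels.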