Master Degree / Yüksek Lisans Tezleri

Permanent URI for this collection: https://hdl.handle.net/11147/3008

Now showing 1 - 10 of 23

Combining Persona and Argument in Dialogue
(2024) Güzel, Şükrü; Tekir, Selma
Personalized dialogue systems have grown in popularity as people's desire for human-like interaction increases. This thesis aims to make the responses of personalized dialogue systems more persona-consistent by means of data augmentation. The technique uses the few-shot learning capabilities of Large Language Models to add counterfactual sentences to dialogues: GPT 3.5 and Llama 2 were prompted with few-shot examples to generate the counterfactual sentences. The augmentation was applied to every dialogue in the PersonaChat dataset that did not originally contain a counterfactual sentence. Evaluation with a state-of-the-art personalized dialogue generation model showed that the dataset augmented with GPT 3.5 achieved better persona-consistency scores on the evaluation metrics.
Learning Citation-Aware Representations for Scientific Papers
(01. Izmir Institute of Technology, 2024) Çelik, Ege Yiğit; Tekir, Selma
In the field of Natural Language Processing (NLP), the tasks of understanding and generating scientific documents are highly challenging and have been extensively studied. Comprehending scientific papers can facilitate the generation of their contents. Similarly, understanding the relationships between scientific papers and their citations can be instrumental in generating and predicting citations within the text of scientific works. Moreover, language models equipped with citation-aware representations can be particularly robust for downstream tasks involving scientific literature. This thesis aims to enhance the accuracy of citation predictions within scientific texts. To achieve this, we hide citations within the context of scientific papers using mask tokens and subsequently pre-train the RoBERTa-base language model to predict citations for these masked tokens. We ensure that each citation is treated as a single token to be predicted by the mask-filling language model. Consequently, our models function as language models with citation-aware representations. Furthermore, we propose two alternative techniques for our approach. Our base technique predicts citations using only the contexts from scientific papers, while our global technique incorporates the titles and abstracts of papers alongside the contexts to improve performance. Experimental results demonstrate that our models significantly surpass the state-of-the-art results on two out of four benchmark datasets. However, for the remaining two datasets, our models yield suboptimal results, indicating potential for further improvement. Additionally, we conducted experiments on sampled datasets to examine the effects of inherent factors on the datasets and to identify correlations between these factors and our results.
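The masking step described in this abstract can be illustrated with a minimal sketch. It assumes citations appear in the text as markers of the form `[CIT:key]`, which is a hypothetical format chosen for illustration; the abstract does not state how citations are encoded. Each marker is replaced by a single mask token (`<mask>` is the mask token string used by RoBERTa), and the citation keys are kept as the labels a mask-filling model would be trained to predict.

```python
import re

# Hypothetical citation marker format, e.g. "[CIT:vaswani2017]".
# The abstract does not specify one; this is an assumption for illustration.
CITATION_PATTERN = re.compile(r"\[CIT:([A-Za-z0-9_]+)\]")
MASK_TOKEN = "<mask>"  # RoBERTa's mask token string


def mask_citations(context):
    """Replace each citation marker with one mask token.

    Returns the masked text and the citation keys in order of appearance,
    so that every mask position has exactly one target label.
    """
    targets = CITATION_PATTERN.findall(context)
    masked = CITATION_PATTERN.sub(MASK_TOKEN, context)
    return masked, targets


example = "Attention [CIT:vaswani2017] improves on RNNs [CIT:hochreiter1997]."
masked_text, citation_targets = mask_citations(example)
print(masked_text)
print(citation_targets)
```

Treating each citation as one token means the model's output vocabulary would need one entry per citable paper; how that vocabulary is built is not covered by this sketch.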
Transformers Using Local Attention Mappings for Long Text Document Classification
(2023) Haman, Bekir Ufuk; Tekir, Selma
Transformer models are powerful and flexible encoder-decoder structures that have proven successful in many fields, including natural language processing. Although they are especially successful at working with textual input (classifying texts, answering questions, and producing text), they have difficulty processing long texts. Current leading transformer models such as BERT limit input lengths to 512 tokens. The most prominent reason for this limitation is that the self-attention operation, which forms the backbone of the transformer structure, requires high processing power. This requirement grows quadratically with the input length, which makes it impractical for transformers to process long texts. To overcome this challenge, new transformer structures that use various local attention mapping methods have been proposed. This study proposes two alternative local attention mapping methods that make transformer models capable of processing long texts. In addition, it presents the 'Refined Patents' dataset, which consists of 200,000 patent documents and was prepared specifically for the long text document classification task. The proposed attention mapping methods, Term Frequency - Inverse Document Frequency (TF-IDF) and Pointwise Mutual Information (PMI), create a sparse version of the self-attention matrix based on the occurrence statistics of words and word pairs. These methods were implemented on top of the Longformer and Big Bird models and tested on the Refined Patents dataset. Test results show that both proposed approaches are acceptable local attention mapping alternatives and can be used to enable long text processing in transformers.
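As a rough illustration of how word-pair statistics could sparsify self-attention, the sketch below computes document-level pointwise mutual information, PMI(a, b) = log(P(a, b) / (P(a) P(b))), and lets two positions attend to each other only when their PMI exceeds a threshold. The function names, the document-level co-occurrence definition, and the threshold are all assumptions made for this example; the thesis's actual mapping on Longformer and Big Bird may differ.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Document-level PMI for every unordered word pair that co-occurs."""
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words
    for doc in documents:
        vocab = sorted(set(doc))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs come out sorted
    return {
        (a, b): math.log((joint * n_docs) / (word_df[a] * word_df[b]))
        for (a, b), joint in pair_df.items()
    }


def sparse_attention_mask(tokens, pmi, threshold=0.0):
    """Boolean mask: position i may attend to j if the tokens are identical
    or their PMI is above the threshold (an assumed selection rule)."""
    n = len(tokens)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if tokens[i] == tokens[j]:
                mask[i][j] = True
            else:
                key = tuple(sorted((tokens[i], tokens[j])))
                mask[i][j] = pmi.get(key, float("-inf")) > threshold
    return mask


corpus = [
    ["patent", "claim", "device"],
    ["patent", "claim"],
    ["device", "sensor"],
    ["sensor", "signal"],
]
pmi = pmi_table(corpus)
mask = sparse_attention_mask(["patent", "claim", "device"], pmi)
```

This toy version still builds a dense n × n mask for readability; a practical long-document model would store only the selected pairs to avoid the quadratic cost.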
Enrichment of Turkish Question Answering Systems Using Knowledge Graphs
(01. Izmir Institute of Technology, 2023) Çiftçi, Okan; Tekir, Selma; Soygazi, Fatih
In the era of digital communication, the ability to effectively process and interpret human language has become a key research area. Natural Language Processing (NLP) has emerged as a field that enables machines to better understand and analyze human language. One of the most important applications of NLP is the development of question answering systems, which are essential in various domains such as customer service, search engines, and chatbots. To answer incoming queries, question answering systems can rely on knowledge graphs as a reliable source. This thesis proposes a Turkish Question Answering (TRQA) system that utilizes a knowledge graph. The research focuses on the automatic construction of a knowledge graph specific to the film industry, as well as the creation of a multi-hop question-answering dataset that can be queried from this graph. Building upon these resources, we develop a deep learning based method for answering questions using the constructed knowledge graph. The constructed knowledge graph is compared with various knowledge graphs from the literature on the link prediction task using the DistMult, ComplEx and SimplE methods. Additionally, the proposed question answering system is compared with a baseline study and with a generative large language model through quantitative and qualitative analyses.
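For readers unfamiliar with the link prediction methods named above, the following sketch shows the standard scoring functions of two of them. DistMult scores a triple (head, relation, tail) as the sum of element-wise products of three real vectors, while ComplEx uses complex-valued vectors and takes the real part of the product with the conjugated tail. The embedding values here are made up for illustration and are not taken from the thesis.

```python
def distmult_score(head, relation, tail):
    """DistMult: sum_i h_i * r_i * t_i over real-valued embeddings."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def complex_score(head, relation, tail):
    """ComplEx: Re(sum_i h_i * r_i * conj(t_i)) over complex embeddings."""
    return sum((h * r * t.conjugate()).real for h, r, t in zip(head, relation, tail))


# Made-up two-dimensional embeddings for a film-domain triple.
film = [1.0, 2.0]
directed_by = [0.5, 1.0]
director = [2.0, 3.0]

forward = distmult_score(film, directed_by, director)
backward = distmult_score(director, directed_by, film)
print(forward, backward)  # DistMult cannot tell the two directions apart

# Made-up one-dimensional complex embeddings.
c_forward = complex_score([1 + 1j], [1j], [1 - 1j])
c_backward = complex_score([1 - 1j], [1j], [1 + 1j])
print(c_forward, c_backward)  # ComplEx can score the directions differently
```

The symmetry of DistMult is one reason asymmetric relations, which are common in a film knowledge graph, are often evaluated with ComplEx or SimplE as well.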
Reproducibility Assessment of Research Code Repositories
(01. Izmir Institute of Technology, 2023) Akdeniz, Eyüp Kaan; Tekir, Selma
The growth in machine learning research has not been accompanied by a corresponding improvement in the reproducibility of the results. This thesis presents a novel, fully automated end-to-end system that evaluates the reproducibility of machine learning studies based on the content of the associated GitHub project's Readme file. The evaluation relies on a Readme template derived from an analysis of popular repositories; the template suggests a structure that promotes reproducibility. Our system generates a reproducibility score for each Readme file assessed, and it employs two distinct models, one based on section classification and the other on hierarchical transformers. The experimental outcomes indicate that the system based on section similarity outperforms the hierarchical transformer model. It also has an edge in explainability, as its scores can be traced directly to the respective sections of the Readme files. The proposed framework provides an important tool for improving the quality of code sharing and ultimately helps to increase reproducibility in machine learning research.
Recognition of Counterfactual Statements in Turkish
(01. Izmir Institute of Technology, 2023) Acar, Ali; Tekir, Selma
Counterfactual statements describe an event that did not happen or cannot happen and, optionally, the consequence of this event if it were to happen. Counterfactual statements are building blocks of human thought processes, as people constantly reflect upon past happenings and consider their future implications. Counterfactual reasoning is essential for machine intelligence and explainable artificial intelligence studies, so detecting counterfactuals automatically with machine learning algorithms is crucial for these areas. This thesis presents the development of the first Turkish counterfactual detection dataset. It provides a comprehensive classification baseline and expands the scope of counterfactual detection to include the Turkish language.
Automatic Quote Detection From Literary Work
(01. Izmir Institute of Technology, 2022) Güzel Altıntaş, Aybüke; Tekir, Selma
Literature inspires readers, and readers tend to share quotes from literary works. A reader underlines quotes in a book and shares them on social media or on an online platform used by book readers. A quote is defined as a span in a written text that is interesting to many readers and that readers can use in different contexts. In this study, a novel task in the field of Natural Language Processing is proposed: the Quote Detection Task. An original dataset was also formed from the Goodreads and Gutenberg websites with web scraping. The quotes come from Goodreads data sourced from Kaggle, and only quotes voted on by 10 or more users were selected. These quotes were validated against the books on the Project Gutenberg website. The final dataset consists of 4554 rows and contains quotes with their book spans. The span of a quote consists of the 10 sentences preceding the quote, the quote itself, and the 10 sentences following it. The Quote Detection Task is a span detection task that can be modeled with the sequence labeling solutions and neural extractive summarization systems in the literature. Two baselines were run for quote detection: the statistics-based Conditional Random Field (CRF) as a sequence tagging baseline, and Extractive Summarization as Text Matching (MatchSum) as an extractive summarization baseline. These baselines obtained ROUGE-1 scores of 27.24% and 40.54%, respectively.
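The span construction described in this abstract (10 sentences before the quote, the quote, and 10 sentences after) can be sketched as follows. The function name, the sentence-index interface, and the binary sentence-level labels are assumptions for illustration; the abstract does not say how the dataset rows or the sequence labels were actually encoded.

```python
def build_quote_span(sentences, quote_start, quote_end, context_size=10):
    """Return the context window around a quote and per-sentence labels.

    `sentences` is a book split into sentences; the quote occupies the
    half-open index range [quote_start, quote_end). The window is clipped
    at the beginning and end of the book.
    """
    left = max(0, quote_start - context_size)
    right = min(len(sentences), quote_end + context_size)
    window = sentences[left:right]
    # Label 1 marks sentences that belong to the quote, 0 marks context.
    labels = [1 if quote_start <= i < quote_end else 0 for i in range(left, right)]
    return window, labels


book = [f"s{i}" for i in range(30)]  # placeholder sentences
window, labels = build_quote_span(book, quote_start=12, quote_end=14)
print(len(window), sum(labels))
```

Labels of this kind are one possible input for a sequence tagger such as a CRF; a summarization-style baseline would instead treat the quote as the target summary of the window.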
Classification of Contradictory Opinions in Text Using Deep Learning Methods
(01. Izmir Institute of Technology, 2020) Oğul, İskender Ülgen; Tekir, Selma
The natural language inference (NLI) problem aims to ensure the consistency as well as the accuracy of propositions while making sense of natural language. NLI classifies the relationship between two given sentences as contradiction, entailment, or neutral. To accomplish this classification task, sentences or words must be converted into mathematical representations called vectors or embeddings. The vectorization of a sentence is as important as the complexity of the classification model. In this study, both pre-trained word embeddings (GloVe, fastText, Word2Vec) and contextual word embeddings (BERT) were used for comparison and to obtain the best result. NLI is a highly complex natural language processing task, and conventional machine learning methods are insufficient for it, so more advanced solutions are required. This study therefore used deep learning methods to perform the classification task. Unlike conventional machine learning approaches, deep learning approaches reduce errors and increase accuracy by iterating over the data many times. Opinion sentences have complex grammatical structures that are difficult to classify. This study used the Decomposable Attention and Enhanced LSTM models to perform the NLI classification task. Using the Enhanced LSTM model with BERT contextual vectors on the SNLI dataset, an accuracy of 88.0% was obtained, which is close to the state-of-the-art result of 92.1%. To show the usability of the developed solution on other NLI tasks, experiments were also performed on the MNLI dataset, where an accuracy of 80.02% was obtained.
A Language Modeling Approach To Detect Bias
(Izmir Institute of Technology, 2020) Atik, Ceren; Tekir, Selma
Technology is developing day by day and is involved in every area of our lives. Technological innovations such as artificial intelligence can strengthen social biases that already exist in society, regardless of the developers' intentions. Therefore, researchers should be aware of this ethical issue. This thesis investigates the effect of gender bias, one of these social biases, on occupation classification. To this end, a new dataset was created by collecting obituaries from the New York Times website, and it was prepared in two versions, with and without gender indicators. Since occupation and gender are independent variables, gender indicators should not affect a model's occupation prediction. To investigate gender bias in occupation prediction, a model in which occupation and gender are learned together is evaluated alongside models that perform only occupation classification. The results obtained from the models indicate that gender bias plays a role in occupation classification.
Enriching Contextual Word Embeddings With Character Information
(Izmir Institute of Technology, 2020) Polatbilek, Ozan; Tekir, Selma
Natural Language Processing has become more and more popular with the recent advances in Artificial Intelligence. Fundamental improvements have been introduced in word representations to store semantic and/or syntactic features. With the recently published language model BERT, contextual word vectors can be generated. However, this model does not process character-level information. In morphologically rich languages such as Turkish, the model's perception of syntax could be improved. In this thesis, a new model called BERT-ELMo, which is a combination of BERT and ELMo, is proposed to enrich BERT with character-level information. The model combines the character-level processing part of ELMo with the contextual word representation part of BERT. To show the effectiveness of the proposed model, both quantitative (question answering) and qualitative (word analogy, word contextualization, morphological meaning, out-of-vocabulary word capturing) analyses are performed, and the model is compared with BERT on the Turkish language. Thanks to the character-level addition, the proposed model can be trained in any language without any pre-analysis. To the best of our knowledge, this is the first study that uses morphological analysis to train the BERT model in Turkish, and the first model to integrate a character-level module into BERT.
