Master's Theses

Permanent URI for this collection: https://hdl.handle.net/11147/3008


Search Results

Now showing 1 - 4 of 4
  • Master Thesis
    Combining Persona and Argument in Dialogue
    (2024) Güzel, Şükrü; Tekir, Selma
    The popularity of personalized dialogue systems has gained momentum as people's desire for human-like interaction grows. This thesis aims to increase persona-consistent responses in personalized dialogue systems. A data augmentation method was used to enhance persona consistency, leveraging the few-shot learning capabilities of Large Language Models to add counterfactual sentences to dialogues. The GPT-3.5 and Llama 2 models generated counterfactual sentences via few-shot prompting (see the first sketch after this list). The augmentation method was applied to every dialogue in the PersonaChat dataset that did not already contain a counterfactual sentence. Evaluation with a state-of-the-art personalized dialogue generation model showed that the dataset augmented with GPT-3.5 achieved better persona-consistency scores.
  • Master Thesis
    Learning Citation-Aware Representations for Scientific Papers
    (Izmir Institute of Technology, 2024) Çelik, Ege Yiğit; Tekir, Selma
    In the field of Natural Language Processing (NLP), understanding and generating scientific documents are highly challenging tasks that have been extensively studied. Comprehending scientific papers can facilitate the generation of their contents. Similarly, understanding the relationships between scientific papers and their citations can be instrumental in generating and predicting citations within the text of scientific works. Moreover, language models equipped with citation-aware representations can be particularly robust for downstream tasks involving scientific literature. This thesis aims to enhance the accuracy of citation prediction within scientific texts. To achieve this, we hide citations within the context of scientific papers using mask tokens and subsequently pre-train the RoBERTa-base language model to predict citations for these masked tokens. We ensure that each citation is treated as a single token to be predicted by the mask-filling language model (see the second sketch after this list). Consequently, our models function as language models with citation-aware representations. Furthermore, we propose two alternative techniques: our base technique predicts citations using only the contexts from scientific papers, while our global technique incorporates the titles and abstracts of papers alongside the contexts to improve performance. Experimental results demonstrate that our models significantly surpass the state-of-the-art results on two of four benchmark datasets. On the remaining two datasets, our models yield suboptimal results, indicating room for further improvement. Additionally, we conducted experiments on sampled datasets to examine the effects of factors inherent in the datasets and to identify correlations between these factors and our results.
  • Master Thesis
    Transformers Using Local Attention Mappings for Long Text Document Classification
    (2023) Haman, Bekir Ufuk; Tekir, Selma
    Transformer models are powerful and flexible encoder-decoder structures that have proven successful in many fields, including natural language processing. Although they are especially successful at working with textual input, classifying texts, answering questions, and producing text, they have difficulty processing long texts. Leading transformer models such as BERT limit input lengths to 512 tokens. The main reason for this limitation is that the self-attention operation, which forms the backbone of the transformer structure, requires high processing power; this requirement grows quadratically with the input length, making it impractical for transformers to process long texts. New transformer architectures that use various local attention mapping methods have been proposed to overcome the text length challenge. This study first proposes two alternative local attention mapping methods to make transformer models capable of processing long texts. In addition, it presents the 'Refined Patents' dataset, consisting of 200,000 patent documents, prepared specifically for the long text document classification task. The proposed attention mapping methods, based on Term Frequency - Inverse Document Frequency (TF-IDF) and Pointwise Mutual Information (PMI), create a sparse version of the self-attention matrix from the occurrence statistics of words and word pairs (see the third sketch after this list). These methods were implemented on top of the Longformer and Big Bird models and tested on the Refined Patents dataset. Test results show that both proposed approaches are acceptable local attention mapping alternatives and can be used to enable long text processing in transformers.
  • Master Thesis
    An Event-Based Hidden Markov Model Approach to News Classification and Sequencing
    (Izmir Institute of Technology, 2014) Çavuş, Engin; Tekir, Selma
    Over the past years, the number of published news articles has increased dramatically. In the past, there were fewer channels of communication, and articles were classified by human operators. Over time, the means of communication grew and expanded rapidly, making an automated news classification tool indispensable. Text classification is a statistical machine learning procedure in which individual text items are placed into groups based on quantitative information. In this study, an event-based news classification and sequencing system is proposed, its model is explained, and its decision-making process is represented (see the fourth sketch after this list). A case study is prepared and analyzed.
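
A minimal sketch of the counterfactual augmentation step described in "Combining Persona and Argument in Dialogue", assuming the OpenAI chat API; the model name, prompt wording, and few-shot demonstrations are illustrative assumptions, not the thesis's actual prompt:

```python
# Illustrative sketch: few-shot counterfactual generation with the OpenAI
# chat API (reads OPENAI_API_KEY from the environment). The prompt and the
# demonstrations below are hypothetical, not the thesis's actual prompt.
from openai import OpenAI

client = OpenAI()

FEW_SHOT = [  # hypothetical (persona fact, counterfactual) demonstrations
    ("I love hiking in the mountains.", "I have never enjoyed the outdoors."),
    ("I work as a nurse.", "I have no experience in healthcare."),
]

def counterfactual(persona_fact: str) -> str:
    """Generate one sentence that contradicts the given persona fact."""
    demos = "\n".join(f"Fact: {f}\nCounterfactual: {c}" for f, c in FEW_SHOT)
    prompt = (
        "Write one sentence that contradicts the persona fact.\n\n"
        f"{demos}\nFact: {persona_fact}\nCounterfactual:"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the thesis uses Llama 2 the same way
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

print(counterfactual("I enjoy playing the violin."))
```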
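A minimal sketch of the citation-masking scheme described in "Learning Citation-Aware Representations for Scientific Papers", assuming Hugging Face Transformers; the "[CITE_nnnn]" token format and the toy example text are assumptions:

```python
# Illustrative sketch: treat each citation as a single vocabulary token,
# mask it, and train a RoBERTa mask-filling head to predict it.
from transformers import RobertaTokenizerFast, RobertaForMaskedLM

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Register each cited paper as one indivisible token, so the mask-filling
# head predicts a whole citation in a single step.
citation_tokens = ["[CITE_0001]", "[CITE_0002]"]
tokenizer.add_tokens(citation_tokens)
model.resize_token_embeddings(len(tokenizer))

text = "Transformer architectures [CITE_0001] were later scaled up [CITE_0002]."
enc = tokenizer(text, return_tensors="pt")

# Replace every citation token with <mask>; only masked positions contribute
# to the loss, as in ordinary masked language modeling.
cite_ids = set(tokenizer.convert_tokens_to_ids(citation_tokens))
labels = enc.input_ids.clone()
for i, tok_id in enumerate(enc.input_ids[0].tolist()):
    if tok_id in cite_ids:
        enc.input_ids[0, i] = tokenizer.mask_token_id
    else:
        labels[0, i] = -100  # ignore non-citation positions in the loss

loss = model(**enc, labels=labels).loss  # pre-training objective
```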
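A minimal sketch of deriving a sparse attention mask from word-pair statistics, in the spirit of "Transformers Using Local Attention Mappings for Long Text Document Classification"; the PMI threshold, window size, and toy counts are illustrative assumptions, not the thesis's settings:

```python
# Illustrative sketch: keep a local sliding window (as in Longformer/Big Bird)
# and add extra attention links between word pairs with high PMI.
import math
from collections import Counter
from itertools import combinations

def pmi_attention_mask(tokens, pair_counts, token_counts, total, threshold=0.0):
    """Allow attention between positions i and j only inside the local window
    or when PMI(x, y) = log(p(x,y) / (p(x)p(y))) exceeds the threshold."""
    n = len(tokens)
    mask = [[abs(i - j) <= 1 for j in range(n)] for i in range(n)]  # window
    for i, j in combinations(range(n), 2):
        pair = tuple(sorted((tokens[i], tokens[j])))
        p_xy = pair_counts.get(pair, 0) / total
        if p_xy == 0:
            continue  # never co-occurred: no extra attention link
        p_x = token_counts[tokens[i]] / total
        p_y = token_counts[tokens[j]] / total
        if math.log(p_xy / (p_x * p_y)) > threshold:
            mask[i][j] = mask[j][i] = True  # sparse long-range connection
    return mask

tokens = ["sparse", "attention", "for", "long", "patents"]
token_counts = Counter({"sparse": 50, "attention": 60, "for": 500,
                        "long": 80, "patents": 40})
pair_counts = {("patents", "sparse"): 25}  # toy co-occurrence count
print(pmi_attention_mask(tokens, pair_counts, token_counts, total=1000))
```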
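A minimal sketch of decoding a category sequence with a Hidden Markov Model, in the spirit of "An Event-Based Hidden Markov Model Approach to News Classification and Sequencing"; the states, probabilities, and keyword observations are toy assumptions, not the thesis's model:

```python
# Illustrative sketch: Viterbi decoding of the most likely category sequence
# for a stream of news articles, each reduced here to a single toy keyword.
states = ["politics", "economy", "sports"]
start_p = {"politics": 0.4, "economy": 0.4, "sports": 0.2}
trans_p = {  # P(next category | current category)
    "politics": {"politics": 0.6, "economy": 0.3, "sports": 0.1},
    "economy": {"politics": 0.3, "economy": 0.6, "sports": 0.1},
    "sports": {"politics": 0.2, "economy": 0.2, "sports": 0.6},
}
emit_p = {  # P(observed keyword | category), a toy emission model
    "politics": {"election": 0.7, "market": 0.2, "match": 0.1},
    "economy": {"election": 0.2, "market": 0.7, "match": 0.1},
    "sports": {"election": 0.1, "market": 0.1, "match": 0.8},
}

def viterbi(observations):
    """Most likely hidden category sequence for the observed keywords."""
    V = [{s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}]
    for obs in observations[1:]:
        V.append({
            s: max(
                (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs],
                 V[-1][prev][1] + [s])
                for prev in states
            )
            for s in states
        })
    return max(V[-1].values())[1]

print(viterbi(["election", "market", "match"]))
# -> ['politics', 'economy', 'sports']
```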