Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7148
3 results
Search Results
Article. Citation - Scopus: 2
TurkMedNLI: A Turkish Medical Natural Language Inference Dataset Through Large Language Model Based Translation (PeerJ Inc, 2025)
Ogul, Iskender Ulgen; Soygazi, Fatih; Bostanoglu, Belgin Ergenc
Natural language inference (NLI) is a subfield of natural language processing (NLP) that aims to identify the contextual relationship between premise and hypothesis sentences. While high-resource languages like English benefit from robust and rich NLI datasets, creating similar datasets for low-resource languages is challenging due to the cost and complexity of manual annotation. Although translating existing datasets offers a practical solution, direct translation of domain-specific datasets presents unique challenges, particularly in handling abbreviations, metric conversions, and cultural alignment. This study introduces a pipeline for translating a medical NLI dataset into Turkish, a low-resource language. Our approach fine-tunes the Llama-3.1 model with selected samples from the Medical Abbreviation dataset (MeDAL) to extract and resolve medical abbreviations. The NLI pairs are then refined with the extracted abbreviations and subjected to metric correction. Finally, the processed sentences are translated using Facebook's No Language Left Behind (NLLB) translation model. To ensure quality, we conducted comprehensive evaluations using both machine learning models and medical expert review. Our results show that BERTurk achieved 75.17% accuracy on TurkMedNLI test data and 76.30% on the normalized test set, while BioBERTurk demonstrated comparable performance with 75.59% accuracy on test data and 72.29% on the normalized dataset. Medical experts further validated the translations through manual assessment of sampled sentences. This work demonstrates the effectiveness of large language models in adapting domain-specific datasets for low-resource languages, establishing a foundation for future research in multilingual biomedical NLP.

Article. Citation - WoS: 2. Citation - Scopus: 2
Enrichment of Turkish Question Answering Systems Using Knowledge Graphs (Tubitak Scientific & Technological Research Council Turkey, 2024)
Ciftci, Okan; Soygazi, Fatih; Tekir, Selma
Recent capabilities of large language models (LLMs) have transformed many tasks in Natural Language Processing (NLP), including question answering. State-of-the-art systems do an excellent job of responding in a relevant, persuasive way but cannot guarantee factuality. Knowledge graphs, which represent facts as triplets, can be valuable for avoiding errors and inconsistencies with real-world facts. This work introduces a knowledge graph-based approach to Turkish question answering. The proposed approach aims to develop a methodology capable of drawing inferences from a knowledge graph to answer complex multihop questions. We construct the Beyazperde Movie Knowledge Graph (BPMovieKG) and the Turkish Movie Question Answering dataset (TRMQA) to answer questions in the movie domain. We evaluate our proposed question answering pipeline against a baseline study. Furthermore, we compare it with a question answering system built upon GPT-3.5 Turbo to answer the 1-hop questions from TRMQA. The experimental results confirm that link prediction on a knowledge graph is quite effective in answering questions that require reasoning paths. Finally, we provide insights into the pros and cons of the provided solution through a qualitative study.

Article. Citation - Scopus: 1
An Interestingness Measure for Knowledge Bases (Elsevier, 2023)
Oğuz, Damla; Soygazi, Fatih
Association rule mining and logical rule mining both aim to discover interesting relationships in data or knowledge. In association rule mining, relationships are identified based on the occurrence of items in a dataset, while in logical rule mining, relationships are determined based on logical relationships between atoms in a knowledge base. Association rule mining has been widely studied in transactional databases, mainly for market basket analysis. Confidence has become the most widely used interestingness measure to assess the strength of a rule. Many other interestingness measures have been proposed, since confidence can be insufficient to filter out negatively associated relationships. Recently, logical rule mining has become an important area of research, as new facts can be inferred by applying discovered logical rules. These rules can be used for reasoning, identifying potential errors in knowledge bases, and better understanding data. However, there are currently only a few measures for logical rule mining. Furthermore, current measures do not consider relations that can have several objects, called quasi-functions, which can dramatically alter the interestingness of a rule. In this paper, we focus on effectively assessing the strength of logical rules. We propose a new interestingness measure that takes into account two categories of relations, functions and quasi-functions, to assess the degree of certainty of logical rules. We compare our proposed measure with a widely used measure on both synthetic test data and real knowledge bases. We show that it is more effective in indicating rule quality, making it an appropriate interestingness measure for logical rule evaluation.
© 2023 Karabuk University. Publishing services by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
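
The last abstract notes that confidence alone can fail to filter out negatively associated relationships. The sketch below illustrates that point with the textbook definitions of support, confidence, and lift on a toy transaction set. It is not the interestingness measure proposed in the paper. The function names and the tea/coffee data are invented for illustration only.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Textbook confidence of the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    consequent = set(consequent)
    ante_support = support(transactions, antecedent)
    if ante_support == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(transactions, antecedent | consequent) / ante_support


def lift(transactions: List[Set[str]],
         antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence divided by the baseline frequency of the consequent."""
    consequent = set(consequent)
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical toy data: coffee is bought in 9 of 10 baskets,
# tea in 5, and both together in 4.
transactions = (
    [{"tea", "coffee"}] * 4
    + [{"tea"}] * 1
    + [{"coffee"}] * 5
)

conf = confidence(transactions, {"tea"}, {"coffee"})
rule_lift = lift(transactions, {"tea"}, {"coffee"})
print(f"confidence(tea -> coffee) = {conf:.2f}")
print(f"lift(tea -> coffee)       = {rule_lift:.2f}")
```

In this toy set the rule tea -> coffee has a confidence of 0.80, which looks strong, yet its lift is below 1: buying tea makes coffee slightly less likely than its 0.90 baseline. Confidence on its own does not reveal this, which is the kind of gap that additional interestingness measures are meant to close.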
