THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension
Green Open Access
No
Publicly Funded
No
Abstract
Question answering (QA) is a field of natural language processing and information retrieval that aims to answer questions posed in natural language. In this paper, we present THQuAD, a Turkish question answering dataset, together with baseline results obtained with contextualized word embeddings. THQuAD consists of two datasets: the first is TQuad, on the history of Turkish Islamic science, within the scope of the Teknofest 2018 "Artificial Intelligence competition"; the second, on Ottoman history, was prepared by us within the scope of the Teknofest 2020 "Doğal Dil İşleme Yarışması". THQuAD is a reading comprehension dataset consisting of questions, answers, and passages. Our objective is to answer a specific question by understanding the passage and extracting the answer from it. We generate contextualized word embeddings from pre-trained Turkish BERT, ELECTRA, and ALBERT language models after fine-tuning them with neural networks under different hyperparameter settings. © 2021 IEEE
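The extractive setup described in the abstract (a passage, a question, and an answer taken from the passage) can be illustrated with a minimal sketch. The record layout below, including the `answer_start` character offset, is an assumption modeled on common span-based QA datasets; the abstract does not specify the actual THQuAD schema, and the toy item is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class QAExample:
    """One reading-comprehension item: a passage, a question, and an answer span."""
    passage: str
    question: str
    answer_text: str
    answer_start: int  # assumed field: character offset of the answer in the passage


def extract_answer(passage: str, start: int, end: int) -> str:
    """Return the passage substring [start, end) as the extracted answer."""
    return passage[start:end]


def is_consistent(example: QAExample) -> bool:
    """Check that the stored offset really points at the stored answer text."""
    end = example.answer_start + len(example.answer_text)
    return extract_answer(example.passage, example.answer_start, end) == example.answer_text


# Invented toy item for illustration only; it is not taken from THQuAD.
toy_passage = "Ali wrote the report in 1900."
example = QAExample(
    passage=toy_passage,
    question="Who wrote the report?",
    answer_text="Ali",
    answer_start=toy_passage.index("Ali"),
)
```

A fine-tuned model in this setting would predict the start and end positions itself; `is_consistent` only validates that a labeled item's offset and answer text agree, which is a common sanity check before training.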
Keywords
Contextualized word embeddings, Deep learning, Information retrieval, Natural language understanding, Question answering
Fields of Science
0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology
OpenCitations Citation Count
9
Start Page
215
End Page
220
SCOPUS™ Citations
19
checked on Apr 27, 2026