THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension

dc.contributor.author Soygazi,F.
dc.contributor.author Çiftçi,O.
dc.contributor.author Kök,U.
dc.contributor.author Cengiz,S.
dc.date.accessioned 2024-09-24T15:55:49Z
dc.date.available 2024-09-24T15:55:49Z
dc.date.issued 2021
dc.description.abstract Question answering (QA) is a field of natural language processing and information retrieval that aims to answer questions posed in natural language. In this paper, we present THQuAD, a Turkish question answering dataset, together with baseline results obtained with contextualized word embeddings. THQuAD consists of two datasets: TQuad, on the history of Turkish Islamic science, within the scope of the Teknofest 2018 "Artificial Intelligence competition", and a second dataset on Ottoman history, prepared by us within the scope of the Teknofest 2020 "Doğal Dil İşleme Yarışması" (Natural Language Processing Competition). THQuAD is a reading comprehension dataset consisting of questions, answers, and passages. Our objective is to answer a given question by understanding the passage and extracting the answer from it. We generate contextualized word embeddings from pre-trained Turkish BERT, ELECTRA, and ALBERT language models after fine-tuning them with neural networks under different hyperparameter settings. © 2021 IEEE en_US
dc.identifier.doi 10.1109/UBMK52708.2021.9559013
dc.identifier.isbn 978-166542908-5
dc.identifier.scopus 2-s2.0-85125851915
dc.identifier.uri https://doi.org/10.1109/UBMK52708.2021.9559013
dc.identifier.uri https://hdl.handle.net/11147/14785
dc.language.iso en en_US
dc.publisher Institute of Electrical and Electronics Engineers Inc. en_US
dc.relation.ispartof Proceedings - 6th International Conference on Computer Science and Engineering, UBMK 2021 -- 6th International Conference on Computer Science and Engineering, UBMK 2021 -- 15 September 2021 through 17 September 2021 -- Ankara -- 176826 en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Contextualized word embeddings en_US
dc.subject Deep learning en_US
dc.subject Information retrieval en_US
dc.subject Natural language understanding en_US
dc.subject Question answering en_US
dc.title THQuAD: Turkish Historic Question Answering Dataset for Reading Comprehension en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.scopusid 57220960947
gdc.author.scopusid 57456792900
gdc.author.scopusid 57478574400
gdc.author.scopusid 57478710900
gdc.bip.impulseclass C4
gdc.bip.influenceclass C5
gdc.bip.popularityclass C4
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.collaboration.industrial false
gdc.description.department Izmir Institute of Technology en_US
gdc.description.departmenttemp Soygazi F., Department of Computer Engineering, Aydin Adnan Menderes University, Aydin, Turkey; Çiftçi O., Department of Computer Engineering, Izmir Institute of Technology, Izmir, Turkey; Kök U., Department of Computer Engineering, Izmir Institute of Technology, Izmir, Turkey; Cengiz S., Department of Computer Engineering, Aydin Adnan Menderes University, Aydin, Turkey en_US
gdc.description.endpage 220 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality N/A
gdc.description.startpage 215 en_US
gdc.description.wosquality N/A
gdc.identifier.openalex W3206294784
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 5.0
gdc.oaire.influence 2.838411E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 6.6086843E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration National
gdc.openalex.fwci 1.47143648
gdc.openalex.normalizedpercentile 0.86
gdc.openalex.toppercent TOP 1%
gdc.opencitations.count 9
gdc.plumx.mendeley 17
gdc.plumx.scopuscites 19
gdc.scopus.citedcount 19
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4003-8abe-a4dfe192da5e
