Contrastive Retrieval Methodology for Turkish Metaphor Detection and Identification
(Assoc Computing Machinery, 2025) Inan, Emrah
Metaphorical expressions, as a form of figurative language, are individually limited in their use. However, when both literal and non-literal meanings are considered, they are frequently used in web content. Hence, producing a balanced dataset to learn superior representations is a challenging task, and metaphor detection suffers from a limited training dataset. To alleviate this problem, we present a retrieval-based contrastive learning approach which first identifies candidate metaphors in the input text and then detects metaphorical expressions as a claim verification task in the inherently unbalanced setting of this study. Furthermore, we adapt contrastive learning to make it easier to distinguish between the literal and figurative meanings of the same expression. For the experimental setup, we extract non-literal and literal expressions along with their meanings and sample sentences from a Turkish dictionary. In the metaphor detection subtask, performance evaluation shows that sparse and dense search variations using the Turkish-e5-Large model achieve a Recall@10 (R@10) score of 0.614. Moreover, the SimCSE-TR-Contr-Sample-Meaning model achieves the highest Recall@10 (R@10) of 0.9739 on the generated test dataset for the metaphor identification subtask. In the real-world scenario, it achieves a competitive R@10 score of 0.8684, and these results clearly demonstrate that our model can generalise to this real-world scenario.
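The abstract reports its results as Recall@10 but does not spell out how the metric is computed. The following is a minimal sketch under a common assumption: each query has exactly one correct item, and R@k is the fraction of queries whose correct item appears among the top k retrieved results. The function name `recall_at_k` and the toy identifiers are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Sequence


def recall_at_k(
    ranked_results: Sequence[Sequence[Hashable]],
    gold_items: Sequence[Hashable],
    k: int = 10,
) -> float:
    """Fraction of queries whose gold item is in the top-k retrieved results.

    Assumption (not from the paper): one gold item per query, so this is
    the hit-rate form of Recall@k.
    """
    if len(ranked_results) != len(gold_items):
        raise ValueError("need exactly one gold item per query")
    if not ranked_results:
        return 0.0
    # Count a hit when the gold item occurs within the first k results.
    hits = sum(
        1
        for results, gold in zip(ranked_results, gold_items)
        if gold in results[:k]
    )
    return hits / len(ranked_results)


# Toy data: two queries, each with a ranked list of candidate ids.
ranked = [["m3", "m1", "m7"], ["m2", "m9", "m4"]]
gold = ["m1", "m5"]
score = recall_at_k(ranked, gold, k=2)
```

Under this definition, a score of 0.8684 would mean that the correct item was retrieved within the first ten results for about 87% of the queries.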
Recognition of Counterfactual Statements in Turkish
(Assoc Computing Machinery, 2025) Acar, Ali; Tekir, Selma
Counterfactual statements are examples of causal reasoning, as they describe events that did not happen and, optionally, the consequences those events would have had if they had happened. SemEval-2020 introduced the counterfactual detection (CFD) task and shared an English dataset. Since then, a set of datasets built from Amazon product reviews has been released in English, German, and Japanese. This work releases the first Turkish corpus of counterfactuals (TRCD). The data collection process is driven by a clue phrase list of counterfactuals, mainly in the form of verb inflections in Turkish. We use clue phrase-based filtering to collect sentences from the Turkish National Corpus (TNC). However, half of the collection is subject to random word filtering to avoid selection bias due to clue phrases. After a human annotation process with an inter-annotator agreement of 0.65, we have 5000 sentences, of which 12.8% contain counterfactual statements. Furthermore, we provide a comprehensive baseline of transformer-based models by testing the effect of clue phrases, cross-lingual performance comparisons using the available CFD datasets, and zero-shot cross-lingual classification experiments using fine-tuning on different combinations of the existing datasets. The results confirm that TRCD is compatible with the other CFD datasets. Moreover, fine-tuning a Turkish-specific model (BERTurk) performs better than the multilingual alternatives (mBERT and XLM-R), and BERTurk is more robust to clue phrase masking. This result emphasizes the importance of a language-specific tokenizer for contextual understanding, especially for low-resource languages. Finally, our qualitative analysis gives insights into the errors made by different models.

Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection