WoS Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/11147/7150

Search Results

  • Conference Object
    Citation - WoS: 10
    A Turkish Dataset for Gender Identification of Twitter Users
    (Assoc Computational Linguistics-ACL, 2019) Sezerer, Erhan; Polatbilek, Ozan; Tekir, Selma
    Author profiling is the identification of an author's gender, age, and language from his/her texts. With the increasing trend of using Twitter as a means to express thought, profiling the gender of an author from his/her tweets has become a challenge. Although several datasets in different languages have been released for this problem, there is still a need for multilingualism. In this work, we propose a dataset of tweets from Turkish Twitter users who are labeled with their gender information. The dataset has 3368 users in the training set and 1924 users in the test set, where each user has 100 tweets. The dataset is publicly available.
  • Article
    Recognition of Counterfactual Statements in Turkish
    (Assoc Computing Machinery, 2025) Acar, Ali; Tekir, Selma
    Counterfactual statements are examples of causal reasoning, as they describe events that did not happen and, optionally, the consequences of those events if they had happened. SemEval-2020 introduced the counterfactual detection (CFD) task and shared an English dataset. Since then, a set of datasets has been released in English, German, and Japanese as part of Amazon product reviews. This work releases the first Turkish corpus of counterfactuals (TRCD). The data collection process is driven by a clue phrase list of counterfactuals, mainly in the form of verb inflections in Turkish. We use clue phrase-based filtering to collect sentences from the Turkish National Corpus (TNC). On the other hand, half of the collection is subject to random word filtering to avoid selection bias due to clue phrases. After the human annotation process, with an inter-annotator agreement of 0.65, we have 5000 sentences, of which 12.8% contain counterfactual statements. Furthermore, we provide a comprehensive baseline of transformer-based models by testing the effect of clue phrases, cross-lingual performance comparisons using the available CFD datasets, and zero-shot cross-lingual classification experiments using fine-tuning on different combinations of the existing datasets. The results confirm that TRCD is compatible with the other CFD datasets. Moreover, fine-tuning a Turkish-specific model (BERTurk) performs better than the multilingual alternatives (mBERT and XLM-R), and BERTurk is more robust to clue phrase masking. This result emphasizes the importance of a language-specific tokenizer for contextual understanding, especially for low-resource languages. Finally, our qualitative analysis gives insights into the errors made by different models.
  • Conference Object
    Information and Communication Technology Sector Strategy Map of Izmir
    (LookUs Scientific, 2013) Tuğlular, Tuğkan; Tekir, Selma; Velibeyoğlu, Koray
    This study aims to understand the current dynamics of Izmir's ICT sector and to map the spatial distribution of its firms. It is based on a series of analyses produced for the Izmir Development Agency in 2012 as part of the preparation of the 2014-2023 Izmir Regional Development Plan. It conducts a Delphi survey to support situation knowledge as well as trend prediction for the next 10 years. Furthermore, a gap analysis is performed to measure the margin between the current situation of the ICT sector and the future trends predicted by experts. The study also maps the location preferences of Izmir's ICT sector based on the Izmir Chamber of Commerce's publicly available web-based database. It shows that the ICT sector's trend is largely based on centripetal, spontaneously developed clusters located in the central part of the city. On the other hand, planned technology regions and science parks are relatively immature and need to be developed. In light of this dichotomy, the study proposes a strategy map for Izmir's ICT sector.