A Turkish Dataset for Gender Identification of Twitter Users

Date

2019

Publisher

Assoc Computational Linguistics-ACL

Abstract

Author profiling is the task of identifying an author's gender, age, and language from their texts. As Twitter has become a common platform for expressing opinions, inferring an author's gender from their tweets has become an active research challenge. Although several datasets in different languages have been released for this problem, more languages still need to be covered. In this work, we present a dataset of tweets from Turkish Twitter users, each labeled with the user's gender. The dataset has 3368 users in the training set and 1924 users in the test set, and each user has 100 tweets. The dataset is publicly available (1).
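The split sizes quoted in the abstract imply an overall dataset size that is easy to work out. The following is a minimal sketch of that arithmetic only; the constant and function names are illustrative and do not come from the paper or its released files.

```python
# Split sizes as stated in the abstract; all names here are illustrative.
TRAIN_USERS = 3368
TEST_USERS = 1924
TWEETS_PER_USER = 100


def dataset_summary(train_users, test_users, tweets_per_user):
    """Derive user and tweet totals from the per-split user counts."""
    total_users = train_users + test_users
    return {
        "total_users": total_users,
        "train_tweets": train_users * tweets_per_user,
        "test_tweets": test_users * tweets_per_user,
        "total_tweets": total_users * tweets_per_user,
        # Share of users that fall in the training split.
        "train_fraction": train_users / total_users,
    }


summary = dataset_summary(TRAIN_USERS, TEST_USERS, TWEETS_PER_USER)
print(summary)
```

Under these figures the dataset covers 5292 users and 529,200 tweets, with roughly 64% of users in the training set.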

WoS Q

N/A

Scopus Q

N/A

Source

13th Linguistic Annotation Workshop (LAW) -- Aug 01, 2019 -- Florence, Italy

Start Page

203

End Page

207
Web of Science™ Citations

10

checked on Apr 27, 2026

Page Views

30

checked on Apr 27, 2026
