Sezerer, Erhan

Name Variants
Sezerer, E
Sezerer, E.
Main Affiliation
03.04. Department of Computer Engineering
Status
Former Staff

Sustainable Development Goals

1. NO POVERTY: 0 Research Products
2. ZERO HUNGER: 0 Research Products
3. GOOD HEALTH AND WELL-BEING: 0 Research Products
4. QUALITY EDUCATION: 1 Research Products
5. GENDER EQUALITY: 0 Research Products
6. CLEAN WATER AND SANITATION: 0 Research Products
7. AFFORDABLE AND CLEAN ENERGY: 0 Research Products
8. DECENT WORK AND ECONOMIC GROWTH: 0 Research Products
9. INDUSTRY, INNOVATION AND INFRASTRUCTURE: 2 Research Products
10. REDUCED INEQUALITIES: 0 Research Products
11. SUSTAINABLE CITIES AND COMMUNITIES: 0 Research Products
12. RESPONSIBLE CONSUMPTION AND PRODUCTION: 0 Research Products
13. CLIMATE ACTION: 0 Research Products
14. LIFE BELOW WATER: 0 Research Products
15. LIFE ON LAND: 0 Research Products
16. PEACE, JUSTICE AND STRONG INSTITUTIONS: 0 Research Products
17. PARTNERSHIPS FOR THE GOALS: 0 Research Products
Documents: 8
Citations: 31
h-index: 4

Scholarly Output: 10
Articles: 3
Views / Downloads: 17563 / 4227
Supervised MSc Theses: 1
Supervised PhD Theses: 1
WoS Citation Count: 16
Scopus Citation Count: 15
Patents: 0
Projects: 0
WoS Citations per Publication: 1.60
Scopus Citations per Publication: 1.50
Open Access Source: 8
Supervised Theses: 2

Journal | Count
13th Linguistic Annotation Workshop (LAW) -- Aug 01, 2019 -- Florence, Italy | 1
19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018 | 1
2020 28th Signal Processing and Communications Applications Conference, SIU 2020 - Proceedings | 1
27th Signal Processing and Communications Applications Conference, SIU 2019 | 1
Applied Sciences | 1


Scholarly Output Search Results

Now showing 1 - 10 of 10
  • Conference Object
    Citation - Scopus: 6
    Gender Prediction From Tweets With Convolutional Neural Networks: Notebook for PAN at CLEF 2018
    (CEUR Workshop Proceedings, 2018) Sezerer, Erhan; Polatbilek, Ozan; Sevgili, Özge; Tekir, Selma
    This paper presents a system developed for the author profiling task of PAN at CLEF 2018. The system utilizes style-based features to predict the gender of each user from their tweets. These features are automatically extracted by Convolutional Neural Networks (CNN). The system rests on the idea that not every tweet is equally informative about a user's gender. Thus, an attention mechanism is applied to the CNN outputs to pick out the tweets that carry more information. Our architecture obtained competitive results on the three languages provided by the PAN 2018 author profiling challenge, with an average accuracy of 75.1% on local runs and 70.23% on the submission run.
  • Conference Object
    Citation - Scopus: 1
    Türkçe Tweetler Üzerinden Yapay Sinir Ağları ile Cinsiyet Tahminlemesi [Gender Prediction From Turkish Tweets With Neural Networks]
    (Institute of Electrical and Electronics Engineers Inc., 2019) Sezerer, Erhan; Polatbilek, Ozan; Tekir, Selma
    Author profiling is the determination of key attributes of an author, such as gender, age, and language, from a text whose author is unknown. It is particularly important in the fields of security and marketing. In this study, users' genders are predicted from their tweets. A model combining a Recurrent Neural Network (RNN) and an attention mechanism is proposed. To the best of our knowledge, this is the first such study in Turkish with a Twitter dataset. The proposed model was tested on Turkish, English, Spanish, and Arabic and reached accuracy values of 80.63, 81.73, 78.22, and 78.5, respectively. These accuracy values represent state-of-the-art performance in Turkish and competitive performance in the other languages.
  • Conference Object
    Citation - WoS: 10
    A Turkish Dataset for Gender Identification of Twitter Users
    (Assoc Computational Linguistics-ACL, 2019) Sezerer, Erhan; Polatbilek, Ozan; Tekir, Selma
    Author profiling is the identification of an author's gender, age, and language from his/her texts. With the increasing trend of using Twitter as a means to express thought, profiling the gender of an author from his/her tweets has become a challenge. Although several datasets in different languages have been released on this problem, there is still a need for multilingualism. In this work, we propose a dataset of tweets of Turkish Twitter users which are labeled with their gender information. The dataset has 3368 users in the training set and 1924 users in the test set where each user has 100 tweets. The dataset is publicly available.
  • Conference Object
    Citation - WoS: 2
    Citation - Scopus: 4
    Çok-etiketli Film Türü Sınıflandırması için Türkçe Konu Modellemesi Veri Kümesi [A Turkish Topic Modeling Dataset for Multi-Label Classification of Movie Genre]
    (Institute of Electrical and Electronics Engineers, 2020) Jabrayilzade, Elgün; Poyraz Arslan, Algın; Para, Hasan; Polatbilek, Ozan; Sezerer, Erhan; Tekir, Selma
    Statistical topic modeling aims to assign topics to documents in an unsupervised way. Latent Dirichlet Allocation (LDA) is the standard model for topic modeling. It performs well on collections of relatively long documents but poorly on short texts. Topic modeling on short texts is on the rise due to the potential of social media. Thus, approaches that are able to find topics in short texts as well as long texts are sought. However, there is a lack of datasets that include both long and short texts with the same ground-truth categories. In this work, we release a Turkish movie dataset that contains both short film descriptions and long subscripts, where film genre can be considered as the topic. Furthermore, we provide multi-label movie genre classification results using a Feed Forward Neural Network (FFNN) taking LDA document-topic or Doc2Vec dense representations.
  • Article
    Citation - WoS: 2
    Citation - Scopus: 2
    Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning
    (MDPI, 2021) Sezerer, Erhan; Tekir, Selma
    Over the last few years, there has been an increase in studies that consider experiential (visual) information by building multi-modal language models and representations. Several studies have shown that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through text. In this work, the curriculum learning method is used to teach the model concrete/abstract concepts through images and their corresponding captions to accomplish multi-modal language modeling/representation. We use the BERT and ResNet-152 models on each modality and combine them using attentive pooling to perform pre-training on the newly constructed dataset, which is collected from the Wikimedia Commons based on concrete/abstract words. To show the performance of the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons based on concrete/abstract words, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach contributes to the success of the model.
  • Doctoral Thesis
    Discovering Specific Semantic Relations Among Words Using Neural Network Methods
    (Izmir Institute of Technology, 2021) Sezerer, Erhan; Tekir, Selma
    Human-level language understanding is one of the oldest challenges in computer science. Much scientific work has been dedicated to finding good representations for semantic units (words, morphemes, characters) in languages. Recently, contextual language models, such as BERT and its variants, have shown great success in downstream natural language processing tasks with the use of masked language modelling and transformer structures. Although these methods solve many problems in this domain and have proved useful, they still lack one crucial aspect of language acquisition in humans: experiential (visual) information. Over the last few years, there has been an increase in studies that consider experiential information by building multi-modal language models and representations. Several studies have shown that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through text. In this work, the curriculum learning method is used to teach the model concrete/abstract concepts through the use of images and corresponding captions to accomplish the task of multi-modal language modeling/representation. The BERT and ResNet-152 models are used on each modality with an attentive pooling mechanism on the newly constructed dataset, collected from the Wikimedia Commons. To show the performance of the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach increases the success of the model.
  • Article
    Gender Prediction From Tweets: Improving Neural Representations With Hand-Crafted Features
    (Cornell University, 2019) Tekir, Selma; Sezerer, Erhan; Polatbilek, Ozan
    Author profiling is the characterization of an author through some key attributes such as gender, age, and language. In this paper, an RNN model with Attention (RNNwA) is proposed to predict the gender of a Twitter user using their tweets. Both word-level and tweet-level attentions are utilized to learn 'where to look'. This model is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, and Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and has competitive results on Spanish and Arabic.
  • Master Thesis
    News Story Analysis With Credibility Assessment by Opinion Mining
    (Izmir Institute of Technology, 2015) Sezerer, Erhan; Tekir, Selma
    With the growing influence of media and the popularity and widespread use of social networks, the credibility of news sources has become an important subject that needs more attention. The biggest problem in finding credible sources is that, instead of covering every aspect of an incident, news sources tend to accept one party's view as a whole while rejecting all others, or even worse, they focus on only one side of the incident and ignore the rest. Credibility is defined as “The quality of believable and trustworthy”. The notion of trustworthiness can further be decomposed into components like bias, fairness, factual/opinionated, etc. In this thesis, credibility is measured using the fact/opinion ratio of the articles. Two methods are proposed: the traditional Naive Bayes method and the Relativistic method. The intuition behind the Relativistic method comes from the theory of relativity: the sentiment of an article is determined relative to the ordinary context used by people in daily speech. We tested our methods on four different types of data (hand-written articles, editorials, New York Times articles, and Reuters articles) and aimed to show that our proposed models are able to differentiate the sentiments in the articles. In the experimental work, we provide a detailed evaluation of the results.
  • Article
    Citation - WoS: 1
    Citation - Scopus: 1
    Author Reputation Measurement on Question and Answer Sites by the Classification of Author-Generated Content
    (World Scientific Publishing, 2021) Sezerer, Erhan; Tenekeci, Samet; Acar, Ali; Baloğlu, Bora; Tekir, Selma
    In the field of software engineering, practitioners' share in the constructed knowledge cannot be underestimated and is mostly in the form of grey literature (GL). GL is a valuable resource, though it is subjective and lacks an objective quality assurance methodology. In this paper, a quality assessment scheme is proposed for question and answer (Q&A) sites. In particular, we target Stack Overflow (SO) and Stack Exchange (SE) sites. We model the problem of author reputation measurement as a classification task on the author-provided answers. The authors' mean, median, and total answer scores are used as inputs for class labeling. State-of-the-art language models (BERT and DistilBERT) with a softmax layer on top are utilized as classifiers and compared to SVM and random baselines. Our best model achieves 63.8% accuracy in binary classification on the SO design patterns tag and 71.6% accuracy on the SE software engineering category. The superior performance on SE software engineering can be explained by its larger dataset size. In addition to the quantitative evaluation, we provide qualitative evidence that the system's predicted reputation labels match the quality of the provided answers.
  • Conference Object
    Citation - WoS: 1
    Citation - Scopus: 1
    A Relativistic Opinion Mining Approach To Detect Factual or Opinionated News Sources
    (Springer Verlag, 2017) Sezerer, Erhan; Tekir, Selma
    The credibility of news cannot be isolated from that of its source. Further, it is mainly associated with a news source’s trustworthiness and expertise. In an effort to measure the trustworthiness of a news source, the factor of “is factual or opinionated” must be considered among others. In this work, we propose an unsupervised probabilistic lexicon-based opinion mining approach to describe a news source as “being factual or opinionated”. We get words’ positive, negative, and objective scores from a sentiment lexicon and normalize these scores through the use of their cumulative distribution. This statistical approach is inspired by relativism: each word is evaluated by its difference from the average word. To test the effectiveness of the approach, three different news sources are chosen: editorials, New York Times articles, and Reuters articles, which differ in how opinionated they are. The experimental validation is done by an analysis of variance on these different groups of news. The results show that our technique can distinguish the news articles from these groups with respect to “being factual or opinionated” in a statistically significant way.