Sezerer, Erhan

Profile URL

https://hdl.handle.net/11147/16111

Name Variants

Sezerer, E
Sezerer, E.

Main Affiliation

03.04. Department of Computer Engineering

Status

Former Staff

Sustainable Development Goals

Goal	Name	Research Products
1	NO POVERTY	0
2	ZERO HUNGER	0
3	GOOD HEALTH AND WELL-BEING	0
4	QUALITY EDUCATION	1
5	GENDER EQUALITY	0
6	CLEAN WATER AND SANITATION	0
7	AFFORDABLE AND CLEAN ENERGY	0
8	DECENT WORK AND ECONOMIC GROWTH	0
9	INDUSTRY, INNOVATION AND INFRASTRUCTURE	2
10	REDUCED INEQUALITIES	0
11	SUSTAINABLE CITIES AND COMMUNITIES	0
12	RESPONSIBLE CONSUMPTION AND PRODUCTION	0
13	CLIMATE ACTION	0
14	LIFE BELOW WATER	0
15	LIFE ON LAND	0
16	PEACE, JUSTICE AND STRONG INSTITUTIONS	0
17	PARTNERSHIPS FOR THE GOALS	0

Documents

8

Citations

31

h-index

4

No records found in other affiliations.

Scholarly Output

10

Articles

3

Views / Downloads

17563/4227

Supervised MSc Theses

1

Supervised PhD Theses

1

WoS Citation Count

16

Scopus Citation Count

15

Patents

0

Projects

0

WoS Citations per Publication

1.60

Scopus Citations per Publication

1.50

Open Access Source

8

Supervised Theses

2

Journal	Count
13th Linguistic Annotation Workshop (LAW) -- Aug 01, 2019 -- Florence, Italy	1
19th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2018	1
2020 28th Signal Processing and Communications Applications Conference, SIU 2020 - Proceedings	1
27th Signal Processing and Communications Applications Conference, SIU 2019	1
Applied Sciences	1

Scholarly Output Search Results

Now showing 1 - 10 of 10

Citation - Scopus: 6
Gender Prediction From Tweets With Convolutional Neural Networks: Notebook for Pan at Clef 2018
(CEUR Workshop Proceedings, 2018) Sezerer, Erhan; Tekir, Selma; Sevgili, Özge; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
This paper presents a system developed for the author profiling task of PAN at CLEF 2018. The system uses style-based features to predict the gender of each user from their tweets. These features are extracted automatically by Convolutional Neural Networks (CNN). The system rests on the idea that tweets are not equally informative about a user's gender. An attention mechanism is therefore applied to the CNN outputs to give more weight to the tweets that carry more information. Our architecture obtained competitive results on the three languages provided by the PAN 2018 author profiling challenge, with an average accuracy of 75.1% on local runs and 70.23% on the submission run.
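The attention step described in this abstract can be illustrated with a minimal sketch. It assumes that each tweet has already been encoded as a fixed-length feature vector (for instance by a CNN) and has been given a scalar relevance score. The function name `attention_pool` and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def attention_pool(tweet_vectors, scores):
    """Combine per-tweet vectors into one user vector using softmax weights."""
    # Softmax over the scores (subtract the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the tweet vectors, one dimension at a time.
    dim = len(tweet_vectors[0])
    pooled = [
        sum(w * v[d] for w, v in zip(weights, tweet_vectors))
        for d in range(dim)
    ]
    return pooled, weights

# Toy example: three tweets with 2-dimensional features (hypothetical values).
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 2.0]  # the third tweet is assumed to be most informative
user_vector, weights = attention_pool(vectors, scores)
```

In this sketch the tweet with the highest score contributes most to the pooled user vector, which is the intuition behind weighting tweets unequally.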
Citation - Scopus: 1
Türkçe Tweetler Üzerinden Yapay Sinir Ağları ile Cinsiyet Tahminlemesi [Gender Prediction from Turkish Tweets with Neural Networks]
(Institute of Electrical and Electronics Engineers Inc., 2019) Sezerer, Erhan; Tekir, Selma; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Author profiling is the determination of key attributes of an author, such as gender, age, and language, from a text whose author is unknown. It is particularly important in the fields of security and marketing. In this work, users' genders are predicted from their tweets. A model that combines a Recurrent Neural Network (RNN) with an attention mechanism is proposed. To the best of our knowledge, this is the first such study in Turkish with a Twitter dataset. The proposed model was tested on Turkish, English, Spanish, and Arabic and reached accuracies of 80.63, 81.73, 78.22, and 78.5, respectively. These accuracies are state of the art for Turkish and competitive for the other languages.
Citation - WoS: 10
A Turkish Dataset for Gender Identification of Twitter Users
(Assoc Computational Linguistics-ACL, 2019) Tekir, Selma; Sezerer, Erhan; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Author profiling is the identification of an author's gender, age, and language from his/her texts. With the increasing use of Twitter as a means of expressing thought, profiling the gender of an author from his/her tweets has become a challenge. Although several datasets in different languages have been released for this problem, there is still a need for multilingualism. In this work, we propose a dataset of tweets by Turkish Twitter users, labeled with their gender information. The dataset has 3368 users in the training set and 1924 users in the test set, where each user has 100 tweets. The dataset is publicly available.
Citation - WoS: 2
Citation - Scopus: 4
Çok-etiketli Film Türü Sınıflandırması için Türkçe Konu Modellemesi Veri Kümesi [A Turkish Topic Modeling Dataset for Multi-Label Movie Genre Classification]
(Institute of Electrical and Electronics Engineers, 2020) Tekir, Selma; Sezerer, Erhan; Para, Hasan; Polatbilek, Ozan; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Statistical topic modeling aims to assign topics to documents in an unsupervised way. Latent Dirichlet Allocation (LDA) is the standard model for topic modeling. It performs well on collections of relatively long documents but poorly on short texts. Topic modeling on short texts is on the rise due to the potential of social media. Thus, approaches that are able to find topics in short texts as well as long texts are sought. However, there is a lack of datasets that include both long and short texts with the same ground-truth categories. In this work, we release a Turkish movie dataset which contains both short film descriptions and long subscripts, where film genre can be considered as the topic. Furthermore, we provide multi-label movie genre classification results using a Feed Forward Neural Network (FFNN) that takes LDA document-topic or Doc2Vec dense representations as input. © 2020 IEEE.
Citation - WoS: 2
Citation - Scopus: 2
Incorporating Concreteness in Multi-Modal Language Models With Curriculum Learning
(MDPI, 2021) Tekir, Selma; Sezerer, Erhan; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Over the last few years, there has been an increase in the studies that consider experiential (visual) information by building multi-modal language models and representations. It is shown by several studies that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through the text. In this work, the curriculum learning method is used to teach the model concrete/abstract concepts through images and their corresponding captions to accomplish multi-modal language modeling/representation. We use the BERT and Resnet-152 models on each modality and combine them using attentive pooling to perform pre-training on the newly constructed dataset, which is collected from the Wikimedia Commons based on concrete/abstract words. To show the performance of the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: A new dataset is constructed from Wikimedia Commons based on concrete/abstract words, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach contributes to the success of the model.
Discovering Specific Semantic Relations Among Words Using Neural Network Methods
(Izmir Institute of Technology, 2021) Sezerer, Erhan; Tekir, Selma; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Human-level language understanding is one of the oldest challenges in computer science. Much scientific work has been dedicated to finding good representations for semantic units (words, morphemes, characters) in languages. Recently, contextual language models, such as BERT and its variants, have shown great success in downstream natural language processing tasks through the use of masked language modelling and transformer structures. Although these methods solve many problems in this domain and have proved useful, they still lack one crucial aspect of language acquisition in humans: experiential (visual) information. Over the last few years, there has been an increase in studies that consider experiential information by building multi-modal language models and representations. Several studies show that language acquisition in humans starts with learning concrete concepts through images and then continues with learning abstract ideas through text. In this work, the curriculum learning method is used to teach the model concrete/abstract concepts through images and their corresponding captions to accomplish multi-modal language modeling/representation. The BERT and ResNet-152 models are used on each modality, with an attentive pooling mechanism, on the newly constructed dataset collected from Wikimedia Commons. To show the performance of the proposed model, downstream tasks and ablation studies are performed. The contribution of this work is two-fold: a new dataset is constructed from Wikimedia Commons, and a new multi-modal pre-training approach based on curriculum learning is proposed. The results show that the proposed multi-modal pre-training approach increases the success of the model.
Gender Prediction From Tweets: Improving Neural Representations With Hand-Crafted Features
(Cornell University, 2019) Tekir, Selma; Sezerer, Erhan; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
Author profiling is the characterization of an author through key attributes such as gender, age, and language. In this paper, an RNN model with Attention (RNNwA) is proposed to predict the gender of a Twitter user from their tweets. Both word-level and tweet-level attention are used to learn 'where to look'. This model is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, and Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and competitive results on Spanish and Arabic.
News Story Analysis With Credibility Assessment by Opinion Mining
(Izmir Institute of Technology, 2015) Sezerer, Erhan; Tekir, Selma; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
With the growing influence of media and the popularity and widespread use of social networks, the credibility of news sources has become an important subject that needs more attention. The biggest problem in finding credible sources is that, instead of covering every aspect of an incident, news sources tend to accept one party's view as a whole while rejecting all others, or, even worse, focus on only one side of the incident and ignore the rest. Credibility is defined as “The quality of believable and trustworthy”. The notion of trustworthiness can be further decomposed into components such as bias, fairness, and factual/opinionated content. In this thesis, credibility is measured using the fact/opinion ratio of the articles. Two methods are proposed: the traditional Naive Bayes method and the Relativistic method. The intuition behind the relativistic method comes from the theory of relativity: the sentiment of an article is determined relative to the ordinary context people use in daily speech. We tested our methods on four types of data (hand-written articles, editorials, New York Times articles, and Reuters articles) and aimed to show that the proposed models can differentiate the sentiments in the articles. The experimental work provides a detailed evaluation of the results.
Citation - WoS: 1
Citation - Scopus: 1
Author Reputation Measurement on Question and Answer Sites by the Classification of Author-Generated Content
(World Scientific Publishing, 2021) Sezerer, Erhan; Tenekeci, Samet; Tekir, Selma; Baloğlu, Bora; Acar, Ali; 03.04. Department of Computer Engineering; 01. Izmir Institute of Technology; 03. Faculty of Engineering
In the field of software engineering, practitioners' share in the constructed knowledge cannot be underestimated and is mostly in the form of grey literature (GL). GL is a valuable resource, though it is subjective and lacks an objective quality assurance methodology. In this paper, a quality assessment scheme is proposed for question and answer (Q&A) sites. In particular, we target Stack Overflow (SO) and Stack Exchange (SE) sites. We model the problem of author reputation measurement as a classification task on the author-provided answers. The authors' mean, median, and total answer scores are used as inputs for class labeling. State-of-the-art language models (BERT and DistilBERT) with a softmax layer on top are used as classifiers and compared to SVM and random baselines. Our best model achieves 63.8% accuracy in binary classification on the SO design patterns tag and 71.6% accuracy on the SE software engineering category. The superior performance on SE software engineering can be explained by its larger dataset size. In addition to the quantitative evaluation, we provide qualitative evidence that the system's predicted reputation labels match the quality of the provided answers.
Citation - WoS: 1
Citation - Scopus: 1
A Relativistic Opinion Mining Approach To Detect Factual or Opinionated News Sources
(Springer Verlag, 2017) Sezerer, Erhan; Tekir, Selma; 03.04. Department of Computer Engineering; 03. Faculty of Engineering; 01. Izmir Institute of Technology
The credibility of news cannot be isolated from that of its source. Further, it is mainly associated with a news source's trustworthiness and expertise. In an effort to measure the trustworthiness of a news source, the factor of “is factual or opinionated” must be considered among others. In this work, we propose an unsupervised probabilistic lexicon-based opinion mining approach to describe a news source as “being factual or opinionated”. We get words' positive, negative, and objective scores from a sentiment lexicon and normalize these scores using their cumulative distribution. This statistical approach is inspired by relativism, in that each word is evaluated by its difference from the average word. To test the effectiveness of the approach, three different news sources are chosen: editorials, New York Times articles, and Reuters articles, which differ in how opinionated they are. The experimental validation is done by an analysis of variance on these groups of news. The results show that our technique can distinguish the news articles from these groups with respect to “being factual or opinionated” in a statistically significant way.
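The normalization step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses an empirical cumulative distribution over a tiny made-up lexicon, and the names `empirical_cdf` and `positive_scores` as well as the scores themselves are hypothetical rather than taken from the paper.

```python
from bisect import bisect_right

def empirical_cdf(values):
    """Return a function mapping a score to the fraction of values <= score."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return cdf

# Hypothetical lexicon: word -> positive score in [0, 1].
positive_scores = {"good": 0.75, "fine": 0.25, "table": 0.0, "superb": 0.875}

cdf = empirical_cdf(positive_scores.values())
# Each word's normalized score is its rank position within the lexicon.
normalized = {word: cdf(score) for word, score in positive_scores.items()}
```

After this transformation a word's score expresses how it ranks against the other words in the lexicon instead of its raw value, which is one way to read "evaluated by its difference from the average word".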