Author Reputation Measurement on Question and Answer Sites by the Classification of Author-Generated Content
Authors
Sezerer, Erhan
Tenekeci, Samet
Acar, Ali
Baloğlu, Bora
Tekir, Selma
Green Open Access
No
Publicly Funded
No
Abstract
In software engineering, practitioners' share in the constructed knowledge should not be underestimated, and it mostly takes the form of grey literature (GL). GL is a valuable resource, but it is subjective and lacks an objective quality assurance methodology. In this paper, we propose a quality assessment scheme for question and answer (Q&A) sites, targeting Stack Overflow (SO) and Stack Exchange (SE) in particular. We model author reputation measurement as a classification task on author-provided answers. Each author's mean, median, and total answer scores are used as inputs for class labeling. State-of-the-art language models (BERT and DistilBERT) with a softmax layer on top serve as classifiers and are compared to SVM and random baselines. Our best model achieves 63.8% accuracy in binary classification on the SO design patterns tag and 71.6% accuracy on the SE software engineering category. The superior performance on SE software engineering can be explained by its larger dataset size. In addition to the quantitative evaluation, we provide qualitative evidence that the system's predicted reputation labels match the quality of the provided answers.
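The class-labeling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that mean, median, and total answer scores feed the labeling; the split rule used here (label 1 when an author's statistic is above the median across all authors) is an assumption for illustration, not the paper's documented procedure. The function names `author_statistics` and `binary_labels` and the toy data are hypothetical.

```python
from statistics import mean, median


def author_statistics(answer_scores):
    """Summarise each author's answer scores as mean, median, and total."""
    return {
        author: {
            "mean": mean(scores),
            "median": median(scores),
            "total": sum(scores),
        }
        for author, scores in answer_scores.items()
        if scores  # skip authors without any scored answer
    }


def binary_labels(stats, key="mean"):
    """Assumed rule: label 1 if the chosen statistic exceeds the
    median of that statistic over all authors, otherwise 0."""
    threshold = median(s[key] for s in stats.values())
    return {author: int(s[key] > threshold) for author, s in stats.items()}


# Hypothetical toy data: author -> scores of their answers.
answer_scores = {
    "alice": [10, 4, 7],
    "bob": [0, 1],
    "carol": [3, 3, 3, 3],
    "dave": [25],
}

stats = author_statistics(answer_scores)
labels = binary_labels(stats, key="mean")
print(labels)
```

Switching `key` to `"median"` or `"total"` yields the alternative labelings; the resulting labels would then be attached to each author's answer texts as targets for a text classifier.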
Keywords
Author reputation measurement, Question and answer sites, Stack exchange, Stack overflow, Text classification
Fields of Science
0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology
OpenCitations Citation Count
1
Volume
31
Issue
10
Start Page
1421
End Page
1445
PlumX Metrics
Citations
Scopus : 1
Captures
Mendeley Readers : 3