Çok-etiketli Film Türü Sınıflandırması için Türkçe Konu Modellemesi Veri Kümesi (Turkish Topic Modeling Dataset for Multi-label Movie Genre Classification)
Date
2020
Publisher
Institute of Electrical and Electronics Engineers
Open Access Color
Green Open Access
No
Publicly Funded
No
Abstract
Statistical topic modeling aims to assign topics to documents in an unsupervised way. Latent Dirichlet Allocation (LDA) is the standard model for topic modeling. It performs well on collections of relatively long documents but poorly on short texts. Topic modeling on short texts is on the rise due to the potential of social media. Thus, approaches that can find topics in short texts as well as long texts are sought. However, there is a lack of datasets that include both long and short texts sharing the same ground-truth categories. In this work, we release a Turkish movie dataset that contains both short film descriptions and long subscripts, where film genre can be considered as the topic. Furthermore, we provide multi-label movie genre classification results using a Feed Forward Neural Network (FFNN) taking LDA document-topic or Doc2Vec dense representations as input. © 2020 IEEE.
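The multi-label setup described in the abstract can be illustrated with a minimal sketch. It assumes a document has already been reduced to an LDA document-topic vector, and it runs only the forward pass of a small feed-forward network with one sigmoid output per genre, so a film can receive several genres at once. The genre list, layer sizes, weights, and the 0.5 threshold are all hypothetical values chosen for illustration; they are not taken from the paper, and the weights are hand-set rather than trained.

```python
import math

# Hypothetical label set; the actual genre inventory of the dataset is not given here.
GENRES = ["drama", "comedy", "action"]


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W @ inputs + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def predict_genres(doc_topics, params, threshold=0.5):
    """Return (labels, probabilities) for one document-topic vector.

    Each output unit has its own sigmoid, so the genres are scored
    independently instead of competing through a softmax.
    """
    hidden = dense(doc_topics, params["W1"], params["b1"], relu)
    probs = dense(hidden, params["W2"], params["b2"], sigmoid)
    labels = [g for g, p in zip(GENRES, probs) if p >= threshold]
    return labels, probs


# Hand-set illustrative parameters (assumption): 4 topics -> 3 hidden units -> 3 genres.
PARAMS = {
    "W1": [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
    ],
    "b1": [0.0, 0.0, 0.0],
    "W2": [
        [4.0, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 4.0],
    ],
    "b2": [-1.0, -1.0, -1.0],
}

if __name__ == "__main__":
    # A document whose topic mass is split between the first two topics.
    labels, probs = predict_genres([0.5, 0.4, 0.05, 0.05], PARAMS)
    print(labels, [round(p, 3) for p in probs])
```

In practice the weights would be learned from the labeled films, and a Doc2Vec vector could replace the document-topic vector as the input without changing the structure of the network.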
Description
28th Signal Processing and Communications Applications Conference, SIU 2020 -- 5 October 2020 through 7 October 2020
Keywords
Doc2Vec, Feed-forward neural networks, LDA, Long text classification, Short text classification, Text classification dataset
Fields of Science
0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology
OpenCitations Citation Count
3
Source
2020 28th Signal Processing and Communications Applications Conference, SIU 2020 - Proceedings
Start Page
1
End Page
5
PlumX Metrics
Citations
CrossRef : 1
Scopus : 4
Captures
Mendeley Readers : 7
SCOPUS™ Citations
4
checked on Apr 27, 2026
Web of Science™ Citations
2
checked on Apr 27, 2026
Page Views
2746
checked on Apr 27, 2026
Downloads
470
checked on Apr 27, 2026