Çok-etiketli Film Türü Sınıflandırması için Türkçe Konu Modellemesi Veri Kümesi

dc.contributor.author Jabrayilzade, Elgün
dc.contributor.author Poyraz Arslan, Algın
dc.contributor.author Para, Hasan
dc.contributor.author Polatbilek, Ozan
dc.contributor.author Sezerer, Erhan
dc.contributor.author Tekir, Selma
dc.date.accessioned 2021-11-06T09:27:14Z
dc.date.available 2021-11-06T09:27:14Z
dc.date.issued 2020
dc.description 28th Signal Processing and Communications Applications Conference, SIU 2020 -- 5 October 2020 through 7 October 2020 en_US
dc.description.abstract Statistical topic modeling aims to assign topics to documents in an unsupervised way. Latent Dirichlet Allocation (LDA) is the standard model for topic modeling. It performs well on collections of relatively long documents but poorly on short texts. Topic modeling on short texts is on the rise due to the potential of social media. Thus, approaches that can find topics in both short and long texts are sought. However, there is a lack of datasets that include both long and short texts sharing the same ground-truth categories. In this work, we release a Turkish movie dataset which contains both short film descriptions and long subscripts, where film genre can be considered as the topic. Furthermore, we provide multi-label movie genre classification results using a Feed Forward Neural Network (FFNN) taking LDA document-topic or Doc2Vec dense representations. © 2020 IEEE. en_US
dc.identifier.doi 10.1109/SIU49456.2020.9302027
dc.identifier.isbn 9781728172064
dc.identifier.scopus 2-s2.0-85100310802
dc.identifier.uri http://doi.org/10.1109/SIU49456.2020.9302027
dc.identifier.uri https://hdl.handle.net/11147/11267
dc.language.iso tr en_US
dc.publisher Institute of Electrical and Electronics Engineers en_US
dc.relation.ispartof 2020 28th Signal Processing and Communications Applications Conference, SIU 2020 - Proceedings en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Doc2Vec en_US
dc.subject Feed-forward neural networks en_US
dc.subject LDA en_US
dc.subject Long text classification en_US
dc.subject Short text classification en_US
dc.subject Text classification dataset en_US
dc.title Çok-etiketli Film Türü Sınıflandırması için Türkçe Konu Modellemesi Veri Kümesi en_US
dc.title.alternative A Turkish Topic Modeling Dataset for Multi-Label Classification of Movie Genre en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access open access
gdc.coar.type text::conference output
gdc.collaboration.industrial false
gdc.description.department İzmir Institute of Technology. Computer Engineering en_US
gdc.description.endpage 5
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.startpage 1
gdc.identifier.openalex W3120727248
gdc.identifier.wos WOS:000653136100001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 1.0
gdc.oaire.influence 2.7321216E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 3.548593E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.fwci 0.14685955
gdc.openalex.normalizedpercentile 0.58
gdc.opencitations.count 3
gdc.plumx.crossrefcites 1
gdc.plumx.mendeley 7
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.wos.citedcount 2
relation.isAuthorOfPublication.latestForDiscovery 57639474-3954-4f77-a84c-db8a079648a8
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4014-8abe-a4dfe192da5e

Files

Original bundle

Name:
A_Turkish_Topic.pdf
Size:
223.29 KB
Format:
Adobe Portable Document Format