DGStream: High Quality and Efficiency Stream Clustering Algorithm

dc.contributor.author Ahmed, Rowanda
dc.contributor.author Dalkılıç, Gökhan
dc.contributor.author Erten, Yusuf
dc.coverage.doi 10.1016/j.eswa.2019.112947
dc.date.accessioned 2020-07-18T08:34:04Z
dc.date.available 2020-07-18T08:34:04Z
dc.date.issued 2020
dc.description.abstract As applications produce overwhelming data streams, strategies for analyzing and clustering streaming data have become an urgent and crucial research area for knowledge discovery. The main objective of data stream clustering is to gain insights into incoming data. Recognizing all probable patterns in this unbounded data, which arrives at varying speeds and in varying structures and evolves over time, is very important in this analysis process. Existing data stream clustering strategies all suffer from different limitations, such as the inability to find arbitrarily shaped clusters or to handle outliers, in addition to requiring some parameter information for data processing. To handle all these challenges quickly, accurately, and efficiently, we propose DGStream, a new online-offline grid- and density-based stream clustering algorithm. We conducted many experiments and evaluated the performance of DGStream over different simulated databases and for different parameter settings, considering a wide variety of concept drifts, novelty, evolving data, number and size of clusters, and outlier detection. Our algorithm is suitable for applications where the interest lies in the most recent information, such as the stock market, for cases where analysis of existing information is required, and for cases where both old and recent information are equally important. The experiments, over synthetic and real datasets, show that our proposed algorithm outperforms the other algorithms in efficiency. (C) 2019 Elsevier Ltd. All rights reserved. en_US
dc.identifier.doi 10.1016/j.eswa.2019.112947
dc.identifier.issn 0957-4174
dc.identifier.issn 1873-6793
dc.identifier.scopus 2-s2.0-85072608306
dc.identifier.uri https://doi.org/10.1016/j.eswa.2019.112947
dc.identifier.uri https://hdl.handle.net/11147/8872
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.relation.ispartof Expert Systems with Applications en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject Data streams architectures en_US
dc.subject Data stream mining en_US
dc.subject Grid-based clustering en_US
dc.subject Density-based clustering en_US
dc.subject Online clustering en_US
dc.title DGStream: High Quality and Efficiency Stream Clustering Algorithm en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.institutional Ahmed, Rowanda
gdc.author.institutional Erten, Yusuf
gdc.bip.impulseclass C4
gdc.bip.influenceclass C4
gdc.bip.popularityclass C4
gdc.coar.access open access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department İzmir Institute of Technology. Computer Engineering en_US
gdc.description.publicationcategory Makale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı en_US
gdc.description.scopusquality Q1
gdc.description.volume 141 en_US
gdc.description.wosquality Q1
gdc.identifier.openalex W2972800883
gdc.identifier.wos WOS:000496334800028
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 13.0
gdc.oaire.influence 5.0742464E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 1.7078632E-8
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration National
gdc.openalex.fwci 1.53617746
gdc.openalex.normalizedpercentile 0.87
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 15
gdc.plumx.crossrefcites 21
gdc.plumx.mendeley 50
gdc.plumx.scopuscites 22
gdc.scopus.citedcount 22
gdc.wos.citedcount 16
relation.isAuthorOfPublication.latestForDiscovery e2a193e6-84b8-4ea0-bee1-71ccc62559b3
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4014-8abe-a4dfe192da5e

Files

Original bundle

Name: 1-s2.0-S0957417419306657-main.pdf
Size: 2.19 MB
Format: Adobe Portable Document Format