Computer Engineering / Bilgisayar Mühendisliği

Permanent URI for this collection: https://hdl.handle.net/11147/10

  • Conference Object
    Citation - WoS: 1
    Artist Recommendation Based on Association Rule Mining and Community Detection
    (SCITEPRESS, 2021) Çiftçi, Okan; Tenekeci, Samet; Ülgentürk, Ceren
    Recent advances in the web have greatly increased the accessibility of music streaming platforms and the amount of consumable audio content. This has made automated recommendation systems a necessity for listeners and streaming platforms alike. Therefore, a wide variety of predictive models have been designed to identify related artists and music collections. In this paper, we proposed a graph-based approach that utilizes association rules extracted from Spotify playlists. We constructed several artist networks and identified related artist clusters using Louvain and Label Propagation community detection algorithms. We analyzed internal and external cluster agreements based on different validation criteria. As a result, we achieved up to 99.38% internal and 90.53% external agreements between our models and Spotify's related artist lists. These results show that integrating association rule mining concepts with graph databases can be a novel and effective way to design an artist recommendation system.
  • Conference Object
    Citation - Scopus: 2
    Comparison of Dynamic Itemset Mining Algorithms for Multiple Support Thresholds
    (Association for Computing Machinery (ACM), 2017) Abuzayed, Nourhan; Ergenç, Belgin
    Mining frequent itemsets is an important part of the association rule mining process. Handling the dynamic nature of databases and the multiple support threshold requirements of items are two important challenges for frequent itemset mining algorithms. Most existing dynamic itemset mining algorithms are devised for a single support threshold, whereas multiple support threshold algorithms are static. This work focuses on the dynamic update problem of frequent itemsets under multiple support thresholds and proposes the tree-based Dynamic CFP-Growth++ algorithm. The proposed algorithm is compared with our previous dynamic algorithm, Dynamic MIS [50], and a recent static algorithm, CFP-Growth++ [2]. The findings for a dynamic database are: 1) both dynamic algorithms perform better than the static algorithm CFP-Growth++; 2) in terms of memory usage, Dynamic CFP-Growth++ performs better than Dynamic MIS; 3) in terms of execution time, Dynamic MIS performs better than Dynamic CFP-Growth++. In short, Dynamic CFP-Growth++ and Dynamic MIS trade off memory usage against execution time.
  • Conference Object
    Citation - Scopus: 16
    Comparison of Two Association Rule Mining Algorithms Without Candidate Generation
    (ACTA Press, 2010) Yıldız, Barış; Ergenç, Belgin
    Association rule mining techniques play an important role in data mining research, where the aim is to find interesting correlations among sets of items in databases. Although the Apriori algorithm is the one that boosted data mining research, it has a bottleneck in its candidate generation phase, which requires multiple passes over the source data. FP-Growth and Matrix Apriori are two algorithms that overcome this bottleneck by keeping the frequent itemsets in compact data structures, eliminating the need for candidate generation. To our knowledge, no existing work compares these two similar algorithms with a focus on their performance in different phases of execution. In this study, we compare the Matrix Apriori and FP-Growth algorithms. Two case studies analyzing the algorithms phase by phase are carried out using two synthetic datasets, generated in order i) to see their performance on datasets with different characteristics, and ii) to understand the causes of performance differences in different phases. Our findings are: i) the performance of the algorithms depends on the characteristics of the given dataset and the threshold value; ii) Matrix Apriori outperforms FP-Growth in total performance for threshold values below 10%; iii) although building the matrix data structure has a higher cost, finding itemsets is faster.
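
The candidate generation bottleneck mentioned in the last abstract can be illustrated with a minimal, textbook-style Apriori sketch. This is not code from any of the listed papers; the function name `apriori`, the `min_count` parameter (an absolute support count), and the returned `passes` counter are assumptions made for illustration. It shows that each itemset size requires one more full pass over the transactions.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise frequent itemset mining (illustrative sketch only).

    Returns (frequent, passes), where `frequent` maps each frequent
    itemset (a frozenset) to its support count and `passes` is the
    number of full scans made over the transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    passes = 0
    k = 1
    while candidates:
        # One full pass over the data per level: this is the bottleneck.
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, passes
```

For example, calling `apriori` on five small transactions over the items `a`, `b`, `c` with `min_count=3` scans the data three times, once for each candidate size. Structures such as the FP-tree or the Matrix Apriori matrix are designed to avoid this repeated scanning.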