Master Degree / Yüksek Lisans Tezleri
Permanent URI for this collection: https://hdl.handle.net/11147/3008
Master Thesis
Development of Framework for Frequent Itemset Mining Under Multiple Support Thresholds
(Izmir Institute of Technology, 2016) Darrab, Sadeq Hussein Saleh; Ergenç Bostanoğlu, Belgin
Frequent pattern mining is an essential data mining method used to extract interesting patterns from massive databases. Traditional methods use a single minimum support threshold to find the complete set of frequent patterns. However, in real-world applications a single minimum support threshold is not adequate, since it does not reflect the nature of each item and causes the so-called rare item problem. Recently, several methods have been studied that tackle this problem by avoiding a single minimum support threshold. They take the nature of each item into account by assigning different minimum support thresholds to different items. In this way, the complete set of frequent patterns is generated without creating uninteresting patterns or losing substantial ones. In this thesis, we propose an efficient method, the Multiple Item Support Frequent Pattern growth algorithm (MISFP-growth), to mine the complete set of frequent patterns under multiple item support thresholds. In this method, a Multiple Item Support Frequent Pattern tree (MISFP-Tree) is constructed to store all information needed to mine frequent patterns. Since the MISFP-Tree is constructed with respect to the minimum of the multiple item support values, pruning and reconstruction phases are not required. To show the efficiency of the proposed method, it is compared with a recent tree-based algorithm, CFP-growth++. To evaluate its performance, various experiments are conducted on both real and synthetic datasets. Experimental results reveal that MISFP-growth outperforms the previous algorithm in terms of execution time, memory usage, and scalability.

Master Thesis
Development of an Application for Dynamic Itemset Mining Under Multiple Support Thresholds
(Izmir Institute of Technology, 2016) Abuzayed, Nourhan; Ergenç Bostanoğlu, Belgin
Handling the dynamic aspect of databases and the multiple support threshold requirement of items are two important challenges for frequent itemset mining algorithms. Frequent itemsets should be updated when the database is updated, without re-running the mining algorithm. A frequent itemset mining algorithm should also consider different support thresholds in order not to cause the rare item problem. Existing dynamic itemset mining algorithms are devised for a single support threshold, whereas multiple support threshold algorithms are static. This thesis focuses on the dynamic update problem of frequent itemsets under multiple support thresholds and introduces the Dynamic MIS1 and Dynamic MIS2 algorithms. They i) are tree-based and scan the database once, ii) consider multiple support thresholds, and iii) handle increments of additions, additions with new items, and deletions. The proposed algorithms are compared to CFP-Growth++, with the following findings. In static databases: 1) Dynamic MIS1 achieves up to a 5 times speed-up over CFP-Growth++, since it does not require tree pruning and merging; 2) the execution times of Dynamic MIS2 and CFP-Growth++ are similar; 3) the memory usage of Dynamic MIS1 is higher than that of CFP-Growth++, since it keeps the whole tree in memory. In dynamic databases: 1) Dynamic MIS1 and Dynamic MIS2 perform better than CFP-Growth++, since they run only on increments; 2) Dynamic MIS1 can achieve a speed-up of 56 times over CFP-Growth++, whereas the speed-up of Dynamic MIS2 cannot exceed 2 times; 3) Dynamic MIS2 is slightly better than CFP-Growth++ as long as the increment size is below 85% when the database is large and sparse, and below 25% when the database is small and dense.
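
Both abstracts above rest on the idea of giving each item its own minimum support threshold so that rare items are not lost. The following Python sketch illustrates that idea only; it is not MISFP-growth, Dynamic MIS1/MIS2, or CFP-growth++. It assumes the definition commonly used for multiple item supports, namely that an itemset is frequent when its support reaches the smallest threshold among its items. The function name `frequent_itemsets_mis`, the toy transactions, and the thresholds (given as absolute counts) are all hypothetical, and the enumeration is deliberately brute force.

```python
from itertools import combinations


def frequent_itemsets_mis(transactions, mis):
    """Brute-force illustration of mining with per-item minimum supports.

    transactions: list of sets of items.
    mis: dict mapping each item to its minimum support, as an absolute count
         (an assumption made here to avoid floating-point comparisons).
    Returns a dict mapping each frequent itemset (sorted tuple) to its count.
    """
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    # Every candidate is checked: with per-item thresholds, a subset of a
    # frequent itemset is not guaranteed to be frequent, so the usual
    # Apriori-style pruning is not applied in this sketch.
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            # Assumed criterion: compare against the smallest item threshold.
            threshold = min(mis[item] for item in candidate)
            if count >= threshold:
                frequent[candidate] = count
    return frequent


# Hypothetical toy data: "caviar" is rare but potentially interesting.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk"},
    {"bread", "caviar"},
    {"milk"},
]

# One threshold for every item versus a lower threshold for the rare item.
single = frequent_itemsets_mis(transactions, {"bread": 4, "milk": 4, "caviar": 4})
with_mis = frequent_itemsets_mis(transactions, {"bread": 4, "milk": 4, "caviar": 1})

print(sorted(single))
print(sorted(with_mis))
```

With the single threshold, every itemset containing "caviar" is dropped, which is the rare item problem the abstracts describe. With the lower per-item threshold, "caviar" and the pair ("bread", "caviar") are retained while the unsupported pair ("caviar", "milk") is still rejected.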
