Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection
Permanent URI for this collection: https://hdl.handle.net/11147/7148
Article
Citation - WoS: 1, Citation - Scopus: 2
Dynamic Itemset Hiding Algorithm for Multiple Sensitive Support Thresholds (IGI Global, 2018)
Öztürk, Ahmet Cumhur; Ergenç, Belgin
This article describes how association rule mining is used to extract relations between items in transactional databases and how it benefits decision-making. However, association rule mining can threaten the privacy of knowledge when data is shared without hiding the data owner's confidential association rules. One way to hide an association rule in a database is to conceal the itemsets (co-occurring items) from which the sensitive association rules are generated. These sensitive itemsets are sanitized by itemset hiding processes. Most existing solutions consider a single support threshold and assume that databases are static, which is not true in real life. In this article, the authors propose a novel itemset hiding algorithm that is designed for dynamic database environments and supports multiple itemset support thresholds. The algorithm's performance is compared with that of two dynamic algorithms on six different databases. Findings show that the proposed dynamic algorithm is more efficient in terms of execution time and information loss and is guaranteed to hide all sensitive itemsets.

Conference Object
Citation - Scopus: 4
Itemset Hiding Under Multiple Sensitive Support Thresholds (SCITEPRESS, 2017)
Öztürk, Ahmet Cumhur; Ergenç Bostanoğlu, Belgin
Itemset mining is the challenging step of association rule mining that aims to extract patterns among items from transactional databases. When itemset mining is applied to data shared among organizations, each party needs to hide its sensitive knowledge before global knowledge is extracted for mutual benefit. Ensuring the privacy of the sensitive itemsets is not the only challenge in the itemset hiding process; the distortion introduced to the non-sensitive knowledge and data should also be kept to a minimum.
Most previous work on itemset hiding allows the database owner to assign only a single sensitive threshold to all sensitive itemsets; however, itemsets may differ in count and utility. In this paper we propose a new heuristic-based hiding algorithm which 1) allows the database owner to assign multiple sensitive threshold values to sensitive itemsets, 2) hides all user-defined sensitive itemsets, and 3) uses heuristics that minimize loss of information and distortion of the shared database. To speed up the hiding steps, we represent the database as a Pseudo Graph and perform scan operations on this data structure rather than on the actual database. The performance of our algorithm, Pseudo Graph Based Sanitization (PGBS), is evaluated on 4 real databases. Distortion of the non-sensitive itemsets (information loss), distortion of the shared data (distance), and execution time are measured in comparison with three similar algorithms. Experimental results show that PGBS is competitive in terms of execution time and distortion and achieves reasonable information loss relative to the other algorithms. © 2017 by SCITEPRESS - Science and Technology Publications, Lda. All rights reserved.

Article
Citation - WoS: 3, Citation - Scopus: 4
Full-Exact Approach for Frequent Itemset Hiding (IGI Global Publishing, 2015)
Ayav, Tolga; Ergenç, Belgin
This paper proposes a novel, exact approach that relies on integer programming for association rule hiding. A wide range of solutions exists for the complex problem of itemset hiding, from practical heuristic approaches to more accurate exact approaches. Exact approaches provide better solutions but suffer from poor performance, and existing exact approaches still augment their methods with heuristics to make the problem solvable. In that case, the solution may not be optimal. This work presents a full-exact method, without any need for heuristics.
Extensive tests are conducted on 10 real datasets to analyze the algorithm's distance and information loss in comparison with an earlier, similar algorithm. Since the approach provides the optimal solution to the problem, it should be considered a reference method.
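
The abstracts above share one basic operation: a sensitive itemset counts as hidden once its support in the shared database falls below the threshold assigned to it. The sketch below illustrates only that idea with a naive greedy sanitizer, assuming one integer threshold per sensitive itemset. It is not PGBS, the dynamic algorithm, or the integer-programming method from these publications; the names `support` and `hide_itemsets`, the victim-selection rule, and the sample data are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def support(db: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def hide_itemsets(
    db: List[Set[str]], thresholds: Dict[FrozenSet[str], int]
) -> List[Set[str]]:
    """Return a sanitized copy of `db` in which each sensitive itemset
    has support strictly below its own threshold (hypothetical helper).

    Naive rule (an assumption, not from the publications): walk the
    transactions in order and delete one item of the itemset from each
    supporting transaction until the itemset is hidden. No attempt is
    made to minimize information loss or distance.
    """
    sanitized = [set(t) for t in db]  # leave the caller's data untouched
    for itemset, threshold in thresholds.items():
        for t in sanitized:
            if support(sanitized, itemset) < threshold:
                break  # this itemset is already hidden
            if itemset <= t:
                # Deterministic victim choice: smallest item by name.
                t.discard(min(itemset))
    return sanitized


# Hypothetical sample data: five transactions, two sensitive itemsets,
# each with its own sensitive support threshold.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
thresholds = {frozenset({"a", "b"}): 2, frozenset({"b", "c"}): 3}
sanitized = hide_itemsets(db, thresholds)
for itemset, threshold in thresholds.items():
    print(sorted(itemset), support(db, itemset), support(sanitized, itemset), threshold)
```

Deleting items can only lower supports, so an itemset hidden early in the loop stays hidden while later itemsets are processed. The publications listed above go further by choosing which items and transactions to modify so that non-sensitive itemsets and the data itself are distorted as little as possible.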
