Computer Engineering / Bilgisayar Mühendisliği

Permanent URI for this collection: https://hdl.handle.net/11147/10


Citation - Scopus: 3
Integrated Approach for Privacy Preserving Itemset Mining
(Springer, 2012) Yıldız, Barış; Ergenç, Belgin
In this work, we propose an integrated itemset hiding algorithm that eliminates the need for pre-mining and post-mining and uses a simple heuristic to select the itemset, and the item within it, for distortion. The base algorithm (Matrix-Apriori) works without candidate generation, which increases efficiency. Performance evaluation demonstrates (1) the side effects (lost itemsets) and execution time as the number of sensitive itemsets and the support of the itemsets increase and (2) the speed-up gained by integrating post-mining. © 2012 Springer Science+Business Media, LLC.
Citation - WoS: 1
Citation - Scopus: 2
Dynamic Itemset Hiding Algorithm for Multiple Sensitive Support Thresholds
(IGI Global, 2018) Öztürk, Ahmet Cumhur; Ergenç, Belgin
This article describes how association rule mining is used to extract relations between items in transactional databases and how it benefits decision-making. However, association rule mining can threaten the privacy of knowledge when data is shared without hiding the data owner's confidential association rules. One way of hiding an association rule in a database is to conceal the itemsets (co-occurring items) from which the sensitive association rules are generated. These sensitive itemsets are sanitized by itemset hiding processes. Most existing solutions consider a single support threshold and assume that databases are static, which is not true in real life. In this article, the authors propose a novel itemset hiding algorithm that is designed for dynamic database environments and considers multiple itemset support thresholds. The algorithm is compared with two dynamic algorithms on six different databases. Findings show that the proposed dynamic algorithm is more efficient in terms of execution time and information loss and guarantees that all sensitive itemsets are hidden.
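As a rough illustration of what hiding an itemset by distortion means, the sketch below removes one item from supporting transactions until each sensitive itemset falls below its own threshold. It is not the authors' algorithm: the function names, the choice of victim item (the item with the highest overall support) and the transaction order are assumptions made only for this example.

```python
def support(transactions, itemset):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def hide_itemsets(transactions, sensitive):
    """Return a sanitized copy in which each sensitive itemset is infrequent.

    `sensitive` maps a frozenset of items to its own absolute support
    threshold (one threshold per sensitive itemset).
    """
    sanitized = [set(t) for t in transactions]  # leave the input untouched
    for itemset, threshold in sensitive.items():
        # Assumed heuristic: distort the item with the highest overall support.
        victim = max(sorted(itemset), key=lambda i: support(sanitized, {i}))
        for t in sanitized:
            if support(sanitized, itemset) < threshold:
                break  # the itemset is already hidden
            if itemset <= t:
                t.remove(victim)  # this transaction no longer supports it
    return sanitized
```

Because the sketch only deletes items, hiding a later itemset can never raise the support of one that was hidden earlier.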
Citation - Scopus: 12
Incremental Itemset Mining Based on Matrix Apriori Algorithm
(Springer Verlag, 2012) Oğuz, Damla; Ergenç, Belgin
Databases are updated continuously with increments, and re-running frequent itemset mining algorithms after every update is inefficient. Studies addressing the incremental update problem generally propose incremental itemset mining methods based on the Apriori and FP-Growth algorithms. Besides inheriting the disadvantages of the base algorithms, incremental itemset mining faces challenges such as handling i) increments without re-running the algorithm, ii) support changes, iii) new items and iv) additions/deletions in increments. In this paper, we address the incremental update problem by proposing the Incremental Matrix Apriori algorithm. It scans only new transactions, allows the minimum support to change and handles new items in the increments. The base algorithm, Matrix Apriori, works without candidate generation, scans the database only twice and brings additional advantages. Performance studies show that Incremental Matrix Apriori provides a speed-up between 41% and 92% as the increment size is varied between 5% and 100%.
Citation - WoS: 1
Citation - Scopus: 1
Extended Adaptive Join Operator With Bind-Bloom Join for Federated SPARQL Queries
(IGI Global Publishing, 2017) Oğuz, Damla; Yin, Shaoyi; Ergenç, Belgin; Hameurlain, Abdelkader; Dikenelli, Oğuz
The goal of query optimization in query federation over linked data is to minimize the response time and the completion time. Communication time has the highest impact on both. Static query optimization can end up with inefficient execution plans due to unpredictable data arrival rates and missing statistics. This study extends the adaptive join operator, which always begins with symmetric hash join to minimize the response time and can switch the join method to bind join to minimize the completion time. The authors extend the adaptive join operator with bind-bloom join to further reduce the communication time and, consequently, to minimize the completion time. They compare the new operator with symmetric hash join, bind join, bind-bloom join, and the adaptive join operator with respect to the response time and the completion time. Performance evaluation shows that the extended operator provides optimal response time and further reduces the completion time. Moreover, it can adapt to different data arrival rates.
Citation - Scopus: 2
Adaptive Join Operator for Federated Queries Over Linked Data Endpoints
(Springer Verlag, 2016) Oğuz, Damla; Yin, Shaoyi; Hameurlain, Abdelkader; Ergenç, Belgin; Dikenelli, Oğuz
Traditional static query optimization is not adequate for query federation over linked data endpoints due to unpredictable data arrival rates and missing statistics. In this paper, we propose an adaptive join operator for federated query processing that can change the join method during execution. Our approach always begins with symmetric hash join in order to produce the first result tuple as soon as possible, and switches to bind join when it estimates that bind join is more efficient than symmetric hash join for the rest of the process. We compare our approach with symmetric hash join and bind join. Performance evaluation shows that our approach provides optimal response time and can adapt to different data arrival rates.
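To see why symmetric hash join can emit the first result tuple early, consider the minimal sketch below. It covers only the symmetric hash join itself, not the adaptive switch to bind join. The function name, the key-extractor parameters and the round-robin interleaving that imitates tuples arriving from two endpoints are assumptions of this example, not details taken from the paper.

```python
from collections import defaultdict
from itertools import zip_longest

_MISSING = object()  # marks an input that has run out of tuples


def symmetric_hash_join(left, right, key_left, key_right):
    """Yield (left_row, right_row) pairs as soon as both rows have arrived."""
    left_table = defaultdict(list)   # hash table over left rows seen so far
    right_table = defaultdict(list)  # hash table over right rows seen so far
    # Alternate between the inputs to imitate tuples arriving from two sources.
    for l_row, r_row in zip_longest(left, right, fillvalue=_MISSING):
        if l_row is not _MISSING:
            k = key_left(l_row)
            left_table[k].append(l_row)           # insert into own table
            for match in right_table.get(k, ()):  # probe the other table
                yield (l_row, match)
        if r_row is not _MISSING:
            k = key_right(r_row)
            right_table[k].append(r_row)
            for match in left_table.get(k, ()):
                yield (match, r_row)
```

Each arriving row is inserted into its own hash table and immediately probed against the other one, so neither input has to be read completely before results start to appear.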
Citation - WoS: 7
Citation - Scopus: 20
Vertical Pattern Mining Algorithm for Multiple Support Thresholds
(Elsevier Ltd., 2017) Darrab, Sadeq; Ergenç Bostanoğlu, Belgin; Ergenç, Belgin
Frequent pattern mining is an important task for discovering hidden items that co-occur (itemsets) more often than a predefined threshold in a database. Mining frequent itemsets has drawn much attention, although rarely occurring itemsets might offer more interesting insights. In existing studies, finding these interesting patterns (rare itemsets) requires the user-defined single threshold to be set low enough, but this generates a huge number of redundant itemsets. We present the Multiple Item Support-eclat (MIS-eclat) algorithm to mine frequent patterns, including rare itemsets, under multiple support thresholds (MIS) by utilizing a vertical representation of the data. We compare MIS-eclat to our previous tree-based algorithm, MISFP-growth [28], and another recent algorithm, CFP-growth++ [22], in terms of execution time, memory usage and scalability on both sparse and dense databases. Experimental results reveal that MIS-eclat and MISFP-growth outperform CFP-growth++ in terms of execution time, memory usage and scalability.
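The sketch below illustrates the two ideas named in the abstract: a vertical layout that maps each item to the set of transaction IDs containing it, and per-item support thresholds. It is a brute-force enumeration for illustration, not the MIS-eclat algorithm. It assumes the commonly used rule that an itemset is frequent when its support reaches the lowest threshold among its items, and the function names and the `max_len` parameter are hypothetical.

```python
from itertools import combinations


def vertical_layout(transactions):
    """Map each item to the set of IDs of the transactions that contain it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def frequent_itemsets_mis(transactions, mis, max_len=3):
    """Return {itemset: support} under per-item minimum supports `mis`.

    `mis` maps every item to an absolute count. Assumed rule: the threshold
    of an itemset is the lowest threshold among its items.
    """
    tidsets = vertical_layout(transactions)
    items = sorted(tidsets)
    result = {}
    for size in range(1, max_len + 1):
        for itemset in combinations(items, size):
            # Support is the size of the intersection of the item tidsets.
            tids = set.intersection(*(tidsets[i] for i in itemset))
            threshold = min(mis[i] for i in itemset)
            if len(tids) >= threshold:
                result[itemset] = len(tids)
    return result
```

With a low threshold on a rare item, itemsets containing it can qualify without lowering the threshold for common items, which is the motivation for using multiple thresholds.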
Citation - Scopus: 2
Comparison of Dynamic Itemset Mining Algorithms for Multiple Support Thresholds
(Association for Computing Machinery (ACM), 2017) Abuzayed, Nourhan; Ergenç, Belgin
Mining frequent itemsets is an important part of the association rule mining process. Handling the dynamic aspect of databases and the multiple support threshold requirements of items are two important challenges for frequent itemset mining algorithms. Most existing dynamic itemset mining algorithms are devised for a single support threshold, whereas multiple support threshold algorithms are static. This work focuses on the dynamic update problem of frequent itemsets under multiple support thresholds and proposes the tree-based Dynamic CFP-Growth++ algorithm. The proposed algorithm is compared to our previous dynamic algorithm, Dynamic MIS [50], and a recent static algorithm, CFP-Growth++ [2]. The findings for dynamic databases are: 1) both dynamic algorithms outperform the static algorithm CFP-Growth++, 2) in terms of memory usage, Dynamic CFP-Growth++ performs better than Dynamic MIS, and 3) in terms of execution time, Dynamic MIS performs better than Dynamic CFP-Growth++. In short, Dynamic CFP-Growth++ and Dynamic MIS trade off memory usage against execution time.
Citation - WoS: 17
Citation - Scopus: 26
Federated Query Processing on Linked Data: a Qualitative Survey and Open Challenges
(Cambridge University Press, 2015) Oğuz, Damla; Ergenç, Belgin; Yin, Shaoyi; Dikenelli, Oğuz; Hameurlain, Abdelkader
A large number of data providers publish and connect their structured data on the Web as linked data. Thus, the Web of data becomes a global data space. In this paper, we initially give an overview of query processing approaches used in this interlinked and distributed environment, and then focus on federated query processing on linked data. We provide a detailed and clear insight on data source selection, join methods and query optimization methods of existing query federation engines. Furthermore, we present a qualitative comparison of these engines and give a complementary comparison of the measured metrics of each engine with the idea of pointing out the major strengths of each one. Finally, we discuss the major challenges of federated query processing on linked data. © 2015 Cambridge University Press.
Citation - Scopus: 7
OrderBased Labeling Scheme for Dynamic XML Query Processing
(Springer Verlag, 2012) Assefa, Beakal Gizachew; Ergenç, Belgin
The need for robust, high-performance XML database systems has increased due to the growing volume of XML data produced by today's applications. Like indexes in relational databases, XML labeling is the key to XML querying. Assigning unique labels to the nodes of a dynamic XML tree, such that the labels encode all structural relationships between the nodes, is a challenging problem. Early labeling schemes designed for static XML documents generate short labels; however, their performance degrades in update-intensive environments due to the need for relabeling. On the other hand, dynamic labeling schemes achieve dynamicity at the cost of large label size or complexity, which results in poor query performance. This paper presents the OrderBased labeling scheme, which is dynamic, simple and compact yet able to identify structural relationships among nodes. A set of performance tests shows promising labeling, querying and update performance and optimum label size. © 2012 IFIP International Federation for Information Processing.
Citation - Scopus: 2
Hiding Sensitive Predictive Frequent Itemsets
(International Association of Engineers, 2011) Yıldız, Barış; Ergenç, Belgin
In this work, we propose an itemset hiding algorithm with four versions that use different heuristics to select the item in the itemset and the transaction for distortion. The main strengths of the itemset hiding algorithm are: i) it works without pre-mining, so the privacy breach caused by revealing frequent itemsets in advance is prevented and efficiency is increased, ii) the base algorithm (Matrix-Apriori) works without candidate generation, so efficiency is increased, iii) the sanitized database and the frequent itemsets of this database are given as outputs, so no post-mining is required and iv) simple heuristics, such as the length of the pattern and the frequency of the item in the pattern, are used to select the item for distortion. We compare the versions of our itemset hiding algorithm by their side effects, runtimes and distortion of the original database.
