Minimizing Information Loss in Shared Data: Hiding Frequent Patterns With Multiple Sensitive Support Thresholds

Bostanoğlu, Belgin Ergenç; Ergenç Bostanoğlu, Belgin; Öztürk, Ahmet Cumhur

doi:10.1002/sam.11458

Files

Statistical Analysis.pdf (4.41 MB)

Date

2020

Authors

Bostanoğlu, Belgin Ergenç

Ergenç Bostanoğlu, Belgin

Öztürk, Ahmet Cumhur

Publisher

Wiley

Green Open Access

No

Publicly Funded

No

Impulse

Average

Influence

Average

Popularity

Average

Abstract

Privacy-preserving data mining (PPDM) is the process of protecting sensitive knowledge from being discovered by data mining techniques when data is shared. Privacy-preserving frequent itemset mining (PPFIM) is an NP-hard subtask of PPDM. Its objective is to modify a given database so that none of the database owner's sensitive itemsets can be obtained from the modified database by any frequent itemset mining technique. The main challenge of PPFIM is to minimize the distortion of the data and of nonsensitive knowledge while sanitizing all given sensitive itemsets. Distortion-based sensitive itemset hiding algorithms sanitize the database by decreasing the support of each sensitive itemset below a predefined sensitive threshold. Most distortion-based itemset hiding algorithms allow the database owner to define only a single sensitive threshold that applies to every sensitive itemset. This limits the database owner, since the importance of each sensitive itemset varies. In this paper we propose a distortion-based itemset hiding algorithm, the itemset oriented pseudo graph based sanitization (IPGBS) algorithm, that allows the database owner to assign multiple sensitive thresholds. The purpose of the IPGBS algorithm is to hide all sensitive itemsets with minimum distortion of the nonsensitive knowledge and the data. To this end, the IPGBS algorithm modifies as few transactions and as little transaction content as possible. The performance of the IPGBS algorithm is evaluated against two counterpart algorithms on four different databases. The results show that the IPGBS algorithm is more efficient in terms of nonsensitive frequent itemset loss on both dense and sparse databases. It also performs well in terms of the number of transactions modified, the number of items deleted, execution time, and total memory allocation.
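To make the idea of distortion-based hiding with a separate threshold per sensitive itemset concrete, the following is a minimal Python sketch. It is not the IPGBS algorithm described in the paper: it is a naive greedy baseline that deletes one item from supporting transactions until each sensitive itemset's support falls below its own threshold. The function names (`support`, `naive_hide`), the victim-selection rule, and the toy database are assumptions made purely for illustration.

```python
from typing import Dict, FrozenSet, List, Set


def support(db: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Absolute support: number of transactions that contain the itemset."""
    return sum(1 for transaction in db if itemset <= transaction)


def naive_hide(
    db: List[Set[str]], thresholds: Dict[FrozenSet[str], int]
) -> List[Set[str]]:
    """Return a sanitized copy of db in which every sensitive itemset has
    support strictly below its own threshold.

    Hypothetical baseline, not IPGBS: it makes no attempt to minimize the
    loss of nonsensitive frequent itemsets.
    """
    sanitized = [set(t) for t in db]  # leave the original database untouched
    for itemset, threshold in thresholds.items():
        current = support(sanitized, itemset)
        for transaction in sanitized:
            if current < threshold:
                break  # this itemset is already hidden
            if itemset <= transaction:
                # Arbitrary but deterministic victim item (an assumption).
                transaction.discard(min(itemset))
                current -= 1
    return sanitized


# Toy database and per-itemset thresholds (illustrative values only).
example_db = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
example_thresholds = {
    frozenset({"a", "b"}): 2,  # more sensitive: must appear in fewer than 2
    frozenset({"b", "c"}): 3,  # less sensitive: must appear in fewer than 3
}
sanitized_db = naive_hide(example_db, example_thresholds)
```

Because deleting items can only lower supports, hiding one itemset never re-exposes an itemset hidden earlier, so a single pass per itemset suffices in this sketch. A method such as the one the abstract describes would additionally choose which transactions and items to modify so that fewer nonsensitive itemsets are lost.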

Keywords

Information loss, Itemset mining, Privacy preserving itemset mining

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

WoS Q

Q1

Scopus Q

Q1

OpenCitations Citation Count

2

Source

Statistical Analysis and Data Mining

Volume

13

Issue

4

Start Page

309

End Page

323

URI

https://doi.org/10.1002/sam.11458
https://hdl.handle.net/11147/8831

Collections

WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
Computer Engineering / Bilgisayar Mühendisliği
Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection

PlumX Metrics

Citations

CrossRef : 2

Scopus : 1

Captures

Mendeley Readers : 6

SCOPUS™ Citations

1

checked on Jun 12, 2026

Web of Science™ Citations

1

checked on Jun 12, 2026

Page Views

827

checked on Jun 12, 2026

Downloads

320

checked on Jun 12, 2026

OpenAlex FWCI

0.2937191
