Hybrid Data Publishing Based on Differential Privacy

Authors

  • Tao Wang State Grid Xinjiang Electric Power Co., Ltd. Information and Communication Company, Urumqi, Xinjiang, 832000, China. Xinjiang Energy Internet Big Data Laboratory, Urumqi, Xinjiang, 832000, China
  • Kaining Sun Xinjiang Energy Internet Big Data Laboratory, Urumqi, Xinjiang, 832000, China. State Grid Xinjiang Electric Power Co., Ltd., Urumqi, Xinjiang, 832000, China
  • Rui Yin State Grid Xinjiang Electric Power Co., Ltd. Information and Communication Company, Urumqi, Xinjiang, 832000, China. Xinjiang Energy Internet Big Data Laboratory, Urumqi, Xinjiang, 832000, China
  • Teng Zhang State Grid Xinjiang Electric Power Co., Ltd. Information and Communication Company, Urumqi, Xinjiang, 832000, China. Xinjiang Energy Internet Big Data Laboratory, Urumqi, Xinjiang, 832000, China
  • Longjun Zhang State Grid Xinjiang Electric Power Co., Ltd. Information and Communication Company, Urumqi, Xinjiang, 832000, China. Xinjiang Energy Internet Big Data Laboratory, Urumqi, Xinjiang, 832000, China

DOI:

https://doi.org/10.12694/scpe.v26i2.3958

Keywords:

Differential privacy; Mixed data; Information; Clustering

Abstract

The advent of the information and intelligence era has led to explosive growth in data. The authors propose a hybrid data model based on differential privacy. Building on the study of differential privacy, the model processes data through a noise mechanism: it calculates the differences between tuple attributes and adds noise, and the resulting mixed data model is constructed and validated through experiments. The experimental results indicate that as the value of k increases, the clustering results tend toward the optimum, which verifies that clustering the original data can reduce the amount of noise added. However, because ICMD-DP anonymizes the original dataset, it incurs much higher information loss than DCKPDP and prototype algorithms. A mixed data model based on differential privacy yields better clustering performance on the original dataset and thereby uses differential privacy to better protect the data.
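
The abstract names two building blocks, a tuple attribute difference and noise addition, without giving their exact form. The following is only a minimal sketch of what such steps might look like, not the authors' ICMD-DP algorithm. It assumes a range-normalized difference for numeric attributes, a 0/1 mismatch for categorical attributes, and the Laplace mechanism for a counting query with sensitivity 1. The function names (mixed_distance, laplace_noise, noisy_count) and the sample tuples are hypothetical.

```python
import random


def mixed_distance(a, b, numeric_idx, ranges):
    """Difference between two tuples holding numeric and categorical attributes.

    Assumption: numeric attributes contribute |x - y| / range and
    categorical attributes contribute 0 (equal) or 1 (different).
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_idx:
            # Numeric attribute: absolute difference scaled by the attribute range.
            total += abs(x - y) / ranges[i]
        else:
            # Categorical attribute: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
    return total


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two Exp(1) samples."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query (assumed sensitivity 1)."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical records: (age, occupation, income).
    t1 = (30, "engineer", 5000.0)
    t2 = (40, "teacher", 5000.0)
    d = mixed_distance(t1, t2, numeric_idx={0, 2}, ranges={0: 50.0, 2: 10000.0})
    print("tuple difference:", d)
    print("noisy cluster size:", noisy_count(25, epsilon=1.0, rng=rng))
```

Under these assumptions, a smaller epsilon gives a larger noise scale (1 / epsilon) and therefore stronger privacy at the cost of accuracy. Releasing one noisy statistic per cluster rather than perturbing every record is one way that clustering can reduce the total noise added.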

Published

2025-02-10

Section

Special Issue - High-performance Computing Algorithms for Material Sciences