Massive Data Publishing Method Based on Differential Privacy (基于差分隐私的海量数据发布方法研究)
Abstract: When histograms of massive static data are published, outliers in the group partition increase the error, and deciding which points are outliers is inefficient. To address this, the paper proposes a histogram publishing method for massive static data that satisfies ε-differential privacy and is suited to the Spark framework. First, the k-means clustering algorithm is optimized to avoid repeated distance computations. The improved k-means algorithm is then used to find the optimal group partition of the histogram, quickly aggregating similar groups into an optimal group fusion. Finally, noise is added to the grouping results, and the data are published after this differential privacy processing. Simulation experiments on real data show that, for privacy-preserving processing of massive static data sets, the proposed method improves publishing efficiency and ensures data privacy while keeping the published data usable.
Authors: Yan Fei; Zhang Xing; Li Chang; Li Wanjie; Li Shuai (School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001, Liaoning, China)
Source: Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2018, No. 11, pp. 314-320 (7 pages)
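Funding: Liaoning Province Growth Program for Outstanding Young Scholars in Higher Education Institutions (LJQ2014066); Natural Science Foundation of Liaoning Province (20170540434).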
Keywords: differential privacy; grouping fusion; noise interference; data publishing
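About the authors: Yan Fei, master's student, main research areas: big data security and privacy protection; Zhang Xing, professor; Li Chang, master's student; Li Wanjie, master's student; Li Shuai, master's student.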
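The pipeline summarized in the abstract (group similar histogram bins, then publish noisy group values) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: it uses plain Lloyd's k-means on scalar bin counts rather than the paper's optimized variant, it does not use Spark, and the function names `kmeans_1d` and `publish_histogram` are hypothetical. It also assumes that the grouping can be treated as public; a grouping derived from the true counts would need its own share of the privacy budget. Under that assumption, each record affects one bin count by 1, so a group mean has sensitivity 1/|group| and Laplace noise with scale 1/(|group|·ε) is added to it.

```python
import random


def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns one cluster label per value."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the k centres over the sorted values.
    centres = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centre for every value.
        new_labels = [min(range(k), key=lambda c: abs(v - centres[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = sum(members) / len(members)
    return labels


def publish_histogram(counts, labels, epsilon, rng=random):
    """Replace each bin by its group's mean plus Laplace noise.

    Assumption: `labels` is treated as public, so only the group means
    consume the privacy budget epsilon.
    """
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    noisy = [0.0] * len(counts)
    for members in groups.values():
        mean = sum(counts[i] for i in members) / len(members)
        # Sensitivity of the group mean is 1/|group|; scale = sensitivity / epsilon.
        scale = 1.0 / (len(members) * epsilon)
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        for i in members:
            noisy[i] = mean + noise
    return noisy


counts = [10, 12, 11, 50, 52, 49, 100, 98]  # toy bin counts, not real data
labels = kmeans_1d(counts, k=3)
print(publish_histogram(counts, labels, epsilon=1.0))
```

Averaging within a group trades a small approximation error for less noise per bin, since the noise scale shrinks with the group size. This is the motivation for merging similar bins before adding noise.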