State-of-the-Art of Research on Big Data Usability

Abstract The rapid development of information technology has given rise to the big data era. Big data has become an important asset of the information society, providing unprecedentedly rich information for people to perceive, understand, and control the physical world more deeply. However, as data scale grows, dirty data comes along with it. Dirty data lowers the quality of big data, greatly reduces its usability, and seriously harms the information society. In recent years, data usability problems have drawn the attention of both academia and industry. In-depth studies have been conducted, and a series of research results have been obtained. This paper introduces the basic concepts of data usability, discusses its challenges and research issues, reviews the research results in this area, and explores future research directions for big data usability.
Authors LI Jian-Zhong, WANG Hong-Zhi, GAO Hong (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
Source Journal of Software (《软件学报》), 2016, No. 7, pp. 1605-1625 (21 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding National Key Basic Research and Development Program of China (973 Program) (2012CB316200); National Natural Science Foundation of China (U1509216, 61472099)
Keywords big data; data usability; data quality; data cleaning; data management
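About the authors Corresponding author: LI Jian-Zhong, E-mail: lijzh@hit.edu.cn. LI Jian-Zhong (1950-), male, born in Harbin, Heilongjiang, Ph.D., professor, doctoral supervisor; main research areas: massive data management and computing, wireless sensor networks, and data quality. WANG Hong-Zhi (1978-), male, Ph.D., professor, doctoral supervisor, CCF senior member; main research areas: databases, big data, and data quality. GAO Hong (1966-), female, Ph.D., professor, doctoral supervisor, CCF senior member; main research areas: massive data computing and wireless sensor networks.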