Big data is often characterized by its large volume, velocity, and/or variety. As more data accumulate, the threshold for what counts as big data shifts. Big data may currently include datasets measured in terabytes (TB, 10^12 bytes), petabytes (PB, 10^15 bytes), or larger, but current personal computers can already process and store a TB of data, and in the future they may be able to do the same for PBs. Velocity is the high rate at which data accumulate, and variety refers to the range of formats and the unstructured condition of most big data.
Philip E. Bourne (2014) would expand this definition to include “the emergence of the digital enterprise – the ability for an organization to take full advantage of its digital assets – which collectively can be described as large amount of data and more.”
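To make the volume and velocity figures above concrete, the following is a minimal Python sketch using the decimal units given in the text (1 TB = 10^12 bytes, 1 PB = 10^15 bytes). The helper names `describe_size` and `days_to_accumulate`, and the accumulation rate in the usage lines, are hypothetical and chosen only for illustration; they do not come from any of the cited sources.

```python
# Decimal (SI) byte units, matching the definitions in the text.
TB = 10**12  # bytes in one terabyte
PB = 10**15  # bytes in one petabyte


def describe_size(num_bytes: int) -> str:
    """Express a byte count in TB or PB (hypothetical helper)."""
    if num_bytes >= PB:
        return f"{num_bytes / PB:.2f} PB"
    return f"{num_bytes / TB:.2f} TB"


def days_to_accumulate(target_bytes: int, bytes_per_day: int) -> float:
    """Days needed to reach a target volume at a constant daily rate.

    A constant rate is an assumption; real accumulation rates vary.
    """
    return target_bytes / bytes_per_day


if __name__ == "__main__":
    # Illustrative, made-up rate: 50 GB of new data per day.
    rate = 50 * 10**9
    print(describe_size(2 * TB))
    print(days_to_accumulate(PB, rate))
```

Under these assumptions, the same volume can be reached in very different time spans depending on velocity, which is one reason the boundary of what counts as big data keeps moving.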
Bourne PE. (2014). What Big Data means to me. Journal of the American Medical Informatics Association, 21(2), 194. doi.org/10.1136/amiajnl-2014-002651.
Carpenter J, Crutchley P, Zilca RD, Schwartz HA, Smith LK, Cobb AM, & Parks AC. (2016). Seeing the “Big” Picture: Big Data Methods for Exploring Relationships Between Usage, Language, and Outcome in Internet Intervention Data. Journal of Medical Internet Research, 18(8), e241. doi.org/10.2196/jmir.5725
Crosas M, King G, Honaker J, & Sweeney L. (2014). Automating Open Science for Big Data. The ANNALS of the American Academy of Political and Social Science, 659(1), 260–273.
Gandomi A, & Haider M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35, 137–144. doi.org/10.1016/j.ijinfomgt.2014.10.007
Laney D. (2001). 3D data management: Controlling data volume, velocity and variety.
Marcinkowski M, & Fonseca F. (2016). The conditions of peak empiricism in big data and interaction design. Journal of the Association for Information Science & Technology, 67(6), 1279–1288. doi.org/10.1002/asi.23497.
Margolis R, Derr L, Dunn M, Huerta M, Larkin J, Sheehan J, … Green ED. (2014). The National Institutes of Health’s Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data. Journal of the American Medical Informatics Association, 21(6), 957–958. doi.org/10.1136/amiajnl-2014-002974
Paten B, Diekhans M, Druker BJ, Friend S, Guinney J, Gassner N, … Haussler D. (2015). The NIH BD2K center for big data in translational genomics. Journal of the American Medical Informatics Association, ocv047. doi.org/10.1093/jamia/ocv047.
Pouchard L. (2015). Revisiting the Data Lifecycle with Big Data Curation. International Journal of Digital Curation, 10(2). doi.org/10.2218/ijdc.v10i2.342
Toga AW, Foster I, Kesselman C, Madduri R, Chard K, Deutsch EW, … Hood L. (2015). Big Biomedical data as the key resource for discovery science. Journal of the American Medical Informatics Association, ocv077. doi.org/10.1093/jamia/ocv077.