Data provenance is the record of the origins, custody, and ownership of research data. Because datasets are reused, reformulated, and reworked to create new data, provenance makes it possible to trace newly derived or repurposed data back to the original datasets. Recording provenance helps hold data creators accountable for their work, and provides a chain of information through which data can be tracked as researchers take other researchers' data and adapt it for their own purposes.
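The chain of information described above can be sketched in code. The following is a minimal illustration only, not a standard provenance model: the `ProvenanceRecord` class, its fields, the `trace_to_origin` helper, and the dataset and creator names are all hypothetical, chosen to show how each derived dataset might point back to the dataset it was made from.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """One link in a provenance chain (hypothetical structure for illustration)."""
    dataset_id: str
    creator: str  # who is accountable for this version of the data
    derived_from: Optional["ProvenanceRecord"] = None  # None marks an original dataset
    note: str = ""  # how the data was adapted, if at all


def trace_to_origin(record: ProvenanceRecord) -> List[str]:
    """Return dataset ids from the given record back to the original dataset."""
    chain: List[str] = []
    current: Optional[ProvenanceRecord] = record
    while current is not None:
        chain.append(current.dataset_id)
        current = current.derived_from
    return chain


# Made-up example: an original dataset is reworked twice by other researchers.
original = ProvenanceRecord("survey-raw", creator="Lab A")
cleaned = ProvenanceRecord(
    "survey-cleaned", creator="Lab B",
    derived_from=original, note="removed incomplete responses",
)
merged = ProvenanceRecord(
    "survey-merged", creator="Lab C",
    derived_from=cleaned, note="combined with a second dataset",
)

print(trace_to_origin(merged))
```

Each record names both a creator and its immediate source, so following the `derived_from` links from any repurposed dataset leads back to the original data and to everyone who handled it along the way.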
Bache R, Miles S, Coker B, & Taweel A. (2013). Informative Provenance for Repurposed Data: A Case Study using Clinical Research Data. International Journal of Digital Curation, 8(2), 27–46. doi.org/10.2218/ijdc.v8i2.262
Borgman CL. (2010). Research Data: Who will share what, with whom, when, and why? China-North American Library Conference. Beijing, 21.
Drăgan L, Luczak-Rösch M, Simperl E, Packer H, & Moreau L. (2015). A-posteriori Provenance-enabled Linking of Publications and Datasets Via Crowdsourcing. D-Lib Magazine, 21(1/2). doi.org/10.1045/january2015-dragan
Giarlo MJ. (2012). Academic Libraries as Data Quality Hubs. State College, PA, pp. 1–20.
Mayernik MS, DiLauro T, Duerr R, Metsger E, Thessen AE, & Choudhury GS. (2013). Data Conservancy Provenance, Context, and Lineage Services: Key Components for Data Preservation and Curation. Data Science Journal, 12, 158–171.
Stewart C. (2012). Preservation and Access in an Age of E-Science and Electronic Records: Sharing the Problem and Discovering Common Solutions. Journal of Library Administration, 52(3-4), 265–278.
Simmhan YL, Plale B, & Gannon D. (2005). A survey of data provenance in e-science. ACM SIGMOD Record, 34(3), 31.
Viglas S. (2013). Data Provenance and Trust. Data Science Journal, 12, GRDI58–GRDI64.