Data Provenance

Definition

Data provenance is the record of the origins, custody, and ownership of research data. Because datasets are reused, reformulated, and reworked to create new data, provenance makes it possible to trace newly derived or repurposed data back to their original datasets. Provenance helps hold data creators accountable for their work, and provides a chain of information through which data can be tracked as researchers use other researchers’ data and adapt it for their own purposes.
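
The chain of information described above can be illustrated with a minimal sketch. It assumes each dataset records its creator and the dataset it was derived from; the `Dataset` class, the `lineage` function, and the example dataset and researcher names are all hypothetical and are not taken from any particular provenance standard or tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical record: a dataset, its creator, and its direct source."""
    name: str
    creator: str
    derived_from: Optional["Dataset"] = None  # None marks an original dataset


def lineage(dataset: Dataset) -> List[Dataset]:
    """Walk from a dataset back to its original source, newest first."""
    chain = []
    current: Optional[Dataset] = dataset
    while current is not None:
        chain.append(current)
        current = current.derived_from
    return chain


# Illustrative (made-up) chain: raw data -> cleaned data -> repurposed subset.
raw = Dataset("clinic_visits_raw", "Researcher A")
cleaned = Dataset("clinic_visits_cleaned", "Researcher B", derived_from=raw)
subset = Dataset("diabetes_cohort", "Researcher C", derived_from=cleaned)

for d in lineage(subset):
    print(f"{d.name} (created by {d.creator})")
```

Following the `derived_from` links answers the two questions in the definition: where a repurposed dataset came from, and who was responsible at each step.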

Further Resources

Bache R, Miles S, Coker B, & Taweel A. (2013). Informative Provenance for Repurposed Data: A Case Study using Clinical Research Data. International Journal of Digital Curation, 8(2), 27–46. doi.org/10.2218/ijdc.v8i2.262

Borgman CL. (2010). Research Data: Who will share what, with whom, when, and why? China-North American Library Conference. Beijing, 21.

Drăgan L, Luczak-Rösch M, Simperl E, Packer H, & Moreau L. (2015). A-posteriori Provenance-enabled Linking of Publications and Datasets Via Crowdsourcing. D-Lib Magazine, 21(1/2). doi.org/10.1045/january2015-dragan

Giarlo MJ. (2012). Academic Libraries as Data Quality Hubs. State College, PA, p. 1–20.

Lord P, & Macdonald A. (2003). e-Science Curation Report: Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision.

Mayernik MS, DiLauro T, Duerr R, Metsger E, Thessen AE, & Choudhury GS. (2013). Data Conservancy Provenance, Context, and Lineage Services: Key Components for Data Preservation and Curation. Data Science Journal, 12, 158–171.

Stewart C. (2012). Preservation and Access in an Age of E-Science and Electronic Records: Sharing the Problem and Discovering Common Solutions. Journal of Library Administration, 52(3–4), 265–278.

Simmhan YL, Plale B, & Gannon D. (2005). A survey of data provenance in e-science. ACM SIGMOD Record, 34(3), 31.

Viglas S. (2013). Data Provenance and Trust. Data Science Journal, 12, GRDI58–GRDI64.
