Definition
A data dictionary is a document that outlines the structure, content, and meaning of each variable in a dataset. For each variable, it records what type of data is being collected (e.g., free text, numerical, categorical or grouped data), the full wording of the question, what values are allowable (e.g., numeric ranges, multiple choice codes), and what those values mean (e.g., 0 = no high blood pressure diagnosis, 1 = borderline high blood pressure, 2 = high blood pressure). A data dictionary is a critical tool for data analysis and reproducibility.
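As an illustration only, the elements listed above can be sketched as a small data structure in Python. The class name VariableEntry, the helper functions, the question wording, and the age range are hypothetical and chosen for this example; only the blood pressure codes come from the definition above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VariableEntry:
    """One data dictionary entry (hypothetical layout for illustration)."""
    name: str
    data_type: str                      # "categorical", "numeric", or "text"
    question: str                       # full wording of the question
    allowed_codes: dict = field(default_factory=dict)   # code -> meaning
    numeric_range: Optional[tuple] = None                # (low, high), inclusive


def is_valid(entry: VariableEntry, value) -> bool:
    """Check a collected value against the entry's allowable values."""
    if entry.data_type == "categorical":
        return value in entry.allowed_codes
    if entry.data_type == "numeric":
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if entry.numeric_range is None:
            return True
        low, high = entry.numeric_range
        return low <= value <= high
    return isinstance(value, str)       # free text


def decode(entry: VariableEntry, value):
    """Translate a stored code into its meaning, if one is defined."""
    return entry.allowed_codes.get(value, value)


# A two-variable dictionary; the question text and age range are made up.
data_dictionary = {
    "high_bp": VariableEntry(
        name="high_bp",
        data_type="categorical",
        question="Has a doctor ever told you that you have high blood pressure?",
        allowed_codes={
            0: "no high blood pressure diagnosis",
            1: "borderline high blood pressure",
            2: "high blood pressure",
        },
    ),
    "age": VariableEntry(
        name="age",
        data_type="numeric",
        question="What is your age in years?",
        numeric_range=(18, 120),
    ),
}
```

With a structure like this, is_valid(data_dictionary["high_bp"], 3) returns False because 3 is not a defined code, and decode(data_dictionary["high_bp"], 1) returns the label for code 1.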
The term codebook is often used interchangeably with data dictionary, though a data dictionary can contain more information about the structure of a database. In REDCap, a widely used data collection tool, the data dictionary is a CSV file containing information on the variables and the structure of the REDCap database, while the codebook is a human-readable document that provides information on each data element.
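The distinction between a machine-readable data dictionary and a human-readable codebook can be sketched in Python as follows. This is a simplified illustration, not REDCap's actual file format: the column names (field_name, field_type, field_label, choices), the pipe-separated "code, label" encoding of choices, and the sample rows are all assumptions made for this example.

```python
import csv
import io

# Hypothetical data dictionary in CSV form (columns are assumed, not REDCap's).
SAMPLE_CSV = """field_name,field_type,field_label,choices
high_bp,radio,"Has a doctor ever told you that you have high blood pressure?","0, No high blood pressure diagnosis | 1, Borderline high blood pressure | 2, High blood pressure"
notes,text,"Additional comments",
"""


def parse_choices(raw: str) -> dict:
    """Split 'code, label | code, label' into a {code: label} mapping."""
    choices = {}
    for item in raw.split("|"):
        item = item.strip()
        if not item:
            continue
        # Split on the first comma only, so labels may contain commas
        code, _, label = item.partition(",")
        choices[code.strip()] = label.strip()
    return choices


def load_dictionary(csv_text: str) -> dict:
    """Read the CSV data dictionary into a mapping keyed by variable name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["field_name"]: {
            "type": row["field_type"],
            "label": row["field_label"],
            "choices": parse_choices(row["choices"]),
        }
        for row in reader
    }


def render_codebook(dictionary: dict) -> str:
    """Produce a plain-text, human-readable listing of each data element."""
    lines = []
    for name, info in dictionary.items():
        lines.append(f"{name} ({info['type']}): {info['label']}")
        for code, label in info["choices"].items():
            lines.append(f"    {code} = {label}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_codebook(load_dictionary(SAMPLE_CSV)))
```

Here the CSV plays the role of the data dictionary that software reads, and the rendered text plays the role of the codebook that people read; both describe the same variables.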
National Health Interview Survey codebook, available at: https://www.cdc.gov/nchs/nhis/2020nhis.htm
Health Information National Trends Survey codebooks, available in the data and supporting documentation downloads at: https://hints.cancer.gov/data/download-data.aspx
Similar Terms
Relevant Literature
ICPSR: What is a Codebook:
https://www.icpsr.umich.edu/web/ICPSR/cms/1983
USGS: Data Dictionaries:
https://www.usgs.gov/data-management/data-dictionaries