A data dictionary describes the contents of a dataset or database in detail, such as the names of measured variables, their data types or formats, and text descriptions. It gives users a concise guide to understanding and using the data. Ideally, every CanWIN record for a dataset or database should include or point to a data dictionary. We prefer data dictionaries that are machine-readable, in CSV format.
If your data are managed in a standard relational database, you can likely generate a data dictionary with your database software. The result is a consistently formatted document that contains what others need to understand your data.
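As an illustration of generating a data dictionary from a database, the sketch below reads the schema of a SQLite database with Python's standard `sqlite3` module and writes one CSV row per column. It assumes SQLite only; other database systems expose their schema differently (for example through `information_schema`). The output column names (`table`, `variable`, `data_type`, `required`, `primary_key`, `description`) and the `samples` table are hypothetical and are not taken from the CanWIN template.

```python
import csv
import io
import sqlite3

# Hypothetical output columns; adjust them to match your own template.
FIELDS = ["table", "variable", "data_type", "required", "primary_key", "description"]


def sqlite_data_dictionary(conn):
    """Return one dictionary row per column for every user table."""
    rows = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        quoted = table.replace('"', '""')  # escape quotes in the table name
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{quoted}")'
        ):
            rows.append({
                "table": table,
                "variable": name,
                "data_type": col_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
                "description": "",  # the schema cannot supply this; fill in by hand
            })
    return rows


def to_csv(rows):
    """Serialize the dictionary rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Example with a small in-memory database (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples ("
    "sample_id INTEGER PRIMARY KEY, "
    "water_temp_c REAL NOT NULL, "
    "station TEXT)"
)
dictionary_rows = sqlite_data_dictionary(conn)
dictionary_csv = to_csv(dictionary_rows)
print(dictionary_csv)
```

The schema supplies names, types, and constraints, but not meaning, so the `description` column is left blank for the data owner to complete.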
If your data are managed in spreadsheets, text files, or comma-separated values (CSV) files, you will need to prepare a data dictionary manually. To support machine-readability, we recommend preparing your data dictionary as a spreadsheet. A data dictionary template can be found at the end of this section.
- If your data are stored in a relational database, your database software may be able to generate a data dictionary for you.
- If your data are stored in a spreadsheet, you will need to create a data dictionary manually.
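For data kept in CSV files, part of the manual work can be started with a script. The sketch below, using only Python's standard `csv` module, reads the header row of a data file and produces a skeleton dictionary with one row per variable and a coarse type guess. The output columns (`variable`, `data_type`, `units`, `description`), the type labels, and the sample data are hypothetical and are not the CanWIN template; the guessed types should be reviewed, and units and descriptions still need to be written by hand.

```python
import csv
import io


def guess_type(values):
    """Guess a coarse type label from a column's non-empty string values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "unknown"
    for label, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return label
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "text"


def dictionary_skeleton(data_file):
    """Build one skeleton dictionary row per column of an open CSV file."""
    reader = csv.DictReader(data_file)
    records = list(reader)
    return [
        {
            "variable": name,
            "data_type": guess_type([r[name] or "" for r in records]),
            "units": "",        # to be filled in by hand
            "description": "",  # to be filled in by hand
        }
        for name in (reader.fieldnames or [])
    ]


# Example with a small in-memory file (hypothetical data).
sample = io.StringIO("station,depth_m,water_temp_c\nA1,5,3.2\nB2,10,\n")
skeleton = dictionary_skeleton(sample)

# Write the skeleton as CSV text so it can be opened in a spreadsheet.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["variable", "data_type", "units", "description"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(skeleton)
print(out.getvalue())
```

Empty cells are ignored when guessing a type, so a numeric column with missing values is still labeled as numeric.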
The following are recommended guidelines for data dictionaries, not requirements. They are subject to change as best practices evolve.