Data Management Cheat Sheet

Managing data responsibly isn’t easy, even for simple scientific projects. For large projects and campaigns, it can quickly feel overwhelming. Good data management requires planning, communication, and willpower, but the rewards are well worth the effort. This page offers a manageable amount of information to get you started with data management, and serves as a handy reference for things like file formatting and naming conventions.

For more detailed information, please consult our full Data Management Best Practices page.

And if you read through the information below and still find yourself needing help, please email us at metadata@axiomdatascience.com.

Data Organization

Data and File Formatting

Common File Format Specs

The table below outlines specifications for some common data file formats. For all formats, follow CF Conventions for naming whenever possible. If the CF Conventions don’t cover a name used in your project, refer to the Marine Metadata Interoperability Ontology Registry and Repository.

File Format Specifications
NetCDF
CSV
  • Follow CF Conventions for column names.
  • Double-check that decimal values are displayed to the correct number of significant digits.
  • Make sure your columns have consistent data types.
Shapefiles
  • Be sure the appropriate projection is documented.
  • For vector data, include the coordinate reference information.
Databases
  • Most database formats will need to be converted to plain text for archiving (e.g. each table as a CSV file).
  • Include plain-text documentation of relevant table and field properties.
  • Capture table relationships in a diagram, which can be saved as a JPG file.
Spatial Media
  • For GPS-enabled video, include a table that connects latitude and longitude to video timestamp.
  • Verify that the timestamps in the table match the video timestamp for the full length of the video.
  • Include documentation of the video file format and resolution.
  • List any still images captured during the video as rows in the table, each with its corresponding latitude and longitude.
Sensor Data
  • Coming soon!
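
As one illustration of the Databases guidance above, the sketch below exports each table of a database to its own CSV file and saves each table's definition as plain-text documentation. It assumes a SQLite database purely for the sake of a self-contained example; the function name export_tables and the output file naming are hypothetical, and other database systems will need their own export tools.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables(db_path, out_dir):
    """Write each table to <table>.csv and its CREATE statement to <table>_schema.txt.

    Assumes a SQLite database; returns the list of exported table names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every user table along with the SQL that created it.
        tables = conn.execute(
            "SELECT name, sql FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        ).fetchall()
        for name, create_sql in tables:
            cursor = conn.execute(f'SELECT * FROM "{name}"')
            header = [column[0] for column in cursor.description]
            with open(out / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                writer.writerow(header)
                # NULL values are written as empty fields; note this in your docs.
                writer.writerows(cursor)
            # Plain-text record of field names, types, and constraints.
            (out / f"{name}_schema.txt").write_text(create_sql + "\n", encoding="utf-8")
        return [name for name, _ in tables]
    finally:
        conn.close()
```

Calling export_tables("project.sqlite", "archive") would produce one CSV file and one schema text file per table in the archive folder. Table relationships are only partly captured by the CREATE statements, so a separate diagram is still worth including.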

Data Quality Management

  • Assign specific quality assurance tasks to specific people involved in your project.
  • Define parameter names, units, and null value codes before collecting data.
  • Review all data for missing, anomalous, or invalid values immediately after collection.
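
The last two points can be sketched in code: parameter names, units, and a null value code are defined up front, and each collected row is then checked for missing, invalid, or anomalous values. The column names below follow CF standard names, but the null code, the valid ranges, and the function name review_rows are assumptions made for illustration, not values recommended by this page.

```python
import csv
import io

# Hypothetical parameter definitions, agreed on before data collection.
PARAMETERS = {
    "sea_water_temperature": {"units": "degree_Celsius", "min": -2.0, "max": 40.0},
    "sea_water_practical_salinity": {"units": "1", "min": 0.0, "max": 42.0},
}
NULL_CODE = "-9999"  # assumed null value code for this example


def review_rows(rows, parameters=PARAMETERS, null_code=NULL_CODE):
    """Return a list of (row_number, column, problem) tuples for flagged values."""
    problems = []
    # Row 1 of the file is the header, so data rows start at 2.
    for row_number, row in enumerate(rows, start=2):
        for column, spec in parameters.items():
            raw = (row.get(column) or "").strip()
            if raw == "":
                problems.append((row_number, column, "blank"))
            elif raw == null_code:
                problems.append((row_number, column, "null code"))
            else:
                try:
                    value = float(raw)
                except ValueError:
                    problems.append((row_number, column, "not a number"))
                    continue
                if not spec["min"] <= value <= spec["max"]:
                    problems.append((row_number, column, "out of range"))
    return problems


sample = io.StringIO(
    "time,sea_water_temperature,sea_water_practical_salinity\n"
    "2020-01-01T00:00:00Z,4.2,31.5\n"
    "2020-01-01T01:00:00Z,-9999,abc\n"
    "2020-01-01T02:00:00Z,85.0,\n"
)
for problem in review_rows(csv.DictReader(sample)):
    print(problem)
```

A simple range check like this only catches gross errors; subtler anomalies usually still need a person to look at plots or summary statistics.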

Metadata and Documentation

  • Document how your data are collected, processed, and preserved at each stage of your project.
  • Budget time to prepare your data for long-term preservation once your data are finalized.