==================================================
Best Practices for Metadata - Models/Computer Code
==================================================

Types of models/computer code
-----------------------------

Models, model-based projects, and model-derived products should have documentation sufficient to allow someone with the appropriate skills and knowledge to generate comparable results using a similar method. There are generally three types of model-based projects, and knowing which type to focus on provides a starting point for the metadata.

+------------------------------+----------------------------------------+------------------------------------------------+
|Developing a standalone model |Retooling or renewing a previous model  |Applying an existing model to different/new data|
+==============================+========================================+================================================+
|Example: Predicting Sockeye   |Example: Predicting Sockeye Salmon run  |Example: Predicting Coho Salmon run timing      |
|Salmon run timing for the     |timing in the Gulf of Alaska, in        |in the Gulf of Alaska based on an existing,     |
|Salmon River (Gulf of Alaska) |Python 3                                |geographically relevant model                   |
+------------------------------+----------------------------------------+------------------------------------------------+

Begin at the beginning
----------------------

A clear understanding of the documentation needs from the start of a project, with agreement from major partners, will go a long way towards success. Examples of these needs include responsible data/output handling, process documentation, guidelines for quality control, and the metadata associated with each of these steps.

Assess what represents the scholarly output of your project. For example, if the model could be written in Python, R, or MATLAB and the idea is the same, the metadata should focus on the methods.
If the model applies an idea to a specific type of data or uses a specific code/computation approach, adjust the metadata accordingly.

Why document metadata?
----------------------

Aim for reusability. It is important to keep a future re-user in mind, along with the many possible reasons for re-use. It is typically safe to assume some familiarity with the subject, but not a full understanding. Re-use could range from pre-processing, to examining what was left out, to serving as a helpful process for building a further product.

What to document?
-----------------

The table below gives an overview of what to include for each of the model types described above.

+-----------------+------------------+----------------+----------------+
|                 |Type 1            |Type 2          |Type 3          |
|                 |Standalone model  |Updated model   |Applied model   |
+=================+==================+================+================+
| Project-level   | Yes              | Yes            | Yes            |
| documentation   |                  |                |                |
+-----------------+------------------+----------------+----------------+
| Input file(s)   | Yes              | Yes            | Yes            |
+-----------------+------------------+----------------+----------------+
| Model code      | Yes              | Yes            | Optional       |
+-----------------+------------------+----------------+----------------+
| Output file(s)  | Optional         | Optional       | Yes            |
+-----------------+------------------+----------------+----------------+

Other considerations while creating metadata:
---------------------------------------------

- Document inputs or their sources; it is not typically necessary to include raw files.
- Parameters are typically a type of input file for your code and are easily included as a standalone file.
- Archive intermediary input files if they are original to the project.
- Outputs from another model should not be included, but instead cited (see previous tip).
- Make note of any original data processing technique, typically in the "Process steps" of a Methods section.

For further questions, reach out to Axiom team members at metadata@axiomdatascience.com.
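
As an illustration of the "What to document?" table, the following is a minimal sketch of how a project might check its documentation package against the expectations for its model type. The dictionary keys (``standalone``, ``updated``, ``applied``), the item names, and the ``missing_items`` helper are hypothetical names chosen for this example; they are not part of any Axiom tool or metadata standard.

```python
# Hypothetical encoding of the "What to document?" table.
# Keys and item names are illustrative assumptions, not an official schema.
REQUIREMENTS = {
    "standalone": {  # Type 1
        "project_docs": "yes",
        "inputs": "yes",
        "model_code": "yes",
        "outputs": "optional",
    },
    "updated": {  # Type 2
        "project_docs": "yes",
        "inputs": "yes",
        "model_code": "yes",
        "outputs": "optional",
    },
    "applied": {  # Type 3
        "project_docs": "yes",
        "inputs": "yes",
        "model_code": "optional",
        "outputs": "yes",
    },
}


def missing_items(model_type, provided):
    """Return the required items (status "yes") that are not in `provided`."""
    rules = REQUIREMENTS[model_type]
    return sorted(
        item
        for item, status in rules.items()
        if status == "yes" and item not in provided
    )


if __name__ == "__main__":
    # An applied-model project that has documented the project and its inputs
    # but has not yet archived its outputs.
    print(missing_items("applied", {"project_docs", "inputs"}))
```

Items marked "optional" in the table are never reported as missing, so the check flags only what the table lists as "Yes" for the chosen type.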