Data Organization

Organizing Projects within a Research Campaign

A Research Campaign is a funded research program composed of multiple projects, investigators, and organizations working collaboratively to collect, share, analyze, or disseminate scientific results.

A project is analogous to a file-storage directory, but it is more flexible and can hold its own metadata. Within a campaign, most projects contain a collection of files that share similar data collection methods and typically come from the same funded effort.

Smaller, focused projects may contain only a single dataset, while larger projects that collect or produce multiple types of data may contain several datasets. Identifying the specific dataset(s) that will be produced by a project is a central aim of project data management planning, and is necessary for planning an organizational structure within a project.

Projects can also be used to share information that doesn’t require metadata, such as administration or outreach materials.

Project Naming

Project titles should be as concise as possible while still containing key information about the dataset. The title is often the most important piece of metadata describing a resource. It is the first thing seen by people when browsing or searching for a resource, and may be the only information used to evaluate the content of the resource.

At a minimum, project titles should contain the following information:

  • Location
  • Data type
  • Year (or other time unit) range
  • Program or institution name, if your dataset is part of a large effort
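
As a minimal sketch of how these components can be combined consistently, the Python helper below assembles a title from its parts. The function name `build_project_title` and its parameters are hypothetical and are not part of the Research Workspace. The ordering (program, data type, location, period) is only one reasonable convention.

```python
def build_project_title(data_type, location, start_year, end_year=None, program=None):
    """Compose a project title from the recommended components (hypothetical helper)."""
    if not data_type or not location:
        raise ValueError("data type and location are required")
    # Collapse a single-year range to one year
    if end_year is None or end_year == start_year:
        period = str(start_year)
    else:
        period = f"{start_year}-{end_year}"
    # The program or institution name is optional and leads the title when present
    lead = f"{program} {data_type}" if program else data_type
    return f"{lead}, {location}, {period}"


print(build_project_title("Plankton Diversity Data", "Prince William Sound", 2012, 2016))
```

Building titles from the same components in the same order keeps them comparable when users browse or search a campaign.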

Naming Examples

Poor project titles:

  • Plankton data
  • ROV data from Sue

Better project titles:

  • Plankton Diversity Data, Prince William Sound, 2012-2016
  • Conductivity, temperature and depth data for 12 northwestern Gulf of Mexico locations, May to July 2012
  • SAFARI 2000 Upper Water Column Profiles, Gulf of Alaska, 2011-2012

Labelling (Tagging) Projects

Assigning labels (also known as tags) to projects is an effective way to categorize projects and make searching within a Research Campaign easier. Labels act like keywords assigned to projects, which lets you organize your files without creating endless layers of folders and subfolders. More than one label can be added to a project.

The following considerations will help you set up an effective label structure for a Research Campaign:

  • Create high-level labels that divide content into general categories.
  • Limit label names to two words or fewer, and strive for no more than 10 total labels within a campaign.
  • Add labels for year, season, etc., if the campaign spans multiple, distinct periods of collection or collaboration.
  • Strive for consistency when assigning labels across projects within the campaign.
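
The considerations above can be checked against a planned label list before the labels are created. The following is a sketch only: `check_labels` is a hypothetical helper, not a Research Workspace feature, and the case-insensitive duplicate check is an added assumption about what "consistency" means in practice.

```python
def check_labels(labels, max_words=2, max_labels=10):
    """Return a list of problems found in a proposed set of campaign labels."""
    problems = []
    # Guideline: no more than 10 labels in total within a campaign
    if len(labels) > max_labels:
        problems.append(f"{len(labels)} labels exceeds the suggested maximum of {max_labels}")
    seen = {}
    for label in labels:
        # Guideline: label names of two words or fewer
        if len(label.split()) > max_words:
            problems.append(f"'{label}' has more than {max_words} words")
        # Assumption: labels differing only in case or spacing count as duplicates
        key = " ".join(label.lower().split())
        if key in seen:
            problems.append(f"'{label}' duplicates '{seen[key]}'")
        else:
            seen[key] = label
    return problems


print(check_labels(["Field data", "Outreach", "2017", "Raw field sampling data"]))
```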

Organizing Folders within a Project

Folder Structure

Folders are an important way to organize your project files into smaller, easier-to-manage, and identifiable units. Create a logical folder structure to help you stay organized and easily find and retrieve your stored files, and initiate it at the beginning of your project to save time and frustration.

Note

Avoid complex, deeply hierarchical folder structures, which require extra browsing to store and retrieve files. Try to keep the folder structure to no more than three levels deep. Folder structures can be simplified by including all the essential information concisely in the file name.

The following best practices are recommended for creating an effective project folder structure:

  • Organize folders by major project components.

  • Create a hierarchical system with nested subfolders (high-level folders for broad topics with more specific folders within). Examples of high-level folder topics include:

    • Images
    • Data files
    • Project admin documents
    • PDFs of related literature
  • Organize the data by data type and then by research activity.

  • Separate preliminary and final data into different folder structures.

  • Be consistent with your folder organization throughout the life of your project and/or Research Campaign.
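
If the project folders also exist on a local file system (an assumption; the Research Workspace itself is not scripted here), the three-level guideline can be checked with a short script. The function `folders_deeper_than` is a hypothetical name, and the demonstration tree is invented for illustration.

```python
import tempfile
from pathlib import Path


def folders_deeper_than(root, max_depth=3):
    """List folders nested more than max_depth levels below root."""
    root = Path(root)
    too_deep = []
    for path in root.rglob("*"):
        if path.is_dir():
            relative = path.relative_to(root)
            # Depth is the number of folder levels below the project root
            if len(relative.parts) > max_depth:
                too_deep.append(relative)
    return sorted(too_deep)


# Demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Data files" / "CTD" / "2017" / "Cruise 1").mkdir(parents=True)
    (Path(tmp) / "Images").mkdir()
    too_deep = folders_deeper_than(tmp)

print(too_deep)
```

Folders reported by such a check are candidates for flattening, for example by moving the extra detail into the file names.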

Good folder structure example

[Figure: example of a good folder structure (_images/folder_structure_example.png)]

Level of Granularity

It may be unrealistic to anticipate and pre-create every folder that will be needed for a project. Instead, consider the level of folder hierarchy that will provide sufficient structure for users and collaborators on your project to create their own subfolders.

A good approach is to establish the first one or two levels in the hierarchy, then let your collaborators create subfolders for lower levels as needed.

Granularity Examples

  • Project: Sea Monkey Forage Study

    • Parent folder: Prey Data

      • Child folder: 2017

        • Users can create subfolders within as needed
      • Child folder: 2018

        • Users can create subfolders within as needed
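
The example above can be sketched as a script that pre-creates only the first two folder levels and leaves deeper levels to collaborators. This assumes the structure is prepared on a local file system; `create_skeleton` is a hypothetical helper, and a temporary directory stands in for the real project location.

```python
import tempfile
from pathlib import Path


def create_skeleton(project_root, structure):
    """Create parent/child folders; structure maps parent name -> child names."""
    for parent, children in structure.items():
        for child in children:
            # exist_ok=True makes the script safe to re-run
            (Path(project_root) / parent / child).mkdir(parents=True, exist_ok=True)


structure = {"Prey Data": ["2017", "2018"]}

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "Sea Monkey Forage Study"
    create_skeleton(project, structure)
    created_names = sorted(
        p.relative_to(project).as_posix() for p in project.rglob("*") if p.is_dir()
    )

print(created_names)
```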

Folder Naming

How you name folders affects your and your collaborators’ ability to find and understand the folder contents. Naming folders consistently and descriptively helps users identify records at a glance and makes data easier to store and retrieve.

Folder names should adhere to the following best practices:

  • Rename default folder names generated by the Research Workspace with descriptive titles.
  • Name folders according to the areas of work to which they relate, and not after individuals. Classify file types with broad folder names.
  • Use folder names that are unambiguous and meaningfully describe the folder contents to you and your collaborators.
  • Be consistent when developing a naming scheme. Ideally, a scheme is created at the start of a project and used consistently throughout.
  • Avoid extra-long folder names; use information-rich file names instead (refer to File Naming).
  • Try to avoid duplicate folder names or paths. For example, if a folder is named “Photos” in one directory, don’t create a subfolder elsewhere named “Images”.

Examples of folder names

Poor folder names:

  • My Data
  • My Folder

Better folder names:

  • Processed herring acoustic summaries, 2012-2016
  • Raw herring acoustic data, 2012-2016
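
A simple screen for weak folder names might look like the sketch below. Both the list of generic names and the 60-character limit are illustrative assumptions, not rules from the Research Workspace, and `folder_name_problems` is a hypothetical helper.

```python
# Assumed examples of names that say nothing about the folder contents
GENERIC_NAMES = {"my data", "my folder", "new folder", "untitled", "misc"}


def folder_name_problems(name, max_length=60):
    """Return a list of problems with a proposed folder name."""
    problems = []
    cleaned = name.strip()
    if cleaned.lower() in GENERIC_NAMES:
        problems.append("generic name that does not describe the contents")
    # Assumed length limit; adjust to suit the campaign
    if len(cleaned) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems


for name in ["My Data", "Raw herring acoustic data, 2012-2016"]:
    print(name, folder_name_problems(name))
```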