Superfund Research Program


December 2023

By Mali Velasco

FAIR Data Principles
The FAIR Data Principles are a set of guidelines for data to be Findable, Accessible, Interoperable, and Reusable by others. (Image courtesy of National Library of Medicine)

A team of scientists funded by the NIEHS Superfund Research Program (SRP) published a new workflow to help researchers across disciplines share environmental health data more effectively. The workflow provides a standardized framework for collecting, organizing, and distributing scientific data so that it can be more easily understood and used by other groups.

The team, which includes SRP center investigators from Michigan State University, the University of Kentucky, the University of Louisville, and the University of Iowa, formed as part of an SRP initiative to foster data sharing and reuse.

Outlining the Workflow

The new workflow adheres to the FAIR data principles, which suggest data should be findable, accessible, interoperable, and reusable. These elements are key to data management and sharing initiatives across diverse organizations and government agencies, including NIEHS. The collaborators recommended that FAIR data management be implemented at a project’s outset to improve efficiency, but the workflow can be implemented at any point.

The workflow entails a six-step process for capturing information on experimental designs, models, and endpoints, collectively referred to as metadata.

  • Library of reporting standards. As the first step, the authors suggest creating a centralized resource, or library, that catalogues the standard data elements that must be reported, including languages, terms, and allowable values. They explained that this library should be user-friendly, for example by assigning metadata tags that let researchers easily search for standards.
  • Template creation. The second step highlights the need to develop templates that specify the minimum metadata required to meet the FAIR principles. The research team explained that each research area has a distinct set of expected metadata, so different scientific fields need different templates.
  • Facilitating metadata collection. Knowing what metadata is needed and how to collect it requires knowledge and training in data management and bioinformatics. As the third step, the authors suggest creating software tools to guide researchers without data management or bioinformatics expertise on what metadata to collect and how to do so effectively.
  • Structuring and sharing metadata. In steps four and five, the authors explain that metadata must be appropriately structured and standardized, and must include all relevant information, such as which metadata collection resource was used, before data is deposited into repositories to be accessed by other scientists. They emphasize that repositories should be publicly accessible and include tools to convert data into different required formats.
  • Promoting adoption through validation. As the last step, the collaborators suggest implementing tools to assess the quality of metadata against expected requirements. They explained that these evaluation tools could increase confidence in the quality of shared datasets and accelerate their reuse.
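
To make the template and validation steps concrete, the following is a minimal Python sketch of how a metadata record could be checked against a template of required fields and allowable values. It is an illustration only, not the tooling described by the authors: the `FieldSpec` class, the `TOXICOLOGY_TEMPLATE` list, the `validate_metadata` function, and every field name and allowed value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """One entry in a hypothetical metadata template."""
    name: str
    required: bool = True
    # An empty set means any non-empty value is accepted.
    allowed_values: frozenset = frozenset()


# Hypothetical template for one research area; names and values are
# illustrative assumptions, not taken from the published workflow.
TOXICOLOGY_TEMPLATE = [
    FieldSpec("species"),
    FieldSpec("sex", allowed_values=frozenset({"male", "female", "both"})),
    FieldSpec("dose_unit", allowed_values=frozenset({"mg/kg", "ug/kg"})),
    FieldSpec("notes", required=False),
]


def validate_metadata(record, template):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for spec in template:
        value = record.get(spec.name)
        if value is None or value == "":
            # Only required fields are reported when absent or empty.
            if spec.required:
                problems.append(f"missing required field: {spec.name}")
            continue
        if spec.allowed_values and value not in spec.allowed_values:
            problems.append(f"invalid value for {spec.name}: {value!r}")
    return problems
```

In this sketch, a record such as `{"species": "mouse", "sex": "female", "dose_unit": "mg/kg"}` passes with no problems reported, while a record with an empty `species` or an unlisted `sex` value is flagged. A different research area would supply its own template list and reuse the same check.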

“By adopting a workflow that prioritizes the needs of data generators and strategically addresses both publishers’ and data users’ requirements, I believe we can facilitate FAIR data production, improve validation of metadata by publishers, and accelerate the reuse of research data to answer novel questions,” said lead study author Rance Nault, Ph.D., of the Michigan State University SRP Center.

Redefining the Data Management and Sharing Culture

In addition to outlining this workflow, the researchers stressed that widespread adoption of better data-sharing habits depends on changing the culture around data management.

The researchers noted several potential drivers of change: journals requiring appropriate metadata before manuscript submission, funding agencies assessing adherence to the FAIR principles in grant applications, and broad training programs.