Superfund Research Program
Data Management and Analysis Core
Project Summary (2020-2025)
Real-life exposures to hazardous substances from Superfund sites occur as mixtures of many contaminants. To improve human health and protect it from exposure to hazardous substances, Superfund Research Programs (SRPs) must integrate biomedical research with environmental science and engineering. The integration of data from the diverse scientific disciplines in SRPs is critical if researchers are to fully understand the link between exposures and disease and prevent adverse health outcomes. Therefore, the data generated by the SRP represent an important research product that requires best practices for quality assurance, dissemination, and interoperability. The primary objective of the Data Management and Analysis Core (DMAC) is to discover, implement, and promulgate best practices that enable the interoperability of data between biomedical research projects and environmental science and engineering projects, in support of the goals of the overall Superfund Research Center. The DMAC will coordinate the development and refinement of an integrated data management plan for the entire Center and will work closely with project and core leaders to identify data sharing platforms and to prioritize datasets for sharing across the program. DMAC will establish data sharing guidelines and timelines and will continue to provide expert statistical support for experimental design and multivariate data analysis. DMAC will continue to develop and maintain software that provides an integrated data workflow for raw experimental data, important sample metadata, and downstream analysis pipelines, and will continue to model dose-response curves and biological response data. DMAC will customize the Experimental Data Management System with data templates for all projects and will implement new data visualizations in the Core’s Superfund Analytics Tool to facilitate data sharing across projects and cores.
DMAC will apply novel deep learning algorithms to all processed data streams to link polycyclic aromatic hydrocarbon (PAH) exposure to outcomes and, ultimately, to predict the effects of PAH mixtures on biological systems. DMAC will work closely with project and core leaders to ensure high data quality throughout the data life cycle: it will review data, document quality control procedures that account for experimental, technical, or systematic problems, and resolve problems at each step. The Data Management and Analysis Core will integrate results across all research projects and cores and will train the next generation of toxicologists to analyze their own data.