University of California-Berkeley

Superfund Research Program

Computational Biology

Project Leader: Mark J. van der Laan
Co-Investigator: Alan E. Hubbard
Grant Number: P42ES004705
Funding Period: 2006-2017

Project Summary (2006-2011)

The support provided under this Core reflects a growing shift in studies of environmental exposure, from more traditional epidemiological studies and simple experimental designs to high-dimensional biology, with its emphasis on 'omic' technologies and on complicated questions about the possible interaction of environmental exposures with high-dimensional measures of the genome, proteome, etc. These high-dimensional data sets are characterized by many (thousands of) measurements made on only a few independent units (e.g., people). Thus, the Core reflects a parallel evolution in the field of biostatistics towards developing methodologies that can both find patterns in high-dimensional data sets and provide proper statistical inference for these patterns. Besides offering consulting on traditional epidemiological experimental design and analysis questions, the Core focuses its efforts on providing the most relevant and rigorous statistical techniques to the Program. With new 'omic' technologies, biology has entered a new, more empirical phase in which the goals of the research are ambitious (e.g., discovery of regulatory gene networks affected by particular environmental toxicants) but the sample sizes are relatively small (biological replicates numbering in the tens). With these technologies has also come a proliferation of proposed methods for finding biologically meaningful patterns, typically with little theory to guide an assessment of their relative worth. The goal of this Core is to provide the project researchers with the best techniques available, software to help implement them, a computational environment that can handle computer-intensive methods on large data sets and, most importantly, rigorous statistical inference for the parameters estimated by these procedures.
The developments related to the proliferation of high-dimensional biological/epidemiological data that are particularly relevant to this Core are 1) multiple testing, 2) machine learning and loss-based estimation, 3) grouping (clustering) algorithms, 4) causal inference and 5) biological metadata and systems biology. In addition, the Core provides access to a computational environment suited to the computationally intensive methods developed for data mining and resampling-based inference.
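
As a rough illustration of the first item, multiple testing, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of p-values, such as might arise from testing thousands of genes at once. The summary does not say which procedures the Core uses; this procedure, the function name `benjamini_hochberg`, and the example p-values are all assumptions chosen only to show the idea of controlling the false discovery rate.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Illustrative Benjamini-Hochberg step-up procedure (not the Core's code).

    Returns a list of booleans, True where the hypothesis is rejected
    at false discovery rate level `alpha`.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten tests (made up for illustration).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p))
```

With these made-up values only the two smallest p-values are rejected, although five of the ten fall below an unadjusted 0.05 threshold. That gap is the kind of adjustment that matters when thousands of measurements are tested on a few independent units.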
