Skip Navigation
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Internet Explorer is no longer a supported browser.

This website may not display properly with Internet Explorer. For the best experience, please use a more recent browser such as the latest versions of Google Chrome, Microsoft Edge, and/or Mozilla Firefox. Thank you.

COVID-19 is an emerging, rapidly evolving situation.

Get the latest public health information from CDC. Get the latest research information from NIH.

Your Environment. Your Health.

Publication Detail

Title: The Essential Toolbox of Data Science: Python, R, Git, and Docker.

Authors: Pittard, W Stephen; Li, Shuzhao

Published In Methods Mol Biol, (2020)

Abstract: The daily work in data science involves a set of essential tools: the programming languages Python and R, the version control tool Git and the virtualization tool Docker. Proficiency in at least one programming language is required for data science. R is tied to a computing environment that focuses on statistics, in which many new algorithms in genomics and biomedicine are first published. Python has a root in system administration, and is a superb language for general programming. Version control is critical to managing complex projects, even if software development is not involved. Docker container is becoming a key tool for deployment, portability, and reproducibility. This chapter provides a self-contained practical guide of these topics so that readers can use it as a reference and to plan their training.

PubMed ID: 31953823 Exiting the NIEHS site

MeSH Terms: No MeSH terms associated with this publication

Back
to Top