Title: ODAL: A one-shot distributed algorithm to perform logistic regressions on electronic health records data from multiple clinical sites.
Authors: Duan, Rui; Boland, Mary Regina; Moore, Jason H; Chen, Yong
Published In: Pac Symp Biocomput (2019)
Abstract: Electronic Health Records (EHR) contain extensive information on various health outcomes and risk factors, and have therefore been broadly used in healthcare research. Integrating EHR data from multiple clinical sites can accelerate knowledge discovery and risk prediction by providing a larger sample from a more general population, which potentially reduces clinical bias and improves estimation and prediction accuracy. To overcome the barrier of patient-level data sharing, distributed algorithms have been developed to conduct statistical analyses across multiple sites by sharing only aggregated information. Current distributed algorithms often require iterative evaluation and transfer of information across sites, which can lead to a high communication cost in practical settings. In this study, we propose a privacy-preserving and communication-efficient distributed algorithm for logistic regression that does not require iterative communication across sites. Our simulation study showed that the algorithm achieved accuracy comparable to that of the oracle estimator, for which data are pooled together. We applied our algorithm to EHR data from the University of Pennsylvania health system to evaluate the risks of fetal loss due to various medication exposures.
PubMed ID: 30864308
MeSH Terms: Algorithms*; Computational Biology/methods*; Computer Communication Networks/statistics & numerical data; Computer Simulation; Drug-Related Side Effects and Adverse Reactions; Electronic Health Records/statistics & numerical data*; Female; Fetal Death/etiology; Humans; Infant, Newborn; Information Dissemination; International Classification of Diseases; Likelihood Functions; Logistic Models*; Medical Informatics/methods; Pregnancy
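Illustrative sketch: The abstract describes a logistic regression fitted across sites by sharing only aggregated information, without iterative communication, but it does not spell out the estimator. The Python sketch below is one way such a one-shot scheme could look, assuming a surrogate-likelihood construction: a lead site fits a local model, every site returns only its average log-likelihood gradient at that initial estimate, and the lead site refits with a linear gradient correction. The function names (`fit_logistic`, `odal_estimate`), the Newton solver, and the lead-site convention are assumptions made for illustration, not the authors' code.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def avg_gradient(beta, X, y):
    """Average log-likelihood gradient at beta: an aggregate, not patient-level data."""
    return X.T @ (y - sigmoid(X @ beta)) / X.shape[0]


def fit_logistic(X, y, shift=None, n_iter=50, tol=1e-10):
    """Maximize (1/n) * loglik(beta) + shift @ beta with Newton's method.

    With shift=None this is an ordinary logistic regression fit.
    """
    n, p = X.shape
    beta = np.zeros(p)
    shift = np.zeros(p) if shift is None else shift
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        grad = X.T @ (y - mu) / n + shift
        # Negative Hessian of the average log-likelihood (positive definite
        # when X has full column rank).
        neg_hess = (X * (mu * (1.0 - mu))[:, None]).T @ X / n
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def odal_estimate(sites):
    """One-shot estimate from a list of (X, y) pairs; sites[0] is the lead site.

    Assumed protocol (hypothetical, for illustration):
      1. The lead site fits a local model and broadcasts beta_bar.
      2. Each site returns its sample size and average gradient at beta_bar.
      3. The lead site refits locally with a gradient-correction term.
    """
    X1, y1 = sites[0]
    beta_bar = fit_logistic(X1, y1)

    # The only quantities that leave each site: (n_k, gradient at beta_bar).
    summaries = [(X.shape[0], avg_gradient(beta_bar, X, y)) for X, y in sites]
    n_total = sum(n for n, _ in summaries)
    global_grad = sum(n * g for n, g in summaries) / n_total

    # Surrogate objective: local average log-likelihood plus a linear term
    # that aligns its gradient at beta_bar with the all-site gradient.
    correction = global_grad - avg_gradient(beta_bar, X1, y1)
    return fit_logistic(X1, y1, shift=correction)
```

To use the sketch, pass a list of `(X, y)` arrays with the lead site first, where each `X` already contains an intercept column and each `y` is a 0/1 vector. Only a coefficient vector goes out and one gradient vector per site comes back, which is what makes the scheme a single communication round under these assumptions.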