Abstract

Data science underpins modern AI and many advances in healthcare, yet human judgment permeates every stage of the data science life cycle. These judgment calls introduce hidden uncertainties that go well beyond sampling variability and drive many of the risks associated with AI.

We introduce veridical data science, grounded in the three fundamental principles of Predictability, Computability, and Stability (PCS), to make such uncertainties explicit and assessable and to aggregate reality-checked algorithms for better results. The PCS framework unifies and extends best practices in statistics and machine learning and is illustrated through healthcare applications, including identifying genetic drivers of heart disease, reducing the cost of prostate cancer detection, and improving uncertainty quantification beyond standard conformal prediction.
