Abstract

Advances in genome-wide association studies and the increasing availability of summary genetic association data have made two-sample Mendelian Randomization (MR) with summary data increasingly popular. Conventional two-sample MR methods often use the same sample both to select relevant genetic variants and to construct the final causal estimates. This practice often leads to biased causal effect estimates because of the well-known "winner's curse" phenomenon. To address this fundamental challenge, we first examine the consequences of the winner's curse for causal effect estimation, both theoretically and empirically. We then propose a novel framework that systematically breaks the winner's curse, leading to unbiased association effect estimates for the selected genetic variants. Building upon the proposed framework, we introduce a novel rerandomized inverse variance weighted (RIVW) estimator that is consistent when selection and parameter estimation are conducted on the same sample. Under appropriate conditions, we show that the proposed RIVW estimator of the causal effect is asymptotically normal and that its variance can be well estimated. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical examples.
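To make the winner's curse concrete, the following is a minimal simulation sketch. It is not the RIVW procedure itself. It only illustrates the selection bias that arises when variants are chosen and their effects estimated on the same sample. The function name, the common true effect, the standard error, and the z-score threshold are all illustrative assumptions, not values from the paper.

```python
import numpy as np


def winners_curse_demo(n_snps=20000, true_effect=0.02, se=0.01,
                       z_threshold=3.0, seed=0):
    """Compare the average estimate over all variants with the average
    over variants that pass a significance threshold on the same data.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    # Every variant shares the same true effect, so any gap between the
    # two averages is caused by selection alone.
    estimates = true_effect + se * rng.standard_normal(n_snps)
    # Select variants whose absolute z-score exceeds the threshold,
    # using the same estimates that are later averaged.
    selected = np.abs(estimates / se) > z_threshold
    return {
        "mean_all": float(estimates.mean()),
        "mean_selected": float(estimates[selected].mean()),
        "n_selected": int(selected.sum()),
    }


result = winners_curse_demo()
print(result)
```

In this setup the average over all variants stays close to the true effect, while the average over the selected variants is noticeably larger, because only estimates that happened to land above the threshold are kept.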

Video Recording