Improving Precision of RCT-Based CATE Estimation using Data Borrowing with Double Calibration
Understanding how treatment effects vary across patient characteristics is essential for personalized medicine, yet randomized controlled trials (RCTs) are often underpowered to detect heterogeneous treatment effects (HTEs). We propose a framework that improves the efficiency of conditional average treatment effect (CATE) estimation in RCTs by leveraging large observational studies (OS) while preserving the unbiasedness of RCT estimates. By framing CATE estimation as a supervised learning problem, we show that estimation variance is minimized using the counterfactual mean outcome (CMO) as an augmentation function. We derive finite-sample error bounds and establish conditions under which OS data improves CMO estimation, and thus CATE efficiency, even in the presence of confounding in the OS or outcome distribution shifts between populations. We introduce R-OSCAR (Robust Observational Studies for CMO-Augmented RCT), a two-stage estimator that calibrates OS outcome predictions to the RCT population and corrects residual biases through regularized regression. Simulations show that R-OSCAR can reduce the RCT sample size needed for HTE detection by up to 75%, maintaining robustness to model misspecification. Application to the Tennessee STAR study confirms these efficiency gains. Our framework offers a principled approach to integrating observational and experimental data using tools from statistical learning and transfer learning.
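The variance argument in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' R-OSCAR estimator. It assumes a standard inverse-propensity pseudo-outcome whose conditional mean is the CATE, a known randomization probability `pi`, and a CMO taken to be `(1 - pi) * mu1 + pi * mu0`, computed here from the simulated truth rather than estimated from observational data. The names `pseudo_outcome` and `fit_line` and the data-generating model are hypothetical.

```python
import numpy as np

def pseudo_outcome(y, a, pi, m=None):
    """Pseudo-outcome whose conditional mean given X equals the CATE.

    m is an optional augmentation function evaluated at X. Any m leaves the
    conditional mean unchanged because E[a - pi | X] = 0 in a randomized trial.
    """
    if m is None:
        m = np.zeros_like(y)
    return (a - pi) / (pi * (1 - pi)) * (y - m)

def fit_line(x, z):
    """Least-squares line z ~ intercept + slope * x."""
    slope, intercept = np.polyfit(x, z, 1)
    return intercept, slope

# Simulated RCT (assumed data-generating model, for illustration only).
rng = np.random.default_rng(0)
n, pi = 20000, 0.5
x = rng.normal(size=n)
a = rng.binomial(1, pi, size=n)
mu0 = 2.0 + 3.0 * x            # control-arm mean outcome
tau = 1.0 + 0.5 * x            # true CATE
mu1 = mu0 + tau                # treated-arm mean outcome
y = np.where(a == 1, mu1, mu0) + rng.normal(size=n)

# Oracle augmentation: assumed form of the counterfactual mean outcome.
cmo = (1 - pi) * mu1 + pi * mu0

plain = pseudo_outcome(y, a, pi)            # no augmentation
augmented = pseudo_outcome(y, a, pi, cmo)   # CMO-augmented

# Supervised-learning step: regress each pseudo-outcome on x.
plain_fit = fit_line(x, plain)
augmented_fit = fit_line(x, augmented)

print("variance, plain:    ", plain.var())
print("variance, augmented:", augmented.var())
print("fitted (intercept, slope), plain:    ", plain_fit)
print("fitted (intercept, slope), augmented:", augmented_fit)
```

Both regressions target the same CATE line, but the augmented pseudo-outcome has far smaller variance, so its fit is more stable at a given sample size. In practice the CMO is unknown and must be estimated, which is where the abstract's use of observational data comes in.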
Amir Asiaee, Chiara Di Gravio, Cole Beck, Yuting Mei, Samhita Pal, Jared D. Huling
Medical Research Methods; Biological Science Research Methods; Biological Science Research Techniques
Amir Asiaee, Chiara Di Gravio, Cole Beck, Yuting Mei, Samhita Pal, Jared D. Huling. Improving Precision of RCT-Based CATE Estimation using Data Borrowing with Double Calibration [EB/OL]. (2025-07-20) [2025-08-10]. https://arxiv.org/abs/2306.17478.