rox: A statistical model for regression with missing values
Abstract: High-dimensional omics datasets frequently contain missing values, which typically arise when concentrations fall below the limit of detection (LOD) of the profiling platform. Such missing values significantly limit downstream statistical analysis and the interpretation of results. Two common ways to deal with this issue are removing samples with missing values and imputation, which substitutes the missing measurements with reasonable estimates. Both approaches, however, suffer from various shortcomings and pitfalls. In this paper, we present "rox", a novel statistical model for the analysis of omics data with missing values that does not require imputation. The model directly incorporates missing values as "low" concentrations into the calculation. We show the superiority of rox over common approaches on simulated data and on six metabolomics datasets. By fully leveraging the information contained in LOD-based missing values, rox provides a powerful tool for the statistical analysis of omics data.
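To make the idea of treating missing values as "low" concentrations concrete, the following is a minimal illustrative sketch, not the rox model itself. It assumes a rank-based reading of the abstract: a Kendall-style concordance between a measurement and an outcome in which a missing measurement (`None`) is ordered below every observed value, and pairs in which both measurements are missing are skipped because their relative order is unknown. The function name `concordance_with_lod_missing` and the handling of ties are assumptions made for this example.

```python
from itertools import combinations


def concordance_with_lod_missing(x, y):
    """Kendall-style concordance between x and y.

    Assumption for illustration: a None in x means "below the limit of
    detection" and is ordered below every observed value of x.
    Returns (concordant - discordant) / informative pairs, in [-1, 1].
    """
    concordant = 0
    discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi is None and xj is None:
            continue  # both below LOD: relative order unknown
        if yi == yj:
            continue  # tied outcomes carry no ordering information
        if xi is None:
            dx = -1  # missing is treated as lower than observed
        elif xj is None:
            dx = 1
        else:
            dx = (xi > xj) - (xi < xj)
        if dx == 0:
            continue  # tied measurements
        dy = 1 if yi > yj else -1
        if dx == dy:
            concordant += 1
        else:
            discordant += 1
    informative = concordant + discordant
    return (concordant - discordant) / informative if informative else 0.0


# Two samples fall below the LOD; no sample is dropped and nothing is imputed.
measurements = [None, 1.0, 2.0, None, 3.0]
outcome = [0, 1, 2, 0, 3]
print(concordance_with_lod_missing(measurements, outcome))
```

In this sketch the samples with missing measurements still contribute to every pair in which the other measurement is observed, which is the information that removing those samples would discard.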
Buyukozkan Mustafa, Krumsiek Jan, Benedetti Elisa
Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine
Research methods in the biological sciences; research techniques in the biological sciences
Buyukozkan Mustafa, Krumsiek Jan, Benedetti Elisa. rox: A statistical model for regression with missing values [EB/OL]. (2025-03-28) [2025-05-29]. https://www.biorxiv.org/content/10.1101/2022.04.15.488427.