National Preprint Platform

Mortality Risk Assessment and Interpretability Analysis for Preterm Infants in the ICU Using Machine Learning Models


Objective: To use machine learning algorithms to predict the risk of ICU mortality in preterm infants, giving clinicians a decision-support tool for early diagnosis and risk assessment. Methods: Clinical data on preterm infants were collected retrospectively from the PIC database, and cases were divided into a death group and a survival group according to ICU outcome. Key clinical features that may affect the prognosis of preterm infants were selected based on LASSO regression and multivariate logistic regression. The data were balanced with the SMOTE algorithm, and seven machine learning models (e.g., LightGBM, random forest) were built and evaluated. The Shapley Additive Explanations (SHAP) algorithm was used for model interpretation. Results: A total of 923 infants were included, 886 in the survival group and 37 in the death group, with 38 clinical features collected. LASSO selected 8 variables closely associated with ICU death in preterm infants, including lactate, chloride concentration, neutrophils, and red blood cell distribution width. Multivariate logistic regression showed that lactate and respiratory rate were independent factors influencing ICU outcome in preterm infants. The LightGBM model reached an AUC of 0.972 and outperformed the other models on accuracy, precision, and other metrics, while SHAP analysis improved its interpretability. Respiratory rate and lactate contributed most to the prediction of mortality risk in preterm infants. Conclusion: This study provides a reliable tool for early outcome identification and intervention in preterm infants and highlights the importance of key physiological indicators. Validation on multi-center data is needed to improve the model's generalizability, and algorithm performance should be further optimized.
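The abstract reports a strongly imbalanced cohort (886 survivors, 37 deaths) that was balanced with SMOTE before model fitting. As a rough illustration of what that step does, the sketch below implements the core SMOTE idea in plain Python: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, the value `k=5`, the two toy features, and all feature values are assumptions made for this example; the abstract does not say which implementation or settings the authors used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = random.Random(seed)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base sample (Euclidean distance)
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data with the class sizes reported in the abstract (886 survivors, 37 deaths).
# The two features and their values are invented for illustration only.
rng = random.Random(42)
survivors = [[rng.gauss(2.0, 0.8), rng.gauss(45, 8)] for _ in range(886)]
deaths = [[rng.gauss(5.0, 1.5), rng.gauss(60, 10)] for _ in range(37)]

# Oversample the minority class until both classes have the same size.
new_deaths = smote(deaths, n_new=len(survivors) - len(deaths))
balanced_deaths = deaths + new_deaths
print(len(survivors), len(balanced_deaths))
```

In practice, oversampling of this kind is usually applied only to the training split so that the evaluation data stay untouched; the abstract does not state how this was handled.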

Su Yanfeng (苏燕凤), Hong Suru (洪素茹), Chen Yushuang (陈钰霜), Wu Xiayang (吴夏阳)

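The Second Affiliated Hospital of Xiamen Medical College; Xiamen Children's Hospital; Xiamen Children's Hospital; Xiamen Children's Hospital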

10.12201/bmr.202503.00066

Clinical Medicine; Medical Research Methods

Keywords: preterm infants; ICU mortality risk; machine learning; LightGBM model; risk prediction

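Su Yanfeng, Hong Suru, Chen Yushuang, Wu Xiayang. Mortality Risk Assessment and Interpretability Analysis for Preterm Infants in the ICU Using Machine Learning Models [EB/OL]. (2025-02-18)[2025-08-16]. https://www.biomedrxiv.org.cn/article/doi/bmr.202503.00066.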
