Text-guided multi-stage cross-perception network for medical image segmentation
Medical image segmentation plays a crucial role in clinical medicine, serving as a tool for auxiliary diagnosis, treatment planning, and disease monitoring, thus facilitating physicians in the study and treatment of diseases. However, existing medical image segmentation methods are limited by the weak semantic expression of the target segmentation regions, which is caused by the low contrast between the target and non-target regions. To address this limitation, text prompts offer great potential for capturing lesion locations. However, existing text-guided methods suffer from insufficient cross-modal interaction and inadequate cross-modal feature expression. To resolve these issues, we propose the Text-guided Multi-stage Cross-perception network (TMC). In TMC, we introduce a multi-stage cross-attention module to enhance the model's understanding of semantic details and a multi-stage alignment loss to improve the consistency of cross-modal semantics. Experimental results demonstrate that TMC achieves superior performance, with Dice scores of 84.77%, 78.50%, and 88.73% on three public datasets (QaTa-COV19, MosMedData, and Breast), outperforming UNet-based networks and existing text-guided methods.
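The abstract names two components: a cross-attention stage that injects text cues into image features, and an alignment loss that keeps the two modalities' semantics consistent. Below is a minimal sketch of these two ideas, not the authors' released code: module names, tensor shapes, pooling choices, and the cosine form of the loss are all assumptions for illustration.

```python
# Hypothetical sketch of one text-image cross-perception stage and a
# cosine-based cross-modal alignment loss. Shapes and names are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextImageCrossAttention(nn.Module):
    """One cross-perception stage: image tokens attend to text tokens."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        # query = image features, key/value = text features; residual + norm.
        fused, _ = self.attn(query=img_feat, key=txt_feat, value=txt_feat)
        return self.norm(img_feat + fused)


def alignment_loss(img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
    """Pull pooled image and text embeddings of one stage together (cosine distance)."""
    img_vec = F.normalize(img_feat.mean(dim=1), dim=-1)  # (B, C)
    txt_vec = F.normalize(txt_feat.mean(dim=1), dim=-1)  # (B, C)
    return (1.0 - (img_vec * txt_vec).sum(dim=-1)).mean()


if __name__ == "__main__":
    img = torch.randn(2, 196, 256)  # e.g. 14x14 patch tokens, 256-dim
    txt = torch.randn(2, 12, 256)   # e.g. 12 text tokens, 256-dim
    out = TextImageCrossAttention(256)(img, txt)
    print(out.shape, alignment_loss(out, txt).item())
```

In a multi-stage design, a block like this would be applied at several encoder or decoder resolutions, with the alignment loss summed over stages; the single-stage version above is only meant to show the interface.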
Gaoyu Chen
Medical research methods; current state and development of medicine
Gaoyu Chen. Text-guided multi-stage cross-perception network for medical image segmentation [EB/OL]. (2025-06-09) [2025-06-27]. https://arxiv.org/abs/2506.07475.