CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion

Source: arXiv
Abstract

Cross-domain few-shot object detection (CD-FSOD) aims to detect novel objects across different domains with limited class instances. Feature confusion, including object-background confusion and object-object confusion, presents significant challenges in both cross-domain and few-shot settings. In this work, we introduce CDFormer, a cross-domain few-shot object detection transformer against feature confusion, to address these challenges. The method tackles feature confusion through two key modules: object-background distinguishing (OBD) and object-object distinguishing (OOD). The OBD module leverages a learnable background token to differentiate between objects and background, while the OOD module enhances the distinction between objects of different classes. Experimental results demonstrate that CDFormer outperforms previous state-of-the-art approaches, achieving 12.9%, 11.0%, and 10.4% mAP improvements under the 1-, 5-, and 10-shot settings, respectively, when fine-tuned.
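
The abstract does not say how the background token is used, so the following is only an illustrative sketch of the general idea, not the authors' OBD module. It assumes a query feature is scored by cosine similarity against per-class prototypes plus one extra background vector, so that "background" competes with the object classes. The names `classify`, `class_prototypes`, `background_token`, and `temperature` are hypothetical, and the background vector is hand-picked here, whereas in the paper it is learned.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(query, class_prototypes, background_token, temperature=0.1):
    """Score a query feature against class prototypes plus one background slot.

    Hypothetical helper: the background token is treated as one more
    candidate label, so a region that matches no class can fall to it.
    """
    labels = list(class_prototypes) + ["background"]
    vectors = list(class_prototypes.values()) + [background_token]
    probs = softmax([cosine(query, v) / temperature for v in vectors])
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))


# Toy 3-dimensional features; all values are made up for illustration.
prototypes = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
background = [0.0, 0.0, 1.0]

object_label, object_probs = classify([0.9, 0.1, 0.0], prototypes, background)
clutter_label, clutter_probs = classify([0.1, 0.0, 0.95], prototypes, background)
```

In this toy setup the first query lies closest to the `cat` prototype and the second lies closest to the background vector, so the second is rejected as background rather than forced into an object class.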

Boyuan Meng, Xiaohan Zhang, Peilin Li, Zhe Wu, Yiming Li, Wenkai Zhao, Beinan Yu, Hui-Liang Shen

Subject: Computing technology, computer technology

Boyuan Meng, Xiaohan Zhang, Peilin Li, Zhe Wu, Yiming Li, Wenkai Zhao, Beinan Yu, Hui-Liang Shen. CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion [EB/OL]. (2025-05-01) [2025-06-05]. https://arxiv.org/abs/2505.00938.