CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection
Convolutional neural networks (CNNs) have long been the cornerstone of object detection, but their limited receptive fields often hinder their ability to capture global contextual information. This paper argues that using extracted features effectively is as important as the feature extraction process itself. We critically re-evaluate the DETR-inspired detection head architecture, question whether its self-attention mechanism is indispensable, and find significant information redundancy. To address these issues, we introduce the Context-Gated Scale-Adaptive Detection Network (CSDN), a Transformer-based detection head inspired by natural language processing architectures and human visual perception. CSDN aims to make efficient use of the features from the CNN backbone by replacing the traditional stacked self-attention and cross-attention layers with a novel gating mechanism. This mechanism enables each region of interest (ROI) to adaptively select and combine feature dimensions and scale information from multiple attention patterns. CSDN provides more powerful global context modeling and adapts better to objects of different sizes and structures. The proposed detection head can directly replace the native heads of various CNN-based detectors, and only a few rounds of fine-tuning on the pre-trained weights are needed to significantly improve detection accuracy, which avoids extensive retraining of various layer modules to achieve small improvements.
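The abstract does not specify the form of the gate, so the following is only a minimal illustrative sketch of the general idea of per-ROI gating over several attention patterns, not the paper's actual implementation. It assumes a softmax gate computed from a linear projection of the ROI query, and it uses one scalar gate per pattern rather than per feature dimension. The names `gated_combine`, `roi_query`, `pattern_outputs`, and `gate_weights` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_combine(roi_query, pattern_outputs, gate_weights):
    """Mix K per-pattern feature vectors for a single ROI.

    roi_query:       list of D floats describing the ROI (assumed input).
    pattern_outputs: K lists of floats, one feature vector per attention
                     pattern for this ROI (assumed input).
    gate_weights:    K lists of D floats, a hypothetical learned projection
                     with one row per attention pattern.
    Returns (gates, mixed): the K gate values and the gated feature vector.
    """
    # One logit per attention pattern: dot product of its row with the query.
    logits = [sum(w * q for w, q in zip(row, roi_query)) for row in gate_weights]
    # Assumption: the gate is a softmax, so the K gate values sum to 1.
    gates = softmax(logits)
    dim = len(pattern_outputs[0])
    # Gate-weighted sum of the pattern outputs, computed per feature dimension.
    mixed = [
        sum(g * out[d] for g, out in zip(gates, pattern_outputs))
        for d in range(dim)
    ]
    return gates, mixed


if __name__ == "__main__":
    query = [1.0, 0.0]
    outputs = [[2.0, 0.0], [0.0, 4.0]]
    # Zero projection weights give a uniform gate, so the result is the mean.
    gates, mixed = gated_combine(query, outputs, [[0.0, 0.0], [0.0, 0.0]])
    print(gates, mixed)
```

In this simplified form each ROI produces its own gate values, so different ROIs can weight the same set of attention patterns differently. A per-dimension gate, closer to the abstract's wording about selecting feature dimensions, would replace the scalar `gates` with one value per pattern and dimension.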
Wei Haolin
Computing technology, computer technology
Wei Haolin. CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection[EB/OL]. (2025-06-21)[2025-07-01]. https://arxiv.org/abs/2506.17679.