MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture
MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture
Hyperspectral image (HSI) classification faces challenges such as high-dimensional data, limited training samples, and spectral redundancy, which often lead to overfitting and insufficient generalization capability. This paper proposes a novel MVNet network architecture that integrates 3D-CNN's local feature extraction, Transformer's global modeling, and Mamba's linear complexity sequence modeling capabilities, achieving efficient spatial-spectral feature extraction and fusion. MVNet features a redesigned dual-branch Mamba module, including a State Space Model (SSM) branch and a non-SSM branch employing 1D convolution with SiLU activation, enhancing modeling of both short-range and long-range dependencies while reducing computational latency in traditional Mamba. The optimized HSI-MambaVision Mixer module overcomes the unidirectional limitation of causal convolution, capturing bidirectional spatial-spectral dependencies in a single forward pass through decoupled attention that focuses on high-value features, alleviating parameter redundancy and the curse of dimensionality. On IN, UP, and KSC datasets, MVNet outperforms mainstream hyperspectral image classification methods in both classification accuracy and computational efficiency, demonstrating robust capability in processing complex HSI data.
Guandong Li、Mengxia Ye
遥感技术
Guandong Li,Mengxia Ye.MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture[EB/OL].(2025-07-06)[2025-07-20].https://arxiv.org/abs/2507.04409.点此复制
评论