Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control
Deep neural network (DNN)-based policy models, such as vision-language-action (VLA) models, excel at automating complex decision-making from multi-modal inputs. However, scaling these models greatly increases computational overhead, complicating deployment in resource-constrained settings like robot manipulation and autonomous driving. To address this, we propose Saliency-Aware Quantized Imitation Learning (SQIL), which combines quantization-aware training with a selective loss-weighting strategy for mission-critical states. By identifying these states via saliency scores and emphasizing them in the training loss, SQIL preserves decision fidelity under low-bit precision. We validate SQIL's generalization capability across extensive simulation benchmarks with environment variations, real-world tasks, and cross-domain tasks (self-driving, physics simulation), consistently recovering full-precision performance. Notably, a 4-bit weight-quantized VLA model for robotic manipulation achieves up to 2.5x speedup and 2.5x energy savings on an edge GPU with minimal accuracy loss. These results underline SQIL's potential for efficiently deploying large IL-based policy models on resource-limited devices.
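The abstract describes two ingredients: quantization-aware training (low-bit weight quantization during training) and a selective loss-weighting that emphasizes mission-critical states identified by saliency scores. The paper's exact saliency metric and weighting scheme are not given in the abstract, so the sketch below is a minimal, hypothetical illustration: symmetric 4-bit fake quantization plus a behavior-cloning MSE loss that up-weights the top-saliency fraction of states. Function names, the `top_frac` / `boost` parameters, and the top-k thresholding are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fake_quantize(w, bits=4):
    # Symmetric uniform fake quantization: snap weights to a low-bit grid
    # while keeping float storage, as done in quantization-aware training.
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def saliency_weighted_bc_loss(pred, target, saliency, top_frac=0.25, boost=2.0):
    # Behavior-cloning MSE in which the most salient states (hypothetical
    # top-`top_frac` by saliency score) receive a larger weight `boost`,
    # mimicking SQIL's emphasis on mission-critical states.
    per_state = np.mean((pred - target) ** 2, axis=-1)
    k = max(1, int(top_frac * len(saliency)))
    thresh = np.sort(saliency)[-k]
    weights = np.where(saliency >= thresh, boost, 1.0)
    return np.sum(weights * per_state) / np.sum(weights)
```

In a full QAT loop, `fake_quantize` would be applied to the policy's weights on each forward pass and the weighted loss back-propagated through the straight-through estimator; this sketch only shows the two loss-side components.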
Seongmin Park, Hyungmin Kim, Sangwoo Kim, Wonseok Jeon, Juyoung Yang, Byeongwook Jeon, Yoonseon Oh, Jungwook Choi
Subjects: Automation technology and equipment; Computing and computer technology
Seongmin Park, Hyungmin Kim, Sangwoo Kim, Wonseok Jeon, Juyoung Yang, Byeongwook Jeon, Yoonseon Oh, Jungwook Choi. Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control [EB/OL]. (2025-05-21) [2025-06-12]. https://arxiv.org/abs/2505.15304.