
Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning

Source: arXiv
Abstract

Current self-supervised stereo matching relies on the photometric consistency assumption, which breaks down in occluded regions due to ill-posed correspondences. To address this issue, we propose BaCon-Stereo, a simple yet effective contrastive learning framework for self-supervised stereo network training in both non-occluded and occluded regions. We adopt a teacher-student paradigm with multi-baseline inputs, in which the stereo pairs fed into the teacher and student share the same reference view but differ in target views. Geometrically, regions occluded in the student's target view are often visible in the teacher's, making it easier for the teacher to predict in these regions. The teacher's prediction is rescaled to match the student's baseline and then used to supervise the student. We also introduce an occlusion-aware attention map to better guide the student in learning occlusion completion. To support training, we synthesize a multi-baseline dataset BaCon-20k. Extensive experiments demonstrate that BaCon-Stereo improves prediction in both occluded and non-occluded regions, achieves strong generalization and robustness, and outperforms state-of-the-art self-supervised methods on both KITTI 2015 and 2012 benchmarks. Our code and dataset will be released upon paper acceptance.
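
The rescaling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes rectified cameras that share the same focal length and reference view, so that disparity is proportional to baseline (d = f·B/Z) and the teacher's disparity map converts to the student's baseline by the ratio of the two baselines. The function name `rescale_disparity` and its parameters are hypothetical and are not taken from the authors' code, which had not been released at the time of the abstract.

```python
import numpy as np


def rescale_disparity(teacher_disp, teacher_baseline, student_baseline):
    """Convert a disparity map predicted at one baseline to another baseline.

    Assumption (not from the paper): both stereo pairs are rectified, share
    the reference view and focal length, so disparity scales linearly with
    the baseline: d_student = d_teacher * B_student / B_teacher.
    """
    if teacher_baseline <= 0 or student_baseline <= 0:
        raise ValueError("baselines must be positive")
    scale = student_baseline / teacher_baseline
    return np.asarray(teacher_disp, dtype=np.float64) * scale


# Usage: a teacher pair with twice the student's baseline sees twice the disparity,
# so its prediction is halved before it is compared with the student's output.
teacher_disp = np.array([[10.0, 20.0], [30.0, 40.0]])
pseudo_label = rescale_disparity(teacher_disp, teacher_baseline=0.2, student_baseline=0.1)
```

How the rescaled map is then weighted by the occlusion-aware attention map is not specified in the abstract, so it is left out of this sketch.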

Peng Xu, Eryun Liu, Zhiyu Xiang, Jingyun Fu, Kai Wang, Tianyu Pu, Chaojie Ji, Tingming Bai

Computing technology and computer technology; information science and information technology

Peng Xu, Eryun Liu, Zhiyu Xiang, Jingyun Fu, Kai Wang, Tianyu Pu, Chaojie Ji, Tingming Bai. Self-Supervised Stereo Matching with Multi-Baseline Contrastive Learning [EB/OL]. (2025-08-14) [2025-08-24]. https://arxiv.org/abs/2508.10838.
