Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection
Self-supervised learning (SSL) has enabled Vision Transformers (ViTs) to learn robust representations from large-scale natural image datasets, enhancing their generalization across domains. In retinal imaging, foundation models pretrained on either natural or ophthalmic data have shown promise, but the benefits of in-domain pretraining remain uncertain. To investigate this, we benchmark six SSL-pretrained ViTs on seven digital fundus image (DFI) datasets totaling 70,000 expert-annotated images for the task of moderate-to-late age-related macular degeneration (AMD) identification. Our results show that iBOT pretrained on natural images achieves the highest out-of-distribution generalization, with AUROCs of 0.80-0.97, outperforming both domain-specific models (AUROCs of 0.78-0.96) and a baseline ViT-L with no pretraining (AUROCs of 0.68-0.91). These findings highlight the value of foundation models in improving AMD identification and challenge the assumption that in-domain pretraining is necessary. Furthermore, we release BRAMD, an open-access dataset (n=587) of DFIs with AMD labels from Brazil.
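The abstract reports model comparisons via AUROC, the area under the ROC curve for the binary AMD-identification task. As a hedged illustration (not the authors' evaluation code), the metric can be computed from per-image probability scores with the rank-sum (Mann-Whitney U) formulation; the function and example data below are hypothetical:

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) statistic.

    labels: list of 0/1 ground-truth AMD labels.
    scores: list of predicted probabilities (higher = more likely AMD).
    """
    n = len(scores)
    # Assign 1-based average ranks to scores, giving tied scores the mean rank.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Sum of ranks over the positive class, then normalize to [0, 1].
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: 2 negatives, 2 positives with illustrative scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking, while 1.0 means every AMD-positive image is scored above every negative, which is why the paper's 0.80-0.97 range for iBOT indicates strong out-of-distribution discrimination.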
Benjamin A. Cohen, Jonathan Fhima, Joachim A. Behar, Meishar Meisel, Baskin Meital, Luis Filipe Nakayama, Eran Berkowitz
Ophthalmology; Medical Research Methods
Benjamin A. Cohen, Jonathan Fhima, Joachim A. Behar, Meishar Meisel, Baskin Meital, Luis Filipe Nakayama, Eran Berkowitz. Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection [EB/OL]. (2025-05-08) [2025-07-25]. https://arxiv.org/abs/2505.05291.