
Effortless Vision-Language Model Specialization in Histopathology without Annotation

Source: arXiv
Abstract

Recent advances in Vision-Language Models (VLMs) in histopathology, such as CONCH and QuiltNet, have demonstrated impressive zero-shot classification capabilities across various tasks. However, their general-purpose design may lead to suboptimal performance in specific downstream applications. While supervised fine-tuning methods address this issue, they require manually labeled samples for adaptation. This paper investigates annotation-free adaptation of VLMs through continued pretraining on domain- and task-relevant image-caption pairs extracted from existing databases. Our experiments on two VLMs, CONCH and QuiltNet, across three downstream tasks reveal that these pairs substantially enhance both zero-shot and few-shot performance. Notably, with larger training sizes, continued pretraining matches the performance of few-shot methods while eliminating manual labeling. Its effectiveness, task-agnostic design, and annotation-free workflow make it a promising pathway for adapting VLMs to new histopathology tasks. Code is available at https://github.com/DeepMicroscopy/Annotation-free-VLM-specialization.
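The adaptation recipe described above is, at its core, standard CLIP-style contrastive training continued on mined image-caption pairs rather than on labeled examples. Below is a minimal PyTorch sketch of what such continued pretraining could look like; the model interface, data loader, and hyperparameters are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Minimal sketch: continued pretraining of a CLIP-style VLM on
# image-caption pairs with a symmetric contrastive (InfoNCE) loss.
# The model wrapper, loader, and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric image-text InfoNCE loss over a batch of paired embeddings."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # (B, B) cosine similarities
    targets = torch.arange(logits.size(0), device=logits.device)  # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

def continue_pretraining(model, pair_loader, epochs=5, lr=1e-5):
    """Continue pretraining on domain- and task-relevant image-caption pairs.

    `model` is assumed to expose encode_image / encode_text, as CLIP-style
    histopathology VLMs such as CONCH and QuiltNet do. `pair_loader` yields
    (images, tokenized_captions) batches mined from an existing database,
    so no manual labeling is required.
    """
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, captions in pair_loader:
            loss = contrastive_loss(model.encode_image(images),
                                    model.encode_text(captions))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

After such continued pretraining, the specialized model is used exactly as before, e.g. for zero-shot classification by comparing image embeddings against class-prompt text embeddings.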

Jingna Qiu, Nishanth Jain, Jonas Ammeling, Marc Aubreville, Katharina Breininger

Subject: Medical research methods

Jingna Qiu, Nishanth Jain, Jonas Ammeling, Marc Aubreville, Katharina Breininger. Effortless Vision-Language Model Specialization in Histopathology without Annotation [EB/OL]. (2025-08-11) [2025-08-24]. https://arxiv.org/abs/2508.07835.
