Investigating the Effect of Parallel Data in the Cross-Lingual Transfer for Vision-Language Encoders

Abstract

Most pre-trained Vision-Language (VL) models and the training data for their downstream tasks are available only in English. Multilingual VL tasks are therefore solved using cross-lingual transfer: either fine-tuning a multilingual pre-trained model or transferring the text encoder using parallel data. We study the alternative approach: transferring an already trained encoder using parallel data. We investigate the effect of two properties of the parallel data, its domain and the number of languages, which previous work left out of focus. Our results show that although machine-translated task data are the best on average, caption-like authentic parallel data outperform them in some languages. Further, we show that most languages benefit from multilingual training.

Authors: Andrei-Alexandru Manea, Jindřich Libovický

Subject classification: Linguistics; Commonly Used Foreign Languages

Recommended citation: Andrei-Alexandru Manea, Jindřich Libovický. Investigating the Effect of Parallel Data in the Cross-Lingual Transfer for Vision-Language Encoders [EB/OL]. (2025-04-30) [2025-05-28]. https://arxiv.org/abs/2504.21681.
