Kuwain 1.5B: An Arabic SLM via Language Injection
Kuwain 1.5B: An Arabic SLM via Language Injection
Enhancing existing models with new knowledge is a crucial aspect of AI development. This paper introduces a novel method for integrating a new language into a large language model (LLM). Our approach successfully incorporates a previously unseen target language into an existing LLM without compromising its prior knowledge. We trained a tiny model with 1.5 billion parameters named Kuwain by injecting the Arabic language into a small open-source model mainly trained in English. Our method demonstrates significant improvements in Arabic language performance, with an average 8% improvement across various benchmarks, while retaining the model's existing knowledge with a minimum amount of the original model's data. This offers a cost-effective alternative to training a comprehensive model in both English and Arabic. The results highlight the potential for efficient, targeted language model expansion without extensive retraining or resource-intensive processes.
Khalil Hennara、Sara Chrouf、Mohamed Motaism Hamed、Zeina Aldallal、Omar Hadid、Safwan AlModhayan
语言学闪-含语系(阿非罗-亚细亚语系)
Khalil Hennara,Sara Chrouf,Mohamed Motaism Hamed,Zeina Aldallal,Omar Hadid,Safwan AlModhayan.Kuwain 1.5B: An Arabic SLM via Language Injection[EB/OL].(2025-04-21)[2025-05-02].https://arxiv.org/abs/2504.15120.点此复制
评论