
Are LLMs Good Text Diacritizers? An Arabic and Yorùbá Case Study

Source: arXiv
Abstract

We investigate the effectiveness of large language models (LLMs) for text diacritization in two typologically distinct languages: Arabic and Yorùbá. To enable a rigorous evaluation, we introduce a novel multilingual dataset, MultiDiac, with diverse samples that capture a range of diacritic ambiguities. We evaluate 14 LLMs varying in size, accessibility, and language coverage, and benchmark them against 6 specialized diacritization models. Additionally, we fine-tune four small open-source models using LoRA for Yorùbá. Our results show that many off-the-shelf LLMs outperform specialized diacritization models for both Arabic and Yorùbá, but smaller models suffer from hallucinations. Fine-tuning on a small dataset can help improve diacritization performance and reduce hallucination rates.
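
The LoRA fine-tuning mentioned in the abstract can be sketched with the Hugging Face PEFT library, framing diacritization as text-to-text generation (undiacritized input, fully diacritized output). The base model, LoRA hyperparameters, and prompt format below are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch: LoRA fine-tuning of a small causal LM for Yorùbá
# diacritization. All names and hyperparameters here are assumptions
# for illustration, not the paper's actual setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2.5-0.5B"  # hypothetical small open-source model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only the adapter
# weights are trained while the base model stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are a small fraction of total params

# One illustrative training step on a single (undiacritized -> diacritized) pair.
example = "Add diacritics: omo naa lo si oja\nAnswer: ọmọ náà lọ sí ọjà"
batch = tokenizer(example, return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])
outputs.loss.backward()
```

Because only the adapter weights are updated, such fine-tuning is feasible on the small dataset scale the abstract describes, which is also why it can curb hallucinations without degrading the base model's general behavior.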

Hawau Olamide Toyin, Samar M. Magdy, Hanan Aldarmaki

Subject: Hamito-Semitic (Afro-Asiatic) family; African languages

Hawau Olamide Toyin, Samar M. Magdy, Hanan Aldarmaki. Are LLMs Good Text Diacritizers? An Arabic and Yorùbá Case Study [EB/OL]. (2025-06-13) [2025-06-25]. https://arxiv.org/abs/2506.11602.
