AI Jiusi (AI九思): Rejuvenating the Beauty of Ancient Chinese with Large Language Model
[Purpose/Significance] With the rapid development of generative artificial intelligence (AIGC), large language models (LLMs) have shown strong language understanding and generation capabilities in general domains, but they still face many limitations in processing ancient Chinese. To address this challenge, Huazhong University of Science and Technology developed "AI Jiusi," a cognitive large language model for ancient Chinese, which aims to strengthen the specialized capabilities of LLMs in ancient Chinese knowledge question answering and in comprehension and application tasks. [Method/Process] This paper describes in detail the research and development background of "AI Jiusi," its dataset construction and model training process, and its performance in ancient Chinese linguistic knowledge and language ability. [Results/Conclusions] Feedback from internal testing indicates that "AI Jiusi" shows significant advantages in specialized question answering and in comprehension and application tasks for ancient Chinese, although some aspects still need improvement. In the future, the team plans to further improve the text cognition and multimodal application capabilities of "AI Jiusi," optimize the user interaction experience, advance ancient Chinese LLMs to a higher level, and help move ancient Chinese research into a digital and intelligent stage.
Social Sciences
AI Jiusi; Ancient Chinese; Digitalization and Intelligence; Large Language Model; Multimodal
AI Jiusi: Rejuvenating the Beauty of Ancient Chinese with Large Language Model [EB/OL]. (2025-01-23) [2025-02-05]. https://chinaxiv.org/abs/202501.00212.