ViMMRC 2.0 -- Enhancing Machine Reading Comprehension on Vietnamese Literature Text

Source: arXiv
Abstract

Machine reading comprehension, which aims to extract useful information from texts, has been an interesting and challenging task in recent years. To enable computers to understand a reading text and answer questions about it, we introduce ViMMRC 2.0, an extension of the previous ViMMRC for multiple-choice reading comprehension on Vietnamese textbooks, which contain reading articles for students from Grade 1 to Grade 12. The dataset has 699 reading passages, comprising prose and poems, and 5,273 questions. Unlike the previous version, the questions in the new dataset are not fixed at four options. Moreover, the questions are more difficult, which challenges models to find the correct choice. The computer must understand the whole context of the reading passage, the question, and the content of each choice to select the right answer. Hence, we propose a multi-stage approach that combines the multi-step attention network (MAN) with the natural language inference (NLI) task to enhance the performance of the reading comprehension model. We then compare the proposed methodology with baseline BERTology models on the new dataset and on ViMMRC 1.0. From the error analysis, we found that the main challenge for reading comprehension models is understanding the implicit context in texts and linking the pieces together to find the correct answers. Finally, we hope our new dataset will motivate further research to enhance the ability of computers to understand the Vietnamese language.

Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

10.1142/S2717554525500043

Commonly used foreign languages; computing technology, computer technology

Son T. Luu, Khoi Trong Hoang, Tuong Quang Pham, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen. ViMMRC 2.0 -- Enhancing Machine Reading Comprehension on Vietnamese Literature Text [EB/OL]. (2025-07-18) [2025-08-04]. https://arxiv.org/abs/2303.18162.