Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning Steps

Source: arXiv

Abstract

Long-context language models (LCLMs), characterized by their extensive context windows, are becoming popular. However, although they are nearly perfect at standard long-context retrieval tasks, our evaluations show that they fail in some basic cases. We further find that these failures can be effectively addressed with a sufficient number of reasoning steps, elicited by specific CoT prompts. This result highlights the potential necessity of solving certain long-context tasks with long-CoT methods, whereas previous long-context benchmarks have ignored the need for long reasoning in long-context tasks and treated them as direct QA tasks.
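To make the contrast concrete, below is a minimal sketch of the two prompting styles the abstract compares: a direct-QA prompt versus a CoT prompt that elicits intermediate reasoning steps before the answer. The prompt wording, question, and variable names here are illustrative assumptions, not the prompts used in the paper.

# Hypothetical illustration of direct-QA vs. CoT-guided prompting for a
# long-context retrieval task. All prompt text below is assumed for
# demonstration and does not reproduce the paper's actual prompts.

context = "..."  # a long document with a target fact buried somewhere inside
question = "What is the secret number mentioned in the document?"

# Direct-QA style: the model is expected to retrieve the answer in one step,
# the setup the paper says previous long-context benchmarks assume.
direct_prompt = f"{context}\n\nQuestion: {question}\nAnswer:"

# CoT style: the prompt explicitly asks for step-by-step reasoning before the
# final answer, giving the model the reasoning steps the paper argues are
# needed even for basic retrieval cases.
cot_prompt = (
    f"{context}\n\nQuestion: {question}\n"
    "First, scan the document section by section and note any candidate "
    "passages. Then reason step by step about which passage answers the "
    "question, and only after that state the final answer."
)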

Yijiong Yu, Yongfeng Huang, Zhixiao Qi, Wei Wang, Weifeng Liu, Ran Chen, Ji Pei

Linguistics

Yijiong Yu, Yongfeng Huang, Zhixiao Qi, Wei Wang, Weifeng Liu, Ran Chen, Ji Pei. Long-context Language Models Fail in Basic Retrieval Tasks Without Sufficient Reasoning Steps [EB/OL]. (2025-08-26) [2025-09-06]. https://arxiv.org/abs/2410.04422.
