SimLab: A Platform for Simulation-based Evaluation of Conversational Information Access Systems
Research on interactive and conversational information access systems, including search engines, recommender systems, and conversational assistants, has been hindered by the difficulty in evaluating such systems with reproducible experiments. User simulation provides a promising solution, but there is a lack of infrastructure and tooling to support this kind of evaluation. To facilitate simulation-based evaluation of conversational information access systems, we introduce SimLab, the first cloud-based platform to provide a centralized general solution for the community to benchmark both conversational systems and user simulators in a controlled and reproducible environment. We articulate requirements for such a platform and propose a general infrastructure to address these requirements. We then present the design and implementation of an initial version of SimLab and showcase its features with an initial evaluation task of conversational movie recommendation, which is made publicly available. Furthermore, we discuss the sustainability of the platform and its future opportunities. This paper is a call for the community to contribute to the platform to drive progress in the field of conversational information access and user simulation.
Nolwenn Bernard, Sharath Chandra Etagi Suresh, Krisztian Balog, ChengXiang Zhai
Computing technology, computer technology
Nolwenn Bernard, Sharath Chandra Etagi Suresh, Krisztian Balog, ChengXiang Zhai. SimLab: A Platform for Simulation-based Evaluation of Conversational Information Access Systems [EB/OL]. (2025-07-07) [2025-07-16]. https://arxiv.org/abs/2507.04888.