Toward Understanding Bugs in Vector Database Management Systems

Source: arXiv
Abstract

Vector database management systems (VDBMSs) play a crucial role in facilitating semantic similarity searches over high-dimensional embeddings from diverse data sources. While VDBMSs are widely used in applications such as recommendation, retrieval-augmented generation (RAG), and multimodal search, their reliability remains underexplored. Traditional database reliability models cannot be directly applied to VDBMSs because of fundamental differences in data representation, query mechanisms, and system architecture. To address this gap, we present the first large-scale empirical study of software defects in VDBMSs. We manually analyzed 1,671 bug-fix pull requests from 15 widely used open-source VDBMSs and developed a comprehensive taxonomy of bugs based on symptoms, root causes, and developer fix strategies. Our study identifies five categories of bug symptoms, with more than half manifesting as functional failures. We further reveal 31 recurring fault patterns and highlight failure modes unique to vector search systems. In addition, we summarize 12 common fix strategies, whose distribution underscores the critical importance of correct program logic. These findings provide actionable insights into VDBMS reliability challenges and offer guidance for building more robust future systems.

Yinglin Xie, Xinyi Hou, Yanjie Zhao, Shenao Wang, Kai Chen, Haoyu Wang

Computing technology, computer technology

Yinglin Xie, Xinyi Hou, Yanjie Zhao, Shenao Wang, Kai Chen, Haoyu Wang. Toward Understanding Bugs in Vector Database Management Systems [EB/OL]. (2025-06-03) [2025-07-16]. https://arxiv.org/abs/2506.02617.
