QAEncoder: Towards Aligned Representation Learning in Question Answering Systems

Source: arXiv
Abstract

Modern QA systems entail retrieval-augmented generation (RAG) for accurate and trustworthy responses. However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free approach to bridge this gap. Specifically, QAEncoder estimates the expectation of potential queries in the embedding space as a robust surrogate for the document embedding, and attaches document fingerprints to effectively distinguish these embeddings. Extensive experiments across diverse datasets, languages, and embedding models confirmed QAEncoder's alignment capability, which offers a simple-yet-effective solution with zero additional index storage, retrieval latency, training costs, or catastrophic forgetting and hallucination issues. The repository is publicly available at https://github.com/IAAR-Shanghai/QAEncoder.
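The core idea in the abstract (use the expected embedding of potential queries as a surrogate for the document embedding, then attach a document fingerprint so surrogates stay distinguishable) can be illustrated with a minimal sketch. This is not the authors' implementation; see the linked repository for that. It assumes the potential queries have already been generated and embedded by some external model, approximates the expectation with a simple mean, and, as one plausible reading of "fingerprint", blends the original document embedding back in. The function names and the weight `alpha` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors (a Monte Carlo estimate of the expectation)."""
    if not vectors:
        raise ValueError("need at least one query embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def l2_normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]


def surrogate_embedding(doc_emb, query_embs, alpha=0.5):
    """Blend the mean query embedding with the document's own embedding.

    ASSUMPTION: `alpha` is a hypothetical fingerprint weight, not a value from the paper.
    alpha = 0 keeps only the mean query embedding; alpha = 1 keeps only the document embedding.
    """
    mean_q = mean_vector(query_embs)
    blended = [(1 - alpha) * q + alpha * d for q, d in zip(mean_q, doc_emb)]
    return l2_normalize(blended)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Toy 2-D example: the document embedding points along x, while the
# embeddings of its potential queries point mostly along y.
doc = [1.0, 0.0]
potential_queries = [[0.1, 0.9], [0.2, 1.0], [0.0, 1.1]]
user_query = [0.1, 1.0]

surrogate = surrogate_embedding(doc, potential_queries, alpha=0.3)
print("raw document vs. user query:", cosine(doc, user_query))
print("surrogate vs. user query:   ", cosine(surrogate, user_query))
```

Because the surrogate has the same dimensionality as the original document embedding and replaces it in the index, a scheme like this adds no index storage or retrieval latency, which is consistent with the claims in the abstract.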

Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang

Computing technology, computer technology

Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang. QAEncoder: Towards Aligned Representation Learning in Question Answering Systems [EB/OL]. (2025-07-02) [2025-07-16]. https://arxiv.org/abs/2409.20434.
