QAEncoder: Towards Aligned Representation Learning in Question Answering Systems
Modern QA systems rely on retrieval-augmented generation (RAG) for accurate and trustworthy responses. However, the inherent gap between user queries and relevant documents hinders precise matching. We introduce QAEncoder, a training-free approach to bridge this gap. Specifically, QAEncoder estimates the expectation of potential queries in the embedding space as a robust surrogate for the document embedding, and attaches document fingerprints to effectively distinguish these embeddings. Extensive experiments across diverse datasets, languages, and embedding models confirm QAEncoder's alignment capability. It offers a simple yet effective solution with zero additional index storage, retrieval latency, or training costs, and without catastrophic forgetting or hallucination issues. The repository is publicly available at https://github.com/IAAR-Shanghai/QAEncoder.
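The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real embedding model, the queries are hand-written rather than generated, and the "fingerprint" is assumed here to be a simple convex combination of the mean query embedding and the document's own embedding, controlled by a hypothetical weight `alpha`.

```python
import zlib
import numpy as np


def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def qa_surrogate(doc: str, queries: list[str], alpha: float = 0.5) -> np.ndarray:
    """Return a surrogate document embedding (illustrative assumption, not the paper's code).

    The mean of the query embeddings is a Monte Carlo estimate of the expected
    query embedding; mixing in the document embedding acts as an assumed
    "fingerprint" that keeps different documents distinguishable.
    """
    if not queries:
        raise ValueError("at least one potential query is required")
    doc_vec = toy_embed(doc)
    mean_query = np.mean([toy_embed(q) for q in queries], axis=0)
    mixed = alpha * mean_query + (1.0 - alpha) * doc_vec
    norm = np.linalg.norm(mixed)
    return mixed / norm if norm > 0 else mixed


doc = "The Eiffel Tower was completed in 1889 in Paris."
queries = [
    "When was the Eiffel Tower completed?",
    "Where is the Eiffel Tower located?",
]
surrogate = qa_surrogate(doc, queries, alpha=0.5)
print(surrogate.shape)
```

Because the surrogate has the same dimensionality as the original document embedding and replaces it in the index, a scheme like this adds no index storage and no retrieval-time work, which is consistent with the zero-overhead claim above.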
Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang
Computing technology, computer technology
Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang. QAEncoder: Towards Aligned Representation Learning in Question Answering Systems [EB/OL]. (2025-07-02) [2025-07-16]. https://arxiv.org/abs/2409.20434.