
Interactive Text-to-SQL via Expected Information Gain for Disambiguation

Source: arXiv
Abstract

Relational databases are foundational to numerous domains, including business intelligence, scientific research, and enterprise systems. However, accessing and analyzing structured data often requires proficiency in SQL, a skill that many end users lack. Building on advances in Natural Language Processing (NLP), Text-to-SQL systems attempt to bridge this gap by automatically translating natural language questions into executable SQL queries. Yet when operating on complex real-world databases, these systems often struggle with the ambiguity inherent in natural language queries. Such ambiguity poses a significant challenge for existing Text-to-SQL systems, which tend to commit early to a potentially incorrect interpretation. To address this, we propose an interactive Text-to-SQL framework that models SQL generation as a probabilistic reasoning process over multiple candidate queries. Rather than producing a single deterministic output, our system maintains a distribution over possible SQL outputs and seeks to resolve uncertainty through user interaction. At each interaction step, the system selects a branching decision and formulates a clarification question aimed at disambiguating that aspect of the query. Crucially, we adopt a principled decision criterion based on Expected Information Gain to identify the clarification that will, in expectation, most reduce the uncertainty in the SQL distribution.
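The abstract does not spell out how Expected Information Gain is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a small, already-normalized distribution over candidate SQL strings and assumes that each candidate determines the user's answer to a clarification question without noise. The names `entropy`, `expected_information_gain`, `pick_question`, the example queries, and the question keys are all hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy (in bits) of a normalized probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(candidates, answer_of):
    """Expected entropy reduction over candidates from asking one question.

    candidates: dict mapping a candidate SQL string to its probability.
    answer_of:  function mapping a candidate to the answer the user would
                give if that candidate were the intended query (assumed
                deterministic and noise-free in this sketch).
    """
    prior_entropy = entropy(candidates.values())

    # Group candidate probabilities by the answer they imply.
    groups = defaultdict(list)
    for sql, p in candidates.items():
        groups[answer_of(sql)].append(p)

    # Average the posterior entropy over answers, weighted by answer probability.
    expected_posterior = 0.0
    for probs in groups.values():
        mass = sum(probs)
        expected_posterior += mass * entropy([p / mass for p in probs])

    return prior_entropy - expected_posterior


def pick_question(candidates, questions):
    """Return the key of the question with the highest expected information gain."""
    return max(
        questions,
        key=lambda q: expected_information_gain(candidates, questions[q]),
    )


# Hypothetical candidate distribution for "What were our sales numbers?"
candidates = {
    "SELECT SUM(revenue) FROM sales WHERE year = 2024": 0.4,
    "SELECT SUM(revenue) FROM sales": 0.3,
    "SELECT SUM(profit) FROM sales WHERE year = 2024": 0.2,
    "SELECT SUM(profit) FROM sales": 0.1,
}

# Hypothetical clarification questions, each expressed as an answer function.
questions = {
    "which_column": lambda sql: "revenue" in sql,
    "filter_by_year": lambda sql: "WHERE year" in sql,
}

best = pick_question(candidates, questions)
print(best)
```

Under the noise-free assumption, the gain of a question equals the entropy of its answer distribution, so the question that splits the probability mass most evenly is preferred. Here the year filter splits the mass 0.6 / 0.4, while the column choice splits it 0.7 / 0.3, so the sketch selects `filter_by_year`.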

Luyu Qiu, Jianing Li, Chi Su, Lei Chen

Computing technology, computer technology

Luyu Qiu, Jianing Li, Chi Su, Lei Chen. Interactive Text-to-SQL via Expected Information Gain for Disambiguation [EB/OL]. (2025-07-09) [2025-07-16]. https://arxiv.org/abs/2507.06467.
