Towards General-Purpose Data Discovery: A Programming Languages Approach
Efficient and effective data discovery is critical for many modern applications in machine learning and data science. One major bottleneck in developing a general-purpose data discovery tool is the absence of an expressive formal language, and a corresponding implementation, for characterizing and solving generic discovery queries. To this end, we present TQL, a domain-specific language for data discovery designed to leverage the results of programming languages research in both its syntax and semantics. In this paper, we formally characterize the core language through an algebraic model, Imperative Relational Algebra with Types (ImpRAT), and implement a modular proof-of-concept system prototype.
Andrew Kang, Yashnil Saha, Sainyam Galhotra
Computing Technology, Computer Technology
Andrew Kang, Yashnil Saha, Sainyam Galhotra. Towards General-Purpose Data Discovery: A Programming Languages Approach [EB/OL]. (2025-08-11) [2025-08-24]. https://arxiv.org/abs/2508.08074.