Synthetic Tabular Data: Methods, Attacks and Defenses

Abstract

Synthetic data is often positioned as a solution to replace sensitive fixed-size datasets with a source of unlimited matching data, freed from privacy concerns. There has been much progress in synthetic data generation over the last decade, leveraging corresponding advances in machine learning and data analytics. In this survey, we cover the key developments and the main concepts in tabular synthetic data generation, including paradigms based on probabilistic graphical models and on deep learning. We provide background and motivation, before giving a technical deep-dive into the methodologies. We also address the limitations of synthetic data, by studying attacks that seek to retrieve information about the original sensitive data. Finally, we present extensions and open problems in this area.

Authors: Graham Cormode, Samuel Maddock, Enayat Ullah, Shripad Gade

DOI: 10.1145/3711896.3736562

Subject classification: Computing technology, computer technology

Recommended citation: Graham Cormode, Samuel Maddock, Enayat Ullah, Shripad Gade. Synthetic Tabular Data: Methods, Attacks and Defenses [EB/OL]. (2025-06-06) [2025-07-02]. https://arxiv.org/abs/2506.06108.
