Data Can Speak for Itself: Quality-guided Utilization of Wireless Synthetic Data

Source: arXiv
Abstract

Generative models have gained significant attention for their ability to produce realistic synthetic data that supplements the quantity of real-world datasets. While recent studies show performance improvements in wireless sensing tasks by incorporating all synthetic data into training sets, the quality of synthetic data remains unpredictable and the resulting performance gains are not guaranteed. To address this gap, we propose tractable and generalizable metrics to quantify two quality attributes of synthetic data: affinity and diversity. Our assessment reveals prevalent affinity limitations in current wireless synthetic data, leading to mislabeled data and degraded task performance. We attribute these quality limitations to generative models' lack of awareness of untrained conditions and domain-specific processing. To mitigate these issues, we introduce SynCheck, a quality-guided synthetic data utilization scheme that refines synthetic data quality during task model training. Our evaluation demonstrates that SynCheck consistently outperforms quality-oblivious utilization of synthetic data, and achieves a 4.3% performance improvement even when the previous utilization degrades performance by 13.4%.

Chen Gong, Bo Liang, Wei Gao, Chenren Xu

Wireless communication; radio equipment and telecommunications equipment

Chen Gong, Bo Liang, Wei Gao, Chenren Xu. Data Can Speak for Itself: Quality-guided Utilization of Wireless Synthetic Data [EB/OL]. (2025-06-29) [2025-07-21]. https://arxiv.org/abs/2506.23174.