A Comprehensive Study of Shapley Value in Data Analytics
In recent years, the Shapley value (SV), a solution concept from cooperative game theory, has found numerous applications in data analytics (DA). This paper presents the first comprehensive study of SV used throughout the DA workflow, clarifying the key variables in defining DA-applicable SV and the essential functionalities that SV can provide for data scientists. We distill four primary challenges of using SV in DA, namely computation efficiency, approximation error, privacy preservation, and interpretability; disentangle the resolution techniques proposed in existing work in this field; and then analyze and discuss these techniques with respect to each challenge and the potential conflicts between challenges. We also implement SVBench, a modular and extensible open-source framework for developing SV applications in different DA tasks, and conduct extensive evaluations to validate our analyses and discussions. Based on the qualitative and quantitative results, we identify the limitations of current efforts to apply SV to DA and highlight directions for future research and engineering.
Zhongle Xie, Ke Chen, Meihui Zhang, Lidan Shou, Gang Chen, Shixin Wan, Hong Lin
Computing Technology, Computer Technology
Zhongle Xie, Ke Chen, Meihui Zhang, Lidan Shou, Gang Chen, Shixin Wan, Hong Lin. A Comprehensive Study of Shapley Value in Data Analytics [EB/OL]. (2025-07-08) [2025-08-02]. https://arxiv.org/abs/2412.01460