National Preprint Platform

Shapley Homology: Topological Analysis of Sample Influence for Neural Networks


Source: arXiv

Abstract

Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has demonstrated that this assumption can be problematic, as it oversimplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Our empirical studies show that, when using 0-dimensional homology on neighborhood graphs, samples with higher influence scores have more impact on the accuracy of neural networks in determining graph connectivity, and that regular grammars with higher entropy values are more difficult to learn.
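The core quantity the abstract describes, the Shapley value of a sample with respect to the 0-dimensional homology of a graph, can be illustrated concretely. The sketch below is not the paper's implementation; it assumes, for illustration only, a characteristic function v(S) equal to the 0-th Betti number (number of connected components) of the subgraph induced by a vertex subset S, computes each vertex's exact Shapley value by enumerating subsets, and derives the entropy of the normalized influence scores.

```python
from itertools import combinations
from math import comb, log2

def betti0(points, edges):
    """0-dimensional Betti number (number of connected components)
    of the graph induced on `points` by `edges`, via union-find."""
    parent = {p: p for p in points}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    comps = len(parent)
    for u, v in edges:
        if u in parent and v in parent:  # keep only induced edges
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                comps -= 1
    return comps

def shapley_influence(vertices, edges):
    """Exact Shapley value of each vertex for the characteristic
    function v(S) = betti0 of the subgraph induced by S.
    Exponential in len(vertices); fine for small examples."""
    n = len(vertices)
    phi = {v: 0.0 for v in vertices}
    for v in vertices:
        others = [u for u in vertices if u != v]
        for k in range(n):
            # Shapley weight for coalitions of size k
            w = 1.0 / (n * comb(n - 1, k))
            for S in combinations(others, k):
                gain = betti0(set(S) | {v}, edges) - betti0(set(S), edges)
                phi[v] += w * gain
    return phi

def influence_entropy(phi):
    """Entropy of the influence scores viewed as a probability measure."""
    total = sum(phi.values())
    ps = [x / total for x in phi.values() if x > 0]
    return -sum(p * log2(p) for p in ps)
```

On a three-vertex path a-b-c, the endpoints receive influence 0.5 each and the middle vertex 0, matching the intuition that boundary samples matter more for connectivity; the scores sum to v(N) = 1 by the Shapley efficiency property.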

Kaixuan Zhang, C. Lee Giles, Qinglong Wang, Xue Liu

Subjects: Mathematics and Information Science; Information Technology and Computing Technology; Computer Technology

Kaixuan Zhang, C. Lee Giles, Qinglong Wang, Xue Liu. Shapley Homology: Topological Analysis of Sample Influence for Neural Networks [EB/OL]. (2019-10-14) [2025-08-21]. https://arxiv.org/abs/1910.06509.
