Fairness is in the details: Face Dataset Auditing
Auditing involves verifying the proper implementation of a given policy. As such, auditing is essential for ensuring compliance with the principles of fairness, equity, and transparency mandated by the European Union's AI Act. Moreover, biases present in the training data of a learning system can persist through the modeling process and result in discrimination against certain subgroups of individuals when the model is deployed in production. Assessing bias in image datasets is a particularly complex task, as it first requires a feature extraction step, and the quality of this extraction must then be accounted for in the statistical tests. This paper proposes a robust methodology for auditing image datasets based on so-called "sensitive" features, such as gender, age, and ethnicity. The proposed methodology comprises a feature extraction phase and a statistical analysis phase. The first phase introduces a novel convolutional neural network (CNN) architecture specifically designed for extracting sensitive features with a limited number of manual annotations. The second phase compares the distributions of sensitive features across subgroups using a novel statistical test that accounts for the imprecision of the feature extraction model. Our pipeline constitutes a comprehensive and fully automated methodology for dataset auditing. We illustrate our approach using two manually annotated datasets.
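To make the two-phase structure of such an audit concrete, the following is a minimal sketch in Python. It is not the paper's method: the extraction phase is stubbed with simulated attribute predictions (standing in for the CNN's outputs on two dataset subgroups), and the comparison uses a plain chi-square two-sample test, which, unlike the test proposed in the paper, does not correct for the extractor's imprecision.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# --- Phase 1 (stub): predicted sensitive labels from an attribute CNN. ---
# Simulated binary attribute predictions (e.g., gender) for two subgroups
# of a face dataset; in a real audit these would come from the trained
# feature extraction model.
subgroup_a = rng.choice([0, 1], size=500, p=[0.52, 0.48])
subgroup_b = rng.choice([0, 1], size=500, p=[0.70, 0.30])

# --- Phase 2: compare attribute distributions across subgroups. ---
# Naive chi-square test on the 2x2 contingency table of predicted labels.
# The paper's statistical test additionally accounts for the imprecision
# of the extraction model, which this simple version ignores.
table = np.array([
    [np.sum(subgroup_a == 0), np.sum(subgroup_a == 1)],
    [np.sum(subgroup_b == 0), np.sum(subgroup_b == 1)],
])
chi2, p_value, dof, _ = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```

A small p-value here would flag a distributional imbalance in the predicted attribute between the two subgroups; a test that also models the extractor's error rates would be needed before drawing audit conclusions.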
V. Lafargue, E. Claeys, J. M. Loubes
Computing technology; computer technology
V. Lafargue, E. Claeys, J. M. Loubes. Fairness is in the details: Face Dataset Auditing [EB/OL]. (2025-04-11) [2025-07-09]. https://arxiv.org/abs/2504.08396.