EASY: Emotion-aware Speaker Anonymization via Factorized Distillation

Abstract

Emotion plays a significant role in speech interaction, conveyed through tone, pitch, and rhythm, enabling the expression of feelings and intentions beyond words to create a more personalized experience. However, most existing speaker anonymization systems employ parallel disentanglement methods, which only separate speech into linguistic content and speaker identity, often neglecting the preservation of the original emotional state. In this study, we introduce EASY, an emotion-aware speaker anonymization framework. EASY employs a novel sequential disentanglement process to disentangle speaker identity, linguistic content, and emotional representation, modeling each speech attribute in distinct subspaces through a factorized distillation approach. By independently constraining speaker identity and emotional representation, EASY minimizes information leakage, enhancing privacy protection while preserving original linguistic content and emotional state. Experimental results on the VoicePrivacy Challenge official datasets demonstrate that our proposed approach outperforms all baseline systems, effectively protecting speaker privacy while maintaining linguistic content and emotional state.

Authors: Jixun Yao, Hexin Liu, Eng Siong Chng, Lei Xie

Subject classification: computing technology, computer technology

Recommended citation: Jixun Yao, Hexin Liu, Eng Siong Chng, Lei Xie. EASY: Emotion-aware Speaker Anonymization via Factorized Distillation [EB/OL]. (2025-05-20) [2025-06-24]. https://arxiv.org/abs/2505.15004.
