Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Source: arXiv
Abstract

Perceived voice likability plays a crucial role in various social interactions, such as partner selection and advertising. A system that provides reference likable voice samples tailored to target audiences would enable users to adjust their speaking style and voice quality, facilitating smoother communication. To this end, we propose a voice conversion method that controls the likability of input speech while preserving both speaker identity and linguistic content. To improve training data scalability, we train a likability predictor on an existing voice likability dataset and employ it to automatically annotate a large speech synthesis corpus with likability ratings. Experimental evaluations reveal a significant correlation between the predictor's outputs and human-provided likability ratings. Subjective and objective evaluations further demonstrate that the proposed approach effectively controls voice likability while preserving both speaker identity and linguistic content.

Hitoshi Suda, Shinnosuke Takamichi, Satoru Fukayama

Subjects: automation technology and equipment; computing and computer technology; communications

Hitoshi Suda, Shinnosuke Takamichi, Satoru Fukayama. Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora [EB/OL]. (2025-07-02) [2025-07-16]. https://arxiv.org/abs/2507.01356.
