
Voicing Personas: Rewriting Persona Descriptions into Style Prompts for Controllable Text-to-Speech

Source: arXiv
Abstract

In this paper, we propose a novel framework to control voice style in prompt-based, controllable text-to-speech systems by leveraging textual personas as voice style prompts. We present two persona rewriting strategies to transform generic persona descriptions into speech-oriented prompts, enabling fine-grained manipulation of prosodic attributes such as pitch, emotion, and speaking rate. Experimental results demonstrate that our methods enhance the naturalness, clarity, and consistency of synthesized speech. Finally, we analyze implicit social biases introduced by LLM-based rewriting, with a focus on gender. We underscore voice style as a crucial factor for persona-driven AI dialogue systems.

Yejin Lee, Jaehoon Kang, Kyuhong Shim

Computing technology, computer technology

Yejin Lee, Jaehoon Kang, Kyuhong Shim. Voicing Personas: Rewriting Persona Descriptions into Style Prompts for Controllable Text-to-Speech [EB/OL]. (2025-05-20) [2025-06-30]. https://arxiv.org/abs/2505.17093.
