Public speech recognition transcripts as a configuring parameter
Displaying a written transcript of what a human said (i.e., producing an "automatic speech recognition transcript") is a common feature of smartphone voice assistants: the utterance produced by a human speaker (e.g., a question) is displayed on the screen while the voice assistant responds to it verbally. Although rarely, this feature also exists on some "social" robots, which transcribe human interactants' speech on a screen or tablet. We argue that this informational configuration has pragmatic consequences for the interaction, both for human participants and for the embodied conversational agent. Based on a corpus of co-present interactions with a humanoid robot, we attempt to show that this transcript is a contextual feature that can heavily impact the actions humans ascribe to the robot: that is, the way in which humans respond to the robot's behavior as constituting a specific type of action (rather than another) and as constituting an adequate response to their own previous turn.
Damien Rudaz, Christian Licoppe
Computing technology, computer technology
Damien Rudaz, Christian Licoppe. Public speech recognition transcripts as a configuring parameter [EB/OL]. (2025-04-06) [2025-06-10]. https://arxiv.org/abs/2504.04488.