Unifying Global and Near-Context Biasing in a Single Trie Pass

Source: arXiv

Abstract

Despite the success of end-to-end automatic speech recognition (ASR) models, challenges persist in recognizing rare, out-of-vocabulary words, including named entities (NEs), and in adapting to new domains using only text data. This work presents a practical approach to these challenges through a previously unexplored combination of an NE bias list and a word-level n-gram language model (LM). The solution balances simplicity and effectiveness, improving entity recognition while maintaining or even enhancing overall ASR performance. We efficiently integrate this enriched biasing method into a transducer-based ASR system, enabling context adaptation with almost no computational overhead. We present results on three datasets spanning four languages and compare them to state-of-the-art biasing strategies. We demonstrate that the proposed combination of keyword biasing and n-gram LM improves entity recognition by up to 32% relative and reduces overall WER by up to 12% relative.
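The abstract describes enriching a named-entity bias list with word-level n-gram LM scores and applying both during decoding with almost no overhead. The Python sketch below illustrates the general idea of trie-based keyword biasing combined with an n-gram LM under shallow fusion; it is a toy illustration under assumed scoring rules, not the paper's implementation, and all class names, weights, and the bonus/cancellation scheme are hypothetical.

# Toy sketch of trie-based keyword biasing with shallow fusion
# (hypothetical names and weights; not the paper's implementation).

class TrieNode:
    def __init__(self, depth=0):
        self.children = {}   # word -> TrieNode
        self.depth = depth   # number of words consumed to reach this node
        self.is_end = False  # True if a full biasing phrase ends here


class BiasingTrie:
    """Word-level trie over biasing phrases. A hypothesis earns a per-word
    bonus while it extends a phrase prefix; the accumulated bonus is revoked
    if the hypothesis falls off the trie before completing a phrase."""

    def __init__(self, phrases, bonus=2.0):
        self.root = TrieNode()
        self.bonus = bonus
        for phrase in phrases:
            node = self.root
            for word in phrase:
                if word not in node.children:
                    node.children[word] = TrieNode(node.depth + 1)
                node = node.children[word]
            node.is_end = True

    def advance(self, node, word):
        """Consume one word; return (next_node, biasing score delta)."""
        child = node.children.get(word)
        if child is not None:
            if child.is_end:                  # full phrase matched: keep all bonuses
                return self.root, self.bonus
            return child, self.bonus          # partial match: provisional bonus
        penalty = -self.bonus * node.depth    # match broke: revoke provisional bonuses
        restart = self.root.children.get(word)
        if restart is not None:               # the word may start a new phrase
            return (self.root if restart.is_end else restart), penalty + self.bonus
        return self.root, penalty


def fused_score(asr_logprob, lm_logprob, bias_delta, lm_weight=0.3, bias_weight=1.0):
    """Shallow fusion: interpolate ASR, word n-gram LM, and biasing scores."""
    return asr_logprob + lm_weight * lm_logprob + bias_weight * bias_delta


if __name__ == "__main__":
    trie = BiasingTrie([["juan", "zuluaga"], ["new", "york"]])
    node, total = trie.root, 0.0
    for w in ["new", "jersey", "juan", "zuluaga"]:
        node, delta = trie.advance(node, w)
        total += delta
        print(f"{w:>8}: delta={delta:+.1f}  running bias={total:+.1f}")

The cancellation step is the usual safeguard in on-the-fly biasing: it keeps partially matched phrases from permanently inflating a hypothesis score, so only completed keywords retain their reward.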

Srikanth Madikeri, Iuliia Thorbecke, Karthik Pandia, Sergio Burdisso, Kadri Hacioğlu, Andreas Stolcke, Andrés Carofilis, Juan Zuluaga-Gomez, Shashi Kumar, Pradeep Rangappa, Petr Motlicek, Esaú Villatoro-Tello

Computing technology; computer technology

Srikanth Madikeri, Iuliia Thorbecke, Karthik Pandia, Sergio Burdisso, Kadri Hacioğlu, Andreas Stolcke, Andrés Carofilis, Juan Zuluaga-Gomez, Shashi Kumar, Pradeep Rangappa, Petr Motlicek, Esaú Villatoro-Tello. Unifying Global and Near-Context Biasing in a Single Trie Pass [EB/OL]. (2025-07-02) [2025-07-16]. https://arxiv.org/abs/2409.13514.