|国家预印本平台
| 注册
首页|LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions

LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions

LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions

来源:Arxiv_logoArxiv
英文摘要

Phone call transcript labeling is prohibitively expensive (approximately 2 USD per minute) due to privacy regulations, consent requirements, and manual annotation costs requiring 3 hours of expert time per hour of audio. Existing extraction methods fail on conversational speech containing disfluencies, interruptions, and speaker overlap. We introduce LingVarBench, a synthetic data generation pipeline that addresses these constraints through automated validation. First, we prompt an LLM to generate realistic structured field values across multiple use cases. Second, we recursively prompt the model to transform these values into thousands of natural conversational utterances containing typical phone call characteristics. Third, we validate each synthetic utterance by testing whether a separate LLM-based extractor can recover the original structured information. We employ DSPy's SIMBA optimizer to automatically synthesize extraction prompts from validated synthetic transcripts, eliminating manual prompt engineering. Our optimized prompts achieve up to 95 percent accuracy for numeric fields (vs. 88-89 percent zero-shot), 90 percent for names (vs. 47-79 percent), and over 80 percent for dates (vs. 72-77 percent) on real customer transcripts, demonstrating substantial gains over zero-shot prompting. The synthetic-to-real transfer demonstrates that conversational patterns learned from generated data generalize effectively to authentic phone calls containing background noise and domain-specific terminology. LingVarBench provides the first systematic benchmark for structured extraction from synthetic conversational data, demonstrating that automated prompt optimization overcomes cost and privacy barriers preventing large-scale phone call analysis in commercial settings.

Seyedali Mohammadi、Manas Paldhe、Amit Chhabra

自动化技术、自动化技术设备计算技术、计算机技术

Seyedali Mohammadi,Manas Paldhe,Amit Chhabra.LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions[EB/OL].(2025-08-13)[2025-09-05].https://arxiv.org/abs/2508.15801.点此复制

评论