InfiniteAudio: Infinite-Length Audio Generation with Consistency

Abstract

This paper presents InfiniteAudio, a simple yet effective strategy for generating infinite-length audio using diffusion-based text-to-audio methods. Current approaches face memory constraints because the output size increases with input length, making long duration generation challenging. A common workaround is to concatenate short audio segments, but this often leads to inconsistencies due to the lack of shared temporal context. To address this, InfiniteAudio integrates seamlessly into existing pipelines without additional training. It introduces two key techniques: FIFO sampling, a first-in, first-out inference strategy with fixed-size inputs, and curved denoising, which selectively prioritizes key diffusion steps for efficiency. Experiments show that InfiniteAudio achieves comparable or superior performance across all metrics. Audio samples are available on our project page.
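The FIFO idea in the abstract can be illustrated with a toy queue. This is a minimal sketch only, not the authors' implementation: the abstract does not give the algorithm's details, so the queue layout (one slot per noise level), the function names `toy_denoise_step` and `fifo_generate`, and the scalar "latents" are all assumptions made for illustration. It shows one plausible reading of "first-in, first-out inference with fixed-size inputs": every iteration denoises a fixed-size queue by one step, emits the fully denoised head, and appends fresh noise at the tail, so memory stays constant however many frames are produced.

```python
import random
from collections import deque


def toy_denoise_step(latent, level):
    """Hypothetical stand-in for one diffusion denoising step.

    A real system would call a text-to-audio diffusion model here;
    this toy version just shrinks the value toward zero.
    """
    return latent * 0.5


def fifo_generate(num_frames, queue_size, seed=0):
    """Emit num_frames outputs while keeping only queue_size latents in memory."""
    rng = random.Random(seed)
    # Assumed layout: the head holds the least noisy latent (level 1) and the
    # tail the noisiest (level queue_size). Initialising the queue with plain
    # random values is a simplification of whatever warm-up a real method uses.
    queue = deque((rng.gauss(0.0, 1.0), level) for level in range(1, queue_size + 1))
    outputs = []
    while len(outputs) < num_frames:
        # Denoise every slot by one step; the input size never changes.
        queue = deque((toy_denoise_step(x, lvl), lvl - 1) for x, lvl in queue)
        # First in, first out: the head has reached level 0, so emit it.
        latent, level = queue.popleft()
        assert level == 0
        outputs.append(latent)
        # Refill the tail with fresh noise at the highest noise level.
        queue.append((rng.gauss(0.0, 1.0), queue_size))
        assert len(queue) == queue_size
    return outputs


frames = fifo_generate(num_frames=10, queue_size=4)
print(len(frames))
```

Because each iteration touches exactly `queue_size` latents, the per-step cost is independent of `num_frames`, which is the property the abstract attributes to fixed-size inputs. Curved denoising is not sketched here because the abstract does not describe its schedule.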

Authors: Chaeyoung Jung, Hojoon Ki, Ji-Hoon Kim, Junmo Kim, Joon Son Chung

Subject classification: Communications; Wireless Communications

Recommended citation: Chaeyoung Jung, Hojoon Ki, Ji-Hoon Kim, Junmo Kim, Joon Son Chung. InfiniteAudio: Infinite-Length Audio Generation with Consistency [EB/OL]. (2025-06-03) [2025-07-09]. https://arxiv.org/abs/2506.03020.
