TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure
Hierarchical planning is a powerful approach to modeling long sequences structurally. Beyond hierarchies in the temporal structure of music, this paper explores an even more important aspect: concept hierarchy, which involves generating music ideas, transforming them, and ultimately organizing them, across musical time and space, into a complete composition. To this end, we introduce TOMI (Transforming and Organizing Music Ideas) as a novel approach to deep music generation and develop a TOMI-based model using an instruction-tuned foundation LLM. Formally, we represent the multi-track composition process as a sparse, four-dimensional space characterized by clips (short audio or MIDI segments), sections (temporal positions), tracks (instrument layers), and transformations (elaboration methods). Our model can generate multi-track electronic music with full-song structure, and we further integrate it with the REAPER digital audio workstation, enabling interactive human-AI co-creation. Experimental results demonstrate that our approach produces higher-quality electronic music with stronger structural coherence than the baselines.
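To make the four-dimensional representation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a composition can be stored as a sparse set of (clip, section, track, transformation) tuples, so that only the combinations actually used are kept. All names here (Placement, Composition, the clip and transformation labels such as "drum_loop" and "loop") are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Placement:
    """One occupied point in the (clip, section, track, transformation) space."""
    clip: str            # short audio or MIDI segment (hypothetical label)
    section: str         # temporal position, e.g. "intro" or "chorus"
    track: str           # instrument layer
    transformation: str  # elaboration method (hypothetical label)


@dataclass
class Composition:
    """Sparse storage: only the placements that are actually used are kept."""
    placements: set = field(default_factory=set)

    def add(self, clip, section, track, transformation):
        self.placements.add(Placement(clip, section, track, transformation))

    def in_section(self, section):
        """Return all placements sounding in the given section, in a stable order."""
        hits = (p for p in self.placements if p.section == section)
        return sorted(hits, key=lambda p: (p.track, p.clip))

    def density(self, clips, sections, tracks, transformations):
        """Fraction of the full 4-D grid that is occupied."""
        total = len(clips) * len(sections) * len(tracks) * len(transformations)
        return len(self.placements) / total if total else 0.0


# A toy song: the same drum clip is reused in two sections.
song = Composition()
song.add("drum_loop", "intro", "drums", "loop")
song.add("drum_loop", "chorus", "drums", "loop")
song.add("bass_riff", "chorus", "bass", "loop")

chorus = song.in_section("chorus")
occupancy = song.density(
    clips=["drum_loop", "bass_riff"],
    sections=["intro", "chorus"],
    tracks=["drums", "bass"],
    transformations=["loop"],
)
```

In this toy example only three of the eight possible grid points are occupied, which illustrates why a sparse representation is a natural fit: most clip, section, track, and transformation combinations never occur in a given song.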
Qi He, Gus Xia, Ziyu Wang
Science, Scientific Research
Qi He, Gus Xia, Ziyu Wang. TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure [EB/OL]. (2025-06-29) [2025-07-16]. https://arxiv.org/abs/2506.23094.