TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

Source: arXiv
Abstract

High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present TAG-INSTRUCT, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, TAG-INSTRUCT compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that TAG-INSTRUCT outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.

He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen

Subjects: Linguistics; Computing Technology and Computer Technology

He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen. TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation [EB/OL]. (2025-05-24) [2025-07-09]. https://arxiv.org/abs/2505.18557.
