UFT: Unifying Supervised and Reinforcement Fine-Tuning
Asuman Ozdaglar, Mingyang Liu, Gabriele Farina
Cite this article
Asuman Ozdaglar, Mingyang Liu, Gabriele Farina. UFT: Unifying Supervised and Reinforcement Fine-Tuning [EB/OL]. (2025-10-19) [2025-12-13]. https://arxiv.org/abs/2505.16984.
Subject classification
Computing technology, computer technology