
Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning

Shuyao Xu, Cheng Peng, Jiangxuan Long, Weidi Xu, Wei Chu, Yuan Qi

Cite this article

Shuyao Xu, Cheng Peng, Jiangxuan Long, Weidi Xu, Wei Chu, Yuan Qi. Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning [EB/OL]. (2025-05-30) [2025-12-13]. https://arxiv.org/abs/2505.24850.

Subject classification

Computing technology, computer technology

First published: 2025-05-30