Training People to Reward Robots
Learning from demonstration (LfD) is a technique that allows expert teachers to teach task-oriented skills to robotic systems. However, how best to guide novice teachers towards expert-level demonstrations, measured quantitatively on specific teaching tasks, remains an open question. To this end, this paper investigates the use of machine teaching (MT) to guide novice teachers to improve their teaching skills based on reinforcement learning from demonstration (RLfD). The paper reports an experiment in which novices receive MT-derived guidance to train their ability to teach a given motor skill with only 8 demonstrations and to generalise this ability to previously unseen skills. Results indicate that the MT-guidance not only enhances robot learning performance by 89% on the training skill but also yields a 70% improvement in robot learning performance on skills not seen by subjects during training. These findings highlight the effectiveness of MT-guidance in upskilling human teaching behaviours, ultimately improving demonstration quality in RLfD.
Yuqing Zhu, Matthew Howard, Endong Sun
Automation technology and automation equipment; computing technology and computer technology; education
Yuqing Zhu, Matthew Howard, Endong Sun. Training People to Reward Robots [EB/OL]. (2025-05-15) [2025-06-15]. https://arxiv.org/abs/2505.10151.