
Reinforcement learning of state representation and value: the power of random feedback and biological constraints

Source: bioRxiv
Abstract

How external/internal 'state' is represented in the brain is crucial, since appropriate state representation enables goal-directed behavior. Recent studies suggest that state representation and state value can be simultaneously learnt through reinforcement learning (RL) using reward prediction error in a recurrent neural network (RNN) and its downstream weights. However, how such learning can be neurally implemented remains unclear, because training an RNN through the 'backpropagation' method requires the downstream weights, which are biologically unavailable at the upstream RNN. Here we show that training the RNN with random feedback instead of the downstream weights still works because of 'feedback alignment', a phenomenon originally demonstrated for supervised learning. We further show that if the downstream weights and the random feedback are biologically constrained to be non-negative, learning still occurs without feedback alignment, because the non-negative constraint by itself ensures loose alignment. These results suggest neural mechanisms for RL of state representation/value and highlight the power of random feedback and biological constraints.
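
As a concrete illustration (a minimal sketch, not the authors' code), the following Python/NumPy snippet learns a state value with a standard TD(0) update on the downstream weights w_out, while the recurrent weights W are updated by routing the scalar reward prediction error through a fixed, non-negative random feedback vector B in place of w_out, i.e., random feedback under the non-negativity constraint described in the abstract. The task (a cue followed by a delayed reward), network size, ReLU activation, one-step truncated gradient, and all parameter values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

n, T = 20, 10             # number of RNN units; time steps from cue to reward
alpha, gamma = 0.01, 0.9  # learning rate; temporal discount factor

W = rng.normal(0.0, 1.0 / np.sqrt(n), (n, n))  # recurrent weights (learned)
w_out = rng.uniform(0.0, 0.1, n)               # downstream value weights (learned)
B = rng.uniform(0.0, 0.1, n)                   # fixed random feedback vector,
                                               # constrained to be non-negative

cue = np.zeros(n)
cue[0] = 1.0                                   # external input marking trial start

for trial in range(2000):
    x_prev = np.zeros(n)
    u = W @ x_prev + cue                       # preactivation at t = 0
    x = np.maximum(u, 0.0)                     # ReLU state representation
    for t in range(T):
        u_next = W @ x                         # input arrives only at the cue
        x_next = np.maximum(u_next, 0.0)
        r = 1.0 if t == T - 1 else 0.0         # reward at the end of the trial
        # scalar reward prediction error (TD error)
        delta = r + gamma * (w_out @ x_next) - (w_out @ x)
        # downstream value weights: standard TD(0) update, kept non-negative
        w_out = np.maximum(w_out + alpha * delta * x, 0.0)
        # recurrent weights: the exact semi-gradient would backpropagate delta
        # through w_out; random feedback replaces w_out with the fixed B,
        # and the ReLU derivative (u > 0) gates which units are updated
        W += alpha * delta * np.outer(B * (u > 0), x_prev)
        x_prev, u, x = x, u_next, x_next
```

If feedback alignment, or the loose alignment induced by non-negativity, holds, the updates routed through B correlate with the true gradient direction defined by w_out, so the value prediction error shrinks over trials even though no downstream weight information ever reaches the RNN.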

Tsurumi Takayuki, Kato Ayaka, Kumar Arvind, Morita Kenji

DOI: 10.1101/2024.08.22.609100

Subject areas: computing and computer technology; bioscience research methods and techniques

Tsurumi Takayuki, Kato Ayaka, Kumar Arvind, Morita Kenji. Reinforcement learning of state representation and value: the power of random feedback and biological constraints[EB/OL]. (2025-03-28)[2025-06-25]. https://www.biorxiv.org/content/10.1101/2024.08.22.609100