
Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact

Source: arXiv

Abstract

We investigate the phenomenon of neuron universality in independently trained GPT-2 Small models, examining how universal neurons (neurons whose activations are consistently correlated across models) emerge and evolve throughout training. By analyzing five GPT-2 models at three checkpoints (100k, 200k, and 300k steps), we identify universal neurons through pairwise correlation analysis of activations over a dataset of 5 million tokens. Ablation experiments reveal that universal neurons have a significant functional impact on model predictions, measured via loss and KL divergence. Additionally, we quantify neuron persistence, demonstrating that universal neurons remain highly stable across training checkpoints, particularly in deeper layers. These findings suggest that stable, universal representational structures emerge during neural network training.
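
To make the identification step concrete, below is a minimal sketch of the pairwise correlation analysis the abstract describes: for each neuron in one model, find its best-matching neuron in every other model by Pearson correlation over a shared token stream. The activation-matrix layout, the standardization details, and the 0.5 match threshold are assumptions for illustration, not the paper's exact procedure.

```python
# Sketch of pairwise cross-model neuron correlation (assumptions noted above).
import numpy as np

def max_cross_model_correlation(acts_a: np.ndarray, acts_b: np.ndarray) -> np.ndarray:
    """For each neuron in model A, return its highest |Pearson correlation|
    with any neuron in model B. acts_*: (n_tokens, n_neurons) activations
    collected over the same token stream."""
    # Standardize each neuron's activations so a dot product over tokens,
    # divided by n_tokens, equals the Pearson correlation coefficient.
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    corr = (a.T @ b) / a.shape[0]          # (n_neurons_a, n_neurons_b)
    return np.abs(corr).max(axis=1)        # best match per neuron in A

THRESHOLD = 0.5  # hypothetical cutoff for calling a match "strong"

def universal_mask(acts_ref: np.ndarray, acts_others: list) -> np.ndarray:
    """Flag a neuron as universal if it has a strong match in every
    other independently trained model."""
    matches = [max_cross_model_correlation(acts_ref, acts) >= THRESHOLD
               for acts in acts_others]
    return np.logical_and.reduce(matches)
```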
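
The ablation experiments can likewise be approximated by zeroing a single neuron's activation during the forward pass and comparing the output distribution to the clean run with KL divergence. The sketch below assumes a TransformerLens-style HookedTransformer interface and zero ablation; the paper's exact hook point and ablation scheme may differ.

```python
# Sketch of a single-neuron ablation measured by KL divergence
# (HookedTransformer-style API assumed; mean ablation is another option).
import torch
import torch.nn.functional as F

def ablation_kl(model, tokens, layer: int, neuron: int) -> float:
    """KL(clean || ablated) over the model's next-token distributions,
    averaged over token positions."""
    with torch.no_grad():
        clean_logits = model(tokens)

    def zero_neuron(value, hook):
        value[..., neuron] = 0.0  # zero-ablate one MLP neuron
        return value

    with torch.no_grad():
        ablated_logits = model.run_with_hooks(
            tokens,
            fwd_hooks=[(f"blocks.{layer}.mlp.hook_post", zero_neuron)],
        )

    log_p = F.log_softmax(clean_logits, dim=-1)
    log_q = F.log_softmax(ablated_logits, dim=-1)
    # KL(P || Q) = sum_v P(v) * (log P(v) - log Q(v)), summed over the vocab
    kl = (log_p.exp() * (log_p - log_q)).sum(-1)
    return kl.mean().item()
```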

Advey Nandan, Cheng-Ting Chou, Amrit Kurakula, Cole Blondin, Kevin Zhu, Vasu Sharma, Sean O'Brien

Subjects: Computing Technology, Computer Technology

Advey Nandan, Cheng-Ting Chou, Amrit Kurakula, Cole Blondin, Kevin Zhu, Vasu Sharma, Sean O'Brien. Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact [EB/OL]. (2025-07-28) [2025-08-19]. https://arxiv.org/abs/2508.00903
