The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models

Abstract

This paper studies the emergence of interpretable categorical features within large language models (LLMs), analyzing their behavior across training checkpoints (time), transformer layers (space), and varying model sizes (scale). Using sparse autoencoders for mechanistic interpretability, we identify when and where specific semantic concepts emerge within neural activations. Results indicate clear temporal and scale-specific thresholds for feature emergence across multiple domains. Notably, spatial analysis reveals unexpected semantic reactivation, with early-layer features re-emerging at later layers, challenging standard assumptions about representational dynamics in transformer models.

Authors: Shashata Sawmya, Micah Adler, Nir Shavit

Subject classification: Computing technology, computer technology

Recommended citation: Shashata Sawmya, Micah Adler, Nir Shavit. The Birth of Knowledge: Emergent Features across Time, Space, and Scale in Large Language Models [EB/OL]. (2025-05-25) [2025-06-22]. https://arxiv.org/abs/2505.19440.
