ARC Prize 2024: Technical Report
Gregory Kamradt Mike Knoop Francois Chollet Bryan Landers
Abstract
As of December 2024, the ARC-AGI benchmark is five years old and remains
unbeaten. We believe it is currently the most important unsolved AI benchmark
in the world because it seeks to measure generalization on novel tasks -- the
essence of intelligence -- as opposed to skill at tasks that can be prepared
for in advance. This year, we launched ARC Prize, a global competition to
inspire new ideas and drive open progress towards AGI by reaching a target
benchmark score of 85%. As a result, the state-of-the-art score on the ARC-AGI
private evaluation set increased from 33% to 55.5%, propelled by several
frontier AGI reasoning techniques including deep learning-guided program
synthesis and test-time training. In this paper, we survey top approaches,
review new open-source implementations, discuss the limitations of the
ARC-AGI-1 dataset, and share key insights gained from the competition.
Citation: Gregory Kamradt, Mike Knoop, Francois Chollet, Bryan Landers. ARC Prize 2024: Technical Report [EB/OL]. (2024-12-05) [2026-05-28]. https://arxiv.org/abs/2412.04604.
Subject classification: Computing technology, computer technology