
AI | about 19 hours ago
ARC-AGI-3 Drops a 1% Score That Should Embarrass Every Capability Claim Made This Year
A new benchmark from the ARC Prize Foundation finds frontier AI systems score below 1% on tasks humans solve every time. The governance frameworks trying to regulate AI capability thresholds are measuring the wrong thing entirely.
By Paul Menon
#Regulation #ARC-AGI #benchmarks