So it's an ever-shifting goalpost, which makes it pretty useless as an objective.
Yeah, but there are attempts to fix it. The Chollet paper (https://arxiv.org/abs/1911.01547) is a good one, shifting from measuring ~skill to measuring ~how efficiently skill is acquired. It's the framework behind the ARC-AGI benchmarks.