It's also possible that OpenAI used a lot of human-generated, ARC-like data for training (semi-cheating). OpenAI has enough incentive to fake a high score.
Without full disclosure of the training data, you can never be sure whether good performance comes from memorization or "semi-memorization".