> So maybe the AI labs have been paying attention after all!
> I think this mainly demonstrates that the pelican on the bicycle has firmly exceeded its limits as a useful benchmark.
As acknowledged in the article.
Gemini 3.1 basically takes it home on that benchmark anyway, so it's done.