Cool that we are at a stage where it is meaningful to start measuring progress toward AGI. Something I am wondering on the philosophical side: are we ever going to be able to tell if the system really "understands" and "perceives" the world?
I think accomplishing difficult real-world tasks requires that it does. I hope we're able to reach a level of introspection to produce a satisfactory answer (and avoid doomsday), but I think that requires a more educated question. The premise of consciousness as we understand it now could be misleading.
In the same way that studying alien life would reveal more about how life in general canonically forms and exists, studying this artificial intelligence could unlock a new understanding of our own minds.
We'll get as close as we can get with anything else, such as when trying to decide whether a given human really "understands" and "perceives" the world.