There's still something off in the grading, and I suspect they worked around it
(although I get what you mean, that's not easy to do once you've already trained)
I'm guessing that when they get a clean slate we'll have Image 2 instead of 1.5. On LMArena it was immediately apparent from the visuals that it was an OpenAI model.