See the peer reply: yes, your self-chosen benchmark has been reached.
Generally, I've learned to warn myself off a take when I start writing emotionally charged stuff like [1], without any prompting (who mentioned apps? And why would you, without checking?). The same goes for when I catch myself reading minds and assigning weak arguments to the other side, both now and in an imagined future [2].
At the very least, [2] is a signal to give the keyboard a rest, and ideally my mind too.
Bailey: > "If [there were] new LLMs...consistently solving Erdos problems at rapidly increasing rates then they'd be showing...that"
Motte: > "I can['t] pop into ChatGPT and pop out Erdos proofs regularly"
No less than Terence Tao pointed out a month ago that your bailey was newly happening with the latest generation: https://mathstodon.xyz/@tao/115788262274999408. Not sure how you only saw one Erdos problem.
[1] "I'll wait with bated breath for the millions of amazing apps which couldn't be coded before to start showing up"
[2] "...or, more likely, be told in 6 months how these 2 benchmarks weren't the ones that should matter either"