Hacker News
fulafel
yesterday at 6:27 PM
So GDPval is OpenAI's own benchmark. PDF link:
https://arxiv.org/pdf/2510.04374