yorwba • today at 2:31 PM • 1 reply

How do you know how well OpenAI's unreleased experimental model does on biology or history questions?

Replies

Sam specifically says it is general-purpose, and there is also this:

> Typically for these AI results, like in Go/Dota/Poker/Diplomacy, researchers spend years making an AI that masters one narrow domain and does little else. But this isn’t an IMO-specific model. It’s a reasoning LLM that incorporates new experimental general-purpose techniques.

https://x.com/polynoamial/status/1946478250974200272
