Hacker News

Alex-Programs, 01/20/2025

Exactly - it works better in the real world, where there's a lot less context than a clinical benchmark, and you're just trying to get the answer without writing an essay.