Hacker News

vintermann, yesterday at 9:17 PM

It's not at all obvious where to drop the context, though. Maybe it helps to have similar tasks in the context, maybe not. It did really, shockingly well on a historical HTR (handwritten text recognition) task I gave it, so I gave it another one, in some ways an easier one. I thought it wouldn't hurt to have text in a similar style in the context. But then it suddenly did very poorly.

Incidentally, one of the reasons I haven't gotten much into subscribing to these services is that I always feel like they're triaging how many reasoning tokens to give me, or A/B testing a different model... I never feel I can trust that I'm interacting with the same model.


Replies

dcre, today at 12:15 AM

The models you interact with through the API (as opposed to chat UIs) are held stable and let you specify reasoning effort, so if you use a client that takes API keys, you might be able to solve both of those problems.

eru, today at 2:08 AM

> Incidentally, one of the reasons I haven't gotten much into subscribing to these services is that I always feel like they're triaging how many reasoning tokens to give me, or A/B testing a different model... I never feel I can trust that I'm interacting with the same model.

That's what websites have been doing for ages. Just like you can't step in the same river twice, you can't use the same version of Google Search twice, and never could.