Hacker News

emp17344 yesterday at 5:06 PM, 3 replies

Any chance you’re just learning more about what the model is and is not useful for?


Replies

data-ottawa yesterday at 6:36 PM

There are some days where it acts staggeringly bad, beyond baselines.

But it’s impossible to actually determine whether it’s model variance, polluted context (if I scold it, is it now closer in latent space to a bad worker, and so performs worse?), system prompt and tool changes, fine-tunes and A/B tests, variance in top-p sampling…

There are too many variables and no hard evidence shared by Anthropic.

jerf yesterday at 5:23 PM

I dunno about everyone else, but when I learn more about what a model is and is not useful for, my subjective experience improves, not degrades.

acuozzo yesterday at 6:47 PM

No, because switching to the API with the same prompt immediately fixes it.

There's little incentive to throttle the API. It's $/token.