Hacker News

kittikitti · yesterday at 5:53 PM

This is why I run my own models. All the inference providers do sneaky things behind the scenes: they will limit output tokens, turn off attention layers, lower the reasoning effort, or just serve a completely different model. I'm actually surprised that Claude Code was affected, since it's where I've experienced this the least among APIs and coding agents.
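
For reference, a minimal sketch of what "running my own models" can look like, assuming a Hugging Face transformers setup; the model name and prompt are placeholder examples, not anything the commenter specified. The point is that decoding settings such as the output-token limit stay under your control rather than a provider's.

    # Self-hosted inference sketch (assumes `pip install transformers torch`).
    # Model name below is only an example; substitute whatever you run locally.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-Coder-7B-Instruct",  # example local model
        device_map="auto",                       # place weights on available GPU/CPU
    )

    out = generator(
        "Write a Python function that reverses a string.",
        max_new_tokens=256,   # you choose the output-token limit, not a provider
        do_sample=False,      # deterministic decoding for reproducible results
    )
    print(out[0]["generated_text"])

Running the stack yourself means any change in model weights, quantization, or decoding behavior is a change you made, which is the whole appeal.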