Hacker News

keeda · yesterday at 7:31 PM

A relevant Pragmatic Engineer newsletter covers many more cases along these lines, and how some teams are handling them: https://newsletter.pragmaticengineer.com/p/the-pulse-token-s...

Tokenmaxxing seems more and more like a way to encourage experimentation and learning, and incidents like this are part of that learning. Today, devs simply use the most expensive model by default, even for extremely simple tasks. That's obviously wasteful and costly, and budgets will soon be imposed, but this is how they're figuring out the economics.

For instance, just as we estimate story points, we may estimate token budgets. At that point, why waste time and money invoking a model for a simple refactor you could do with a few keystrokes in an IDE? And why use a frontier model when an open-source local model could spit out that throwaway script? Local models can be tokenmaxxed, while frontier models will still be needed and used judiciously. These are essentially trade-offs, and they will eventually be driven empirically, which is largely what engineering is about.
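To make the trade-off concrete, here's a minimal sketch of what such budget-driven routing could look like. Everything here is hypothetical: the tier names, prices, and thresholds are made up for illustration, not taken from any real provider's pricing.

```python
# Hypothetical per-million-token prices; real prices vary widely by provider.
PRICE_PER_MTOK = {
    "local": 0.0,      # self-hosted open-source model (ignoring hardware cost)
    "mid": 3.0,        # a mid-tier hosted model
    "frontier": 15.0,  # a frontier model, reserved for hard problems
}

def pick_model(estimated_tokens: int, needs_frontier_reasoning: bool = False) -> str:
    """Route a task to the cheapest tier that can plausibly handle it."""
    if needs_frontier_reasoning:
        return "frontier"
    if estimated_tokens <= 2_000:  # throwaway scripts, small refactors
        return "local"
    return "mid"

def estimated_cost(model: str, tokens: int) -> float:
    """Dollar cost of a task at the given tier, under the toy prices above."""
    return PRICE_PER_MTOK[model] * tokens / 1_000_000

# A simple refactor goes local and costs nothing; a hard design question
# goes to the frontier tier and gets a real price tag attached.
print(pick_model(800))                          # "local"
print(pick_model(800, needs_frontier_reasoning=True))  # "frontier"
print(estimated_cost("frontier", 50_000))       # 0.75
```

Once token estimates are attached to tasks the same way story points are, this kind of routing becomes an empirical knob to tune rather than a default habit.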

So economics will soon push engineers back to doing what they're paid to do: engineering. It will just look very different from what we're used to.