I swear the industry is being Garry Tanned.
Senior management let our localisation staff go. Now they want us to use AI to translate. They still want manual review.
We use GitHub Copilot at work, and we get a measly 300 requests, with budget to go over if necessary. Opus 4.7 or GPT 5.5 would eat all of those up in a day. Are we supposed to be using more than the allotted amount? Does management see that as a good thing? Or is it best to stick within the allocated amount? Who knows? Management are playing games everywhere, it seems.
Saw a good joke on twitter about it. Something like:
"You spent $23, over the $20 food limit. Be more careful next time. You spent $600 on tokens, $200 more than the average. Congratulations!"
I work at Amazon (standard disclaimer: just sharing my own experience, not an official spokesperson, etc.)
I can't say this isn't happening, but at least in the parts of the company I get visibility into, what the article describes isn't my experience. There is a lot of interest in using GenAI, but people mostly get kudos for creative uses of GenAI, not for raw token counts. Most scaled GenAI efforts put a lot of focus on output metrics (accuracy, number of findings, number of things fixed, and so on).
It is damn fascinating to see just how many (big, serious) organizations are creating unnecessary internal strife over this.
One of my favorite heuristics/quotes applies here: "no matter how good the strategy, occasionally consider the result."
Want to know if AI is working for your org? Ask yourself/employees to "show me the result." That requires judgment and taste (is the result something of value, or just the appearance of work having been done), but it will also save you a ton of stress and disappointment later.
“Show me the incentive and I'll show you the outcome.”
― Charlie Munger
I was thinking about this recently. I tend to run my AI at low context because the documentation states that models degrade with higher context usage.
However, I see tons of people on LinkedIn sharing ways of backing up context, not wanting to lose context, etc.
This seems like another way the system is being misused. Higher context usage also burns more tokens, and I suspect you get worse (and slower) output than with a dense, detailed context.
https://en.wikipedia.org/wiki/Poe's_law I was just joking about it a few days ago (I swear I didn't know Amazon was doing this) https://news.ycombinator.com/item?id=48079533
> That’s my latest joke — that we’ll have to pretend like we used the tools so they can feel validated they’ve spent all this money on hyped up technology. So, yes, it’s em-dashes and “it’s not just this, it’s that …” so they can hopefully leave us alone
Once you have a score, you have a game. Once you have a game, people will do whatever it takes to win.
People who don't code (management, leadership) think AI will 10x the company, but it's really a 40-60% boost. Engineers have to feign adopting these tools for fear of layoffs.
I joked about this on HN a few weeks ago and I find it funny that we ended up here already. Goodhart's Law in action.
Amazon is big and inconsistent enough that "somewhere in Amazon, <XYZ> is occurring" is statistically true, no matter how nutso-sounding your <XYZ>.
And I can tell you they are surely not the only ones.
Everyone I talk to nowadays has KPIs tied to AI usage in their performance evaluation.
I have mixed thoughts on this. These thoughts are my own. On the one hand, it’s objectively silly to pretend we’ve solved the age-old problem of measuring developer productivity. Metric-obsessed leadership can also be intolerable and counterproductive, and it’s a good way to paint yourself into a corner, undervaluing your best talent and overvaluing your mediocre talent.
That said, I’m kind of having a blast using CC in corporate with all the connectors at our disposal, and I’m baffled by how little some of my coworkers know about what’s available and what the capabilities are. So it’s clear that some encouragement is prudent for those slower to embrace new technologies, but I’m not sure token-counting and tokenmaxxing are the answer.
When did FT become Business Insider?
I have an FT subscription, and they keep moving toward this kind of narrative-first reporting to get clicks. It’s no longer a believable paper.
I, too, can easily use more tokens to achieve the same task. I can give worse prompts. I can fail to make it clear to the tools where to find the information they need. I can ask them to think hard when they don’t need to, and tell them not to think when they do. I can give vague, open-ended instructions. I can generate code that sucks and throw it away.
If I do all of this, do I get a promotion?
I wish I could do some tokenmaxxing at my company. The only plan available is maxed out for the month after a few days of serious work, but the AI “experts” are declaring that nobody needs that much. It’s really frustrating to constantly have to juggle quota and lower models. All this while the declared goal is to reach 50% of code written by AI.
Hunger games in the age of AI: eliminate/automate your colleagues' jobs until a single software engineer is left (or two, if the aristocrats see it as good PR).
You can use Codex and Claude Code for most of the tasks you would do manually: filing JIRA tickets, posting updates, opening PRs, having AI review PRs. This will all use tokens.
No need to tokenmaxx; you will end up burning tokens with just regular AI usage.
Each day I send the AI on a fruitless mission like "summarize the entire codebase" while I do my actual work, which involves actually using the AI for real work. Wish I could disable the token cache to make it spend more.
It's the same as measuring productivity by lines of code written, same dumb logic by management, not surprising.
At least for some people I know it’s not necessarily because there’s pressure from leadership, but because it’s funny that the org spends like $15,000/mo writing HP fanfic or whatever
This kind of thing is totally fine if it's being done (and it's believable, since Meta internally incentivized tokenmaxxing). When you're trying to change the behavior of a large number of people, only blunt instruments are available if you want quick outcomes. The edge cases where people Goodhart very hard are all right; you can just human-in-the-loop them away. The opportunity cost for most organizations of not adopting AI tools as productivity enhancers is currently gauged by them (rightfully, in my opinion) to be too high to allow for osmotic adoption.
Most people watch sea changes come and go. They all have a story about how they "could have bought Bitcoin when it was $100" or whatever. In an org, you don't want to end up with the story of "we could have done that when nobody else had", so you incentivize adoption of the tool as hard as possible and hope that dipping feet in the water makes people want to swim. If you don't already have a culture of early adoption (and no large company can), then you have to use blunt incentives. I don't think anyone has demonstrated otherwise.
> They said the move reflected pressure to adopt the technology after Amazon introduced targets for more than 80 percent of developers to use AI each week, and earlier this year began tracking AI token consumption on internal leader boards.
This measuring of tokenmaxxing as a proxy for something beneficial to the company has got to be the single dumbest thing I have ever heard of in my entire software career.
It would be like some company in the dot com era measuring employee's internet download traffic as a proxy for productivity or internet-pilledness.
Why not just reward employees based on who submits the largest expense claims? That might have some correlation to work too, right?!
Similar to an HFT company I know, using the money spent on tokens per developer as their efficiency metric. Insane.
Our AWS TAM has recently started sending us AI-like responses. It's very obvious. Now it makes sense why.
Measuring productivity via tokens is the modern-day equivalent of doing it via number of commits or LOC.
Hot take:
There should be an anti-leaderboard that highlights people under a threshold. Not trying to learn how to use AI while working at a company like Amazon is almost certainly a bad thing, and cause for looking into why.
Measuring token usage as a productivity metric is like measuring keystrokes. Don't mind me, just over here rolling my face on the keyboard for an hour so I can take Friday off...
...except each keystroke has an associated cost, the sum of which may equal or exceed my salary.
A perfect doomsday machine. Over-using tokens gets your peers laid-off before yourself.
Another stupid meme-latching name. Don't normalize these *maxxing nonsense words and just use plain language. Let's see, maybe just say they were optimizing for token count?
Reminds me of the managers that use 'lines of code added' as a metric
Seems to be a clear case of Goodhart's Law that states that "when a measure becomes a target, it ceases to be a good measure."
Can't you just wire your agent into a Python script and have it infinitely check its own work? That would hit the metrics but do nothing useful.
Hell, throw a Tarot reading in the middle of the loop so the agent has non-deterministic behavior too.
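For the record, a minimal sketch of that kind of useless self-checking loop. `ask_model` here is a hypothetical stand-in for whatever token-billed agent API you'd actually wire in, and the tarot deck is abbreviated:

```python
import random


def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a real (token-billed) agent call;
    # here it just returns a canned "review" of the prompt.
    return f"Looks fine: {len(prompt)} chars reviewed."


def tarot_card() -> str:
    # The non-deterministic seasoning suggested above.
    return random.choice(["The Fool", "The Tower", "The Hanged Man"])


def burn_tokens(code: str, rounds: int = 3) -> list[str]:
    # Ask the model to re-check its own previous answer, over and over.
    # Hits any tokens-consumed metric while producing nothing useful.
    transcript = []
    review = code
    for _ in range(rounds):
        review = ask_model(f"[{tarot_card()}] Please re-check this: {review}")
        transcript.append(review)
    return transcript
```

Run it overnight with a big `rounds` and you top the leaderboard without shipping a thing.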
https://github.com/trailofbits/skills/tree/main/plugins/let-...
Amazon management wants to play five-dimensional chess? Play Balatro instead.
Imagine selling a product where companies are foaming at the mouth to increase their spend and pay you more money
It does not get any better than that
Jensen, Sam, Dario: https://i.imgur.com/AI7rtCY.jpeg
tokenmaxxing is silly, but if a developer or manager NEVER uses AI then I do think that's cause for concern as it shows a genuine lack of curiosity... perhaps tokenflooring makes more sense than tokenmaxxing
Vibe-coded PPTs, docs, and frontends are an even bigger scam than crypto ever was. Of course people are getting sucked into it.
Someone pressuring you to do something at work gives off creepy vibes.
Is using AI tools in the contract? If not, then what are they on about?
This makes me think of the tulip bubble. Using AI as much as possible just so people think you are productive is like buying tulips so that people think you're affluent.
This reads more like a single employee's gripe than a real thing that's happening. They're not using the metrics in performance reviews, and it's a new AI tool that AWS probably wants legitimate usage data from.
That said, if you can't figure out how to use AI in a software job you should look into it. Not using AI at this point is a lot like not using CAD as an architect.
A very poor look for management. They don't know what the heck they're doing.
yes
omg
The fact that management signed off on measuring AI use through token usage shows how incompetent management really is, including at allegedly technical companies like Amazon. Tokenmaxxing was an entirely expected and rational response. IOW: you measure employees in stupid ways, you're going to get stupid behaviour as a consequence.