Hacker News

Zak · yesterday at 5:47 PM

If I was being rewarded for using more tokens, I would feed LLM output back into the model. That's probably not very useful training data.


Replies

piva00 · yesterday at 7:30 PM

I personally know two people who are doing exactly that after a mandate rolled out at their workplace. The measurement is "tokens spent," and since they weren't finding many cases that required a lot of tokens, they simply started running agent loops that feed each other.

Absurdly wasteful, but Goodhart's Law almost never fails.
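The loop described above can be sketched in a few lines. This is a hypothetical illustration, not anyone's actual setup: `fake_agent` is a stub standing in for a real LLM call, and the word-count "tokenizer" is a crude proxy. The point is only that a tokens-spent metric grows unboundedly while zero useful work happens.

```python
def fake_agent(prompt: str) -> str:
    # Stand-in for a model call: echoes the prompt with filler appended.
    return prompt + " and furthermore, consider the following:"

def count_tokens(text: str) -> int:
    # Crude proxy: whitespace-separated words as "tokens".
    return len(text.split())

def run_wasteful_loop(seed: str, rounds: int) -> int:
    """Ping-pong output between two agents, tallying tokens 'spent'."""
    tokens_spent = 0
    message = seed
    for _ in range(rounds):
        message = fake_agent(message)   # agent A consumes agent B's output
        tokens_spent += count_tokens(message)
        message = fake_agent(message)   # agent B consumes agent A's output
        tokens_spent += count_tokens(message)
    return tokens_spent

print(run_wasteful_loop("quarterly report", 3))  # prints 117
```

Each round inflates the counter regardless of content, which is exactly the Goodhart failure: once "tokens spent" is the target, it stops measuring useful work.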