Hacker News

cousinbryce · yesterday at 5:31 PM · 1 reply

I would guess they are trying to maximize training data


Replies

Zak · yesterday at 5:47 PM

If I were being rewarded for using more tokens, I would feed LLM output back into the model. That's probably not very useful training data.