Hacker News

btown yesterday at 7:14 PM

In theory, reasoning tokens should do the equivalent of this: explicitly create options outside of the quick-response probability space, so that those options can guide future generation.

In practice, models that do this won't be prioritized as much. As long as billing is per-user instead of per-token, the economics favor thinking tokens that stop by default at, say, one option plus a bit more planning, short of full alternatives. So we'll still need to play games with prompting!


Replies

tliltocatl yesterday at 7:17 PM

Without continuous feedback from the real world, lower-probability tokens (and soon high-probability ones as well) will be complete garbage.