Assuming decent data, it won't be stochastic sampling for many math operation/input combinations. When people suggest LLMs with tokenization could learn math, they aren't talking about a small, undertrained model trained on crappy data.
I mean, this depends on your sampler. With temp=1 and sampling from the raw output distribution, setting aside numerics issues, these models assign a nonzero probability to every token at each position.
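To make that concrete, here's a rough sketch with toy logits (not a real model's output, just illustrative numbers): softmax over finite logits is strictly positive everywhere, so at temp=1 with no truncation any token can in principle be sampled, whereas a truncating sampler like top-k zeros out the tail.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Scale by temperature, subtract the max for numerical stability
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

# Toy "vocabulary" of 5 tokens with very different logits
logits = np.array([12.0, 3.0, 0.5, -4.0, -9.0])

probs = softmax(logits)       # temp=1, raw distribution
print(probs)                  # every entry is > 0, however tiny
print((probs > 0).all())      # True: any token can be sampled

# A top-k sampler (k=2) zeros out everything below the top two logits,
# so low-probability tokens can never appear, regardless of temperature.
k = 2
topk_idx = np.argsort(logits)[-k:]
truncated = np.zeros_like(probs)
truncated[topk_idx] = probs[topk_idx]
truncated /= truncated.sum()
print(truncated)              # most entries are exactly 0
```

So whether a wrong digit is merely improbable or actually impossible really does depend on the sampler sitting in front of the distribution.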