Hacker News

calibas, yesterday at 5:52 PM (4 replies)

It's a non-deterministic language model, shouldn't we expect mediocre performance in math? It seems like the wrong tool for the job...


Replies

rictic, yesterday at 6:19 PM

Models are deterministic: they're a mathematical function from sequences of tokens to probability distributions over the next token.

Then a system samples from that distribution, typically with randomness, and there are some optimizations in running them that introduce randomness, but it's important to understand that the models themselves are not random.
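
To make that split concrete, here is a minimal sketch. It assumes a toy stand-in for the model: `toy_model`, its four-token vocabulary, and its logits are all made up for illustration and are not any real model's API. The point is only that the function from tokens to a distribution is pure, and the randomness enters in the separate sampling step.

```python
import math
import random


def toy_model(tokens):
    """Hypothetical stand-in for a language model: a pure function from a
    token sequence to a probability distribution over the next token."""
    vocab = ["0", "1", "2", "3"]  # made-up vocabulary
    # Made-up logits that depend only on the input (here, just its length).
    logits = [math.sin(len(tokens) + i) for i in range(len(vocab))]
    # Softmax: turn logits into probabilities that sum to 1.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(vocab, exps)}


def greedy(dist):
    """Deterministic decoding: always pick the most probable token."""
    return max(dist, key=dist.get)


def sample(dist, rng):
    """Stochastic decoding: draw a token in proportion to its probability."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    prompt = ["2", "+", "2", "="]
    dist = toy_model(prompt)
    # Same input, same distribution, every time.
    print(dist == toy_model(prompt))
    # Greedy decoding adds no randomness at all.
    print(greedy(dist))
    # Sampling is where randomness enters; an unseeded RNG may differ per run.
    print(sample(dist, random.Random()))
```

Fixing the seed (for example `random.Random(0)`) makes the sampling step repeatable as well, which is one way to see that the randomness belongs to the sampler and not to the model function.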

currymj, today at 12:18 AM

thanks to training data + this being a popular benchmark, they're pretty good at grinding through symbolic mathematical derivations, which is often useful if you want an explanation of a mathematical concept. there's not really a better tool for this job, except for "a textbook which answers the exact question you have".

but from time to time, doing this does require doing arithmetic correctly (to correctly add two exponents or whatever). so it would be nice to be able to trust that.

i imagine there are other uses for basic arithmetic too, QA applications over data that quotes statistics and such.

drdeca, yesterday at 6:03 PM

Deterministic is a special case of not-necessarily-deterministic.

CamperBob2, yesterday at 6:09 PM

We passed 'mediocre' a long time ago, but yes, it would be surprising if the same vocabulary representation is optimal for both verbal language and mathematical reasoning and computing.

To the extent we've already found that to be the case, it's perhaps the weirdest part of this whole "paradigm shift."