Hacker News

beardyw 10/11/2024 (5 replies)

I honestly can't see why LLMs should be good at this sort of thing. I am convinced you need a completely different approach. At the very least, you usually want exactly one completely correct result. Good luck getting current models to do that.


Replies

hackinthebochs 10/11/2024

LLMs aren't totally out of scope for mathematical reasoning. LLMs roughly do two things: move data around and recognize patterns. Reasoning leans heavily on moving data around according to context-sensitive rules, which is well within the scope of LLMs. The problem is that general problem solving requires a potentially arbitrary amount of data movement, but current LLM architectures have a fixed number of translation/rewrite steps they can perform before they must produce output. This means most complex reasoning problems are out of bounds for LLMs, so they learn to lean heavily on pattern matching. But this isn't an intrinsic limitation of LLMs as a class of computing device, just a limit of current architectures.
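As a rough toy sketch of that fixed-budget point (assumed names, not any real model or architecture): a fixed-depth forward pass applies a constant number of rewrite steps regardless of the input, while an ordinary algorithmic solver can loop for as long as a particular input demands.

    # Toy illustration only: a fixed-depth pass does exactly `depth` steps,
    # while a solver iterates a data-dependent number of times.

    def fixed_depth_pass(x: int, depth: int = 4) -> int:
        for _ in range(depth):                      # budget fixed before seeing the input
            x = x // 2 if x % 2 == 0 else 3 * x + 1
        return x

    def unbounded_solver(x: int) -> int:
        steps = 0
        while x != 1:                               # runs as long as this input requires
            x = x // 2 if x % 2 == 0 else 3 * x + 1
            steps += 1
        return steps

    print(fixed_depth_pass(27))   # stops after 4 steps whether or not it "finished"
    print(unbounded_solver(27))   # 111 steps for this particular input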

qudat 10/11/2024

One core issue is that we need to convert spoken/written language (e.g. English) into more formal mathematical languages, since the underlying mathematical problem is sometimes stated in prose. The example in the paper:

> When Sophie watches her nephew, she gets out a variety of toys for him. The bag of building blocks has 31 blocks in it. The bin of stuffed animals has 8 stuffed animals inside. The tower of stacking rings has 9 multicolored rings on it. Sophie recently bought a tube of bouncy balls, bringing her total number of toys for her nephew up to 62. How many bouncy balls came in the tube?

So I would argue it's critical that LLMs know how to convert text to math and then perform those calculations. This extends beyond just math to the underlying logic as well.
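To make that concrete, the word problem quoted above reduces to the equation 31 + 8 + 9 + x = 62, which a few lines of Python (a sketch, not anything from the paper) can check:

    # The prose problem translated into arithmetic: 31 + 8 + 9 + x = 62
    blocks, stuffed_animals, rings, total_toys = 31, 8, 9, 62
    bouncy_balls = total_toys - (blocks + stuffed_animals + rings)
    print(bouncy_balls)  # 14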

We just need to figure out how to get an LLM to read, write, and understand formal languages. My guess is attention heads could probably work in this context, but we might want something a little more rigid, naturally extending from the rigidity of logic and formal languages. Alternatively, we might simply not yet have figured out how to train LLMs on formal languages in a way that preserves the underlying logic and axioms necessary to perform math calculations correctly.
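As a hedged illustration of what such a formal-language target might look like (SymPy is used here purely as one example of a formal representation, not something proposed in the paper): the same word problem, written as a symbolic equation and solved mechanically.

    # Symbolic encoding of the word problem above.
    from sympy import symbols, Eq, solve

    x = symbols("x")                          # unknown: bouncy balls in the tube
    print(solve(Eq(31 + 8 + 9 + x, 62), x))  # [14]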

s-macke 10/11/2024

Well, my perspective on this is as follows:

Recurrent and transformer models are Turing complete, or at least close to being Turing complete (apologies, I'm not sure of the precise terminology here).

As a result, they can at least simulate a brain and are capable of exhibiting human-like intelligence. The "program" is the training dataset, and we have seen significant improvements in smaller models simply by enhancing the dataset.

We still don’t know what the optimal "program" looks like or what level of scaling is truly necessary. But in theory, achieving the goal of AGI with LLMs is possible.

golol 10/11/2024

I'm a math PhD student at the moment, and I regularly use o1 for quick calculations I don't feel like doing myself. While I feel like GPT-4o is so distilled that it just tries to recall the answer from memory, o1 actually works with what you gave it and tries to calculate. It can be quite useful.
