> Can it solve easy problems yet? Weirdly, I think that's an important milestone.
Easy for who? Some problems are better solved in one way compared to another.
In the case of counting letters and such, it is not a easy problem, because of how the LLM tokenizes their input/outputs. On the other hand, it's really simple problem for any programming/scripting language, or humans.
And then you have problems like "5142352 * 51234" which is trivial problems for any basic calculator, but very hard for a human or a LLM.
Or "problems" like "Make a list of all the cities that had celebrity from there who knows how to program in Fortan", would be a "easy" problem for a LLM, but pretty much a hard problem anything else than Wikidata, assuming both LLM/Wikidata have data about it in their datasets.
> I suspect the breakthrough won't be trivial that enables solving trivial questions.
So with what I wrote above in mind, LLMs already solve trivial problems, assuming you think about the capabilities of the LLM. Of course, if you meant "trivial for humans", I'll expect the answer to always remain "No", because things like "Standing up" is trivial for humans, but it'll never be trivial for a LLM, it doesn't have any legs!
> And then you have problems like "5142352 * 51234" which is trivial problems for any basic calculator, but very hard for a human or a LLM.
I think LLMs are getting better (well better trained) on dealing with basic math questions but you still need to help them. For example, if you just ask it them to calculate the value, none of them gets it right.
http://beta.gitsense.com/?chat=876f4ee5-b37b-4c40-8038-de38b...
However, if you ask them to break down the multiplication to make it easier, three got it right.
http://beta.gitsense.com/?chat=ef1951dc-95c0-408a-aac8-f1db9...
> Easy for who?
Consider things from a different angle.
The hype men promoting the latest LLMs say the newest models produce PhD-level performance across a broad suite of benchmarks; some have even claimed that ChatGPT 4 is an early version of an AGI system that could become super-intelligent.
So the advertising teams have set the bar very high indeed. As smart as the smartest humans around, maybe smarter.
The bar they have set for themselves doesn't allow for any "oh but the tokenisation" excuses.
Not gonna lie ... wasnt expecting a correct answer... The thought process and confirmation of the calculation were LONG and actually quite amazing to watch it deduce and then calculate in different ways to confirm
The product of 5,142,352 and 51,234 is calculated as follows:
1. Break down the multiplication using the distributive property: - (5,142,352 times 51,234 = (5,000,000 + 142,352) times (50,000 + 1,234))
2. Expand and compute each part: - (5,000,000 times 50,000 = 250,000,000,000) - (5,000,000 times 1,234 = 6,170,000,000) - (142,352 times 50,000 = 7,117,600,000) - (142,352 times 1,234 = 175,662,368)
3. Sum all parts: - (250,000,000,000 + 6,170,000,000 = 256,170,000,000) - (256,170,000,000 + 7,117,600,000 = 263,287,600,000) - (263,287,600,000 + 175,662,368 = 263,463,262,368)
Final Answer: 263463262368