There's another blog post that made it to the front page of this site which sums up the state of the art nicely [0].
It's not obvious that they will be able to do any reasoning, in the formal sense, at all, let alone better than humans. LLMs on their own are simply not sufficient for the kind of work involved in reasoning about mathematical problems.
There's plenty of research demonstrating that they can be useful in small, constrained tasks -- which is nothing to turn our noses up at!
... it's just not _obvious_ in the sense that there is a clear step from LLM capabilities today to "better than humans." It's more an article of faith that it could be true, some day, if we just figure out X, Y, Z... which folks have been trying to do for decades, to no avail. In other words, it's not obvious at all.
[0] https://garymarcus.substack.com/p/llms-dont-do-formal-reason...
It’s true that current models do not do formal reasoning; my point is that it is possible to use tokenization to do it. See my comment in the other thread.
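That comment isn't quoted here, so what "use tokenization to do it" means in practice is left open. Purely as a hypothetical sketch of the general direction -- not the commenter's actual proposal -- the Python below tokenizes a propositional formula into discrete symbols and then evaluates it deterministically; once the input is reduced to formal tokens, the result is exact rather than statistical. `tokenize`, `evaluate`, and the toy grammar are all assumptions made up for illustration.

```python
import re

# Matches one token: "->", a single punctuation symbol, or a variable name.
TOKEN = re.compile(r"\s*(->|[()!&|]|[A-Za-z]\w*)")

def tokenize(formula: str) -> list[str]:
    """Reduce a propositional formula to a list of discrete symbols."""
    formula = formula.strip()
    tokens, pos = [], 0
    while pos < len(formula):
        match = TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"unexpected character at index {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(tokens: list[str], env: dict[str, bool]) -> bool:
    """Recursive-descent evaluation over the token list.

    Grammar (lowest to highest precedence):
        impl  := disj ("->" impl)?        right-associative
        disj  := conj ("|" conj)*
        conj  := unary ("&" unary)*
        unary := "!" unary | "(" impl ")" | variable
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of formula")
        tok = tokens[pos]
        pos += 1
        return tok

    def impl():
        left = disj()
        if peek() == "->":
            take()
            rhs = impl()  # evaluate first so the whole stream is consumed
            return (not left) or rhs
        return left

    def disj():
        value = conj()
        while peek() == "|":
            take()
            rhs = conj()
            value = value or rhs
        return value

    def conj():
        value = unary()
        while peek() == "&":
            take()
            rhs = unary()
            value = value and rhs
        return value

    def unary():
        tok = take()
        if tok == "!":
            return not unary()
        if tok == "(":
            value = impl()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        return env[tok]  # a propositional variable

    result = impl()
    if pos != len(tokens):
        raise ValueError("trailing tokens")
    return result

# Example: once the formula is formal tokens, checking is repeatable and exact.
tokens = tokenize("(p & q) -> r")
print(tokens)  # ['(', 'p', '&', 'q', ')', '->', 'r']
print(evaluate(tokens, {"p": True, "q": True, "r": False}))  # False
```

The point of the sketch is only that symbolic tokens admit deterministic checking; how (or whether) that composes with an LLM's own tokenizer is exactly the open question in this thread.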