Hacker News

MyFirstSass · yesterday at 12:36 AM

I'm not sure I understand the wild hype here in this thread, then.

Seems exactly like the tests at my company, where even frontier models are revealed to be very expensive rubber ducks but fail completely with non-experts or anything novel or math-heavy.

I.e., they mirror the intellect of the user but give you big dopamine hits that'll lead you astray.


Replies

markusde · yesterday at 12:42 AM

Yes, the contributions of the people prompting the AI should be considered, as well as those of the people who designed the Lean libraries used in the loop while the AI was writing the solution. Any talk of "AGI" is, as always, ridiculous.

But speaking as a specialist in theorem proving, this result is pretty impressive! It would likely have taken me a lot longer to formalize this result even if it were in my area of specialty.
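To give a flavor of what formalization means here, a toy Lean 4 statement proved with Mathlib (illustrative only; the actual result is vastly larger and leans on much deeper library support):

    import Mathlib

    -- A trivial fact, formally verified: every real square is nonnegative.
    -- Mathlib's `sq_nonneg` lemma closes the goal directly.
    theorem toy_sq_nonneg (a b : ℝ) : 0 ≤ (a + b) ^ 2 :=
      sq_nonneg (a + b)

Real formalizations are thousands of lines of this, which is why machine help with the mechanical parts matters.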

jacquesm · yesterday at 4:08 AM

This accurately mirrors my experience. It has never - so far - happened that the AI brought any novel insight at the level that I would see as an original idea. Presumably the case of TFA is different, but the normal interaction is that the solution to whatever you are trying to solve is a millimeter away from your understanding, and the AI won't bridge that gap until you do it yourself - and then it will usually prove to you that it was obvious. If it was so obvious, then it probably should have made the suggestion itself...

Recent case:

I have a bar with a number of weights supported on either end:

|---+-+-//-+-+---|

What order and/or arrangement of removing the weights would cause the least shift in the center of mass? There is a non-obvious trick you can pull here to reduce the shift considerably, and I was curious whether the AI would spot it, but even after lots of prompting it just circled around the obvious solutions rather than making a leap outside that box and coming up with a solution that is better in every case.

I wonder what the cause of that kind of blindness is.
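For concreteness, here is a minimal brute-force sketch of the setup (the positions and masses below are made up, since the comment gives no numbers; it scores each removal order by its worst center-of-mass excursion):

    from itertools import permutations

    # Made-up symmetric layout; the puzzle as stated gives no actual numbers.
    positions = [-4.0, -3.0, -2.0, 2.0, 3.0, 4.0]
    masses = [1.0] * len(positions)

    def com(indices):
        """Center of mass of the weights still on the bar."""
        total = sum(masses[i] for i in indices)
        return sum(masses[i] * positions[i] for i in indices) / total

    def worst_shift(order):
        """Largest |COM - initial COM| seen while removing weights in this order."""
        start = com(range(len(masses)))
        remaining = set(range(len(masses)))
        worst = 0.0
        for i in order[:-1]:  # skip the final removal: an empty bar has no COM
            remaining.remove(i)
            worst = max(worst, abs(com(remaining) - start))
        return worst

    best = min(permutations(range(len(masses))), key=worst_shift)
    print(best, worst_shift(best))

Note that this only searches one-at-a-time removal orders, so whatever the trick is, it presumably steps outside that search space.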

krzat · yesterday at 7:26 AM

In other words, LLMs work best when "you are absolutely right" and "this is a very insightful question" are actually true.

encyclopedism · yesterday at 2:16 PM

Lots of users seem to think LLMs think and reason, so this sounds wonderful to them. A mechanical process isn't thinking, and it certainly does NOT mirror human thinking; the processes are altogether different.

Davidzheng · yesterday at 12:39 AM

The proof is AI-generated?

EA-3167 · yesterday at 7:39 PM

Do you have any idea how many people here have paychecks that depend on the hype, or hope to be in that position? They were the same way about crypto until it stopped being part of the get-rich-quick dream.

SecretDreams · yesterday at 1:52 AM

> Ie. they mirror the intellect of the user but give you big dopamine hits that'll lead you astray.

This hits so close to home. Just today in my field, a manager without expertise in a topic gave me an AI solution to something I am an expert in. The AI was very plainly and painfully wrong, but it came down to the user prompting really poorly. When I gave a well-formulated prompt on the same topic, I got the correct answer on the first go.

HDThoreaun · yesterday at 1:47 AM

"the more interesting capability revealed by these events is the ability to rapidly write and rewrite new versions of a text as needed, even if one was not the original author of the argument." From the Tao thread. The ability to quickly iterate on research is a big change because "This is sharp contrast to existing practice where....large-scale reworking of the paper often avoided due both to the work required and the large possibility of introducing new errors."
