theptip, yesterday at 7:11 PM

I believe this is going to be an increasingly important factor.

Call it the “shoelace fallacy”: Alice is supposedly much smarter but Bob can tie his shoelaces just as well.

The choice of eval, prompt scaffolding, etc. all dramatically affect how much intelligence these models exhibit. If you need a PhD to coax PhD-level performance from these systems, you can see why the non-expert reaction is “LLMs are dumb” or “progress has stalled.”