Hacker News

saberience, yesterday at 7:58 PM

Have you actually used LLMs for non-trivial tasks? They are still incredibly bad at genuinely hard engineering work, and they still lie all the time; it's just gotten harder to notice, especially if you're letting one run all night and generate reams of crap.

Most people are optimizing for terrible benchmarks, then don't really understand what the model did and just assume it did something good. It's basically the blind leading the blind, with a lot of people suffering from AI psychosis or delusion.


Replies

nfg, yesterday at 8:05 PM

Do you realise who you’re replying to?
