grishka last Saturday at 9:20 PM

Well, that's the thing — if you understand the technology you're working with and know how to verify the result, chances are, completing the same task with AI would take you longer than without it. So the whole appeal of AI seems to be to let it do things without much oversight.

The common failure mode of AI is also concerning. If you ask it to do something that can't be done trivially (or at all), or that wasn't well represented in its training data, it often won't tell you it doesn't know how to do it. Instead, it'll make shit up with utmost confidence.

Just yesterday I stumbled upon this article that closely matches my opinion: https://eev.ee/blog/2025/07/03/the-rise-of-whatever/


Replies

serbuvlad last Saturday at 9:47 PM

But that's exactly the thing. I DON'T understand the technology without AI. I know stuff about Linux, but I knew NOTHING about Ansible, FreeIPA, etc. So I guess you could say I understand the problem space, not the solution space? Either way, it would have taken us many months to do what took us a few weeks with AI.

> So the whole appeal of AI seems to be to let it do things without much oversight.

No? The whole appeal of AI for me is doing things where I know what I want the end result to look like, but I don't know how to get there.

> The common failure mode of AI is also concerning. If you ask it to do something that can't be done trivially (or at all), or that wasn't well represented in its training data, it often won't tell you it doesn't know how to do it. Instead, it'll make shit up with utmost confidence.

I also feel like a lot of people drew a lot of conclusions from GPT-3.5 that simply aren't true anymore.

Usually o3, and even 4o, and probably most modern models, rely a lot more on search results than on their training datasets. For trivial queries I often even see "I know how to do this, but I need to check the documentation for up-to-date information in case anything changed" in the chain of thought.

But yeah, sometimes you get the old failure mode: stuff that doesn't work. Then you try it and it fails. You tell it that it fails, and how. And it either fixes it (90%+ of cases, at least with something powerful like o3), or it starts arguing with you in a nonsensical manner. If the latter, you burn the chat and start a new one with better context, or just fall back to the manual approach like before.

So the failure mode doesn't mean you can't identify failure. The failure mode means you can't trust its unchecked output. Ok. So? It's not a finite state machine, it's a statistical inference machine trained on the data that currently exists. It doesn't enter a failure state. Neither does a PID regulator when the parameters of the physical model change and no one recalibrates it. It just starts outputting garbage, overshooting like crazy, etc.
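
To make that concrete, here's a toy Python sketch (every number in it, the controller gains and the plant model, is invented purely for illustration): a PID tuned for one plant tracks its setpoint fine, but change the plant's gain without retuning and the exact same controller produces garbage.

    class PID:
        # Textbook PID: output = kp*error + ki*integral(error) + kd*d(error)/dt
        def __init__(self, kp, ki, kd):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0
            self.prev_error = 0.0

        def step(self, setpoint, measured, dt):
            error = setpoint - measured
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    def simulate(plant_gain):
        pid = PID(kp=2.0, ki=0.5, kd=0.1)  # hand-tuned for plant_gain = 1.0
        temp, dt = 20.0, 0.1
        # First-order plant: heater input plus heat leak toward 20 C ambient.
        for _ in range(300):
            power = pid.step(setpoint=50.0, measured=temp, dt=dt)
            temp += (plant_gain * power - 0.5 * (temp - 20.0)) * dt
        return temp

    print(simulate(plant_gain=1.0))  # ~50.0: settles on the setpoint
    print(simulate(plant_gain=8.0))  # astronomically huge number: pure garbage

There's no exception and no error state in the second run; the loop executes exactly as before, the numbers just stop meaning anything.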

But both PID regulators and LLMs are hella useful if you have something to use them for.
