Hacker News

Ask HN: What was your "oh shit" moment with GenAI?

310 points by andrehacker last Thursday at 11:42 PM | 569 comments

Most of us were amused when DALL-E and its peers went mainstream, and we were quick to point out the obvious flaws.

Then ChatGPT hit the scene and again, many of us dismissed it as a parlor trick that would never amount to much.

Using LLMs for coding was initially only a small step up from basic code completion, and a welcome farewell to Stack Overflow.

I am curious: what was the specific moment that you went from those quaint, dismissive observations to a slightly panicked, "Uh Oh" realization of what these models can do?


Comments

stared yesterday at 11:54 PM

GPT4, when it could do a translation that would take a considerable human effort, vide "Genesis 1 but every word begins with 'A'": https://p.migdal.pl/blog/2023/05/genesis-az-by-gpt/

tezza yesterday at 9:22 PM

MidJourney public discord channel.

The amount of masterpiece level art flowing per hour was astounding.

For every one doing a ninja waifu, there were ten doing art from davinci and leonardo crossed with hockney.

it almost gave you art sickness

zhoBEENG yesterday at 1:06 AM

It was when I first saw an LLM reliably make tool calls to bash.

1qaboutecs yesterday at 8:19 PM

Was trying to explain convolution (of functions) to a friend and I wanted to build a little picture. I typed more or less nothing into Claude and it gave me a fine web-app for demo'ing examples to my friend within minutes.

Three years ago this would have taken a minimum of three college graduates a couple days -- one to know the math, one to know the backend, and one to know the front-end. Maybe two of those could be the same person on a good day -- none of the topics is individually that hard -- but it's a lot together.

magarnicle today at 3:13 AM

Being able to make large alterations to ffmpeg even though I'm a 2/10 C programmer.

The most impressive was speeding up the drawtext filter by at least 10x.

mbirth today at 1:26 AM

Running ComfyUI and some ImageGenAI and realising how you can use it to generate anything from any aspect of pr0n and various fetishes to making up fake news about basically anything. And real enough to convince the masses.

chasd00 yesterday at 8:49 PM

i was a skeptic and then, on a whim, i told claudecode to "create an app with a react front end and python api backend that delegates auth to auth0.com and allows users to manage a todo list" or something like that. Like a standard issue web app with a database, backend, frontend, openid and all that. i was pretty impressed with the result.

Then i asked it to create a multi-user stock market portfolio simulator with a comprehensive api, leaderboard, scheduled tasks and the other bells and whistles. Again, fairly impressed with the result. Then I prompted it to build a trading bot that uses the API to compete with the human players, again fairly impressed with the result.

Last, i prompted my way through a react native mobile app integrated with supabase for my sister's startup. It created the schema, some triggers, a webhook for stripe, all the app views, set up an expo account and push notifications, and prompted _me_ through an Apple developer account and everything else.

All of this was done an hour here and an hour there while making dinner or watching TV, barely any attention paid to the details. Just prompting claudecode and checking what it did.

After those three experiences I started incorporating claudecode into all my coding workflows and managed to get my job to buy me a license for work stuff too.

knuckleheads yesterday at 7:52 PM

I remember a couple months after ChatGPT came out I was in a 1-1 with a coworker who hadn’t really played around with it much. I was very much toying around with it and was surprised at how good at stuff it was. I wanted to show him it was for real, he was skeptical, so over a half hour we had it make a bee and a flower buzz around in d3, copying and pasting between jsfiddle and ChatGPT. By the end of it, we had a nice animation and were both thoroughly surprised that the computers could code so well now.

syx yesterday at 10:48 PM

I couldn’t make a Rockbox (the alternative iPod OS) simulator run on my MacBook M2 no matter how many guides I followed, then I fired up Claude code and by modifying the original source code it made the simulator run and I was able to start developing custom plugins for my iPod. It honestly felt great since I only have basic C knowledge.

jphil529 yesterday at 9:52 PM

Getting the agent to write end-to-end tests from the perspective of a user really shocked me. I only give the agent access to the site via the web and block access to the source code.

It's helped me to gain a level of trust that the agent isn't just writing the test to pass. That in turn allowed me to step back a lot and trust more of the output and let it run longer and on bigger problems.

oceansky yesterday at 8:59 PM

Ovid's unicorn gpt-2 article in 2019 really amazed me.

card_zero yesterday at 10:00 PM

It was about two days after Google released Deep Dream, if you remember, the thing that took a video and filled it with fleeting hallucinations of mostly puppies, fish heads and lizards. I was suddenly struck by the realization "oh shit, this is much more boring and samey than it first appeared to be", and all subsequent gen AI has been similarly underwhelming.

zarzavat yesterday at 10:50 PM

It was when I was using an early version of GitHub Copilot. At first the completions were almost useless and had a kind of copy and paste feel, however one day it managed to reason through a complicated loop body much faster than I could have figured it out. It was at that moment I realised this AI thing was going to be big.

vesche today at 5:00 AM

Three moments stick out to me.

1) When I used ChatGPT for the very first time. I still remember, I asked it: “Write an advertisement to convince people to visit the North Pole.” It rapidly returned a witty, accurate, multi-paragraph text that was exactly what I wanted and exceeded my expectations. ChatGPT was the beginning of the modern AI boom and I remember being immediately impressed.

2) When I was working at GitHub, the copilot team gave the engineering team early access to copilot in VS Code. I can distinctly remember seeing the chat window in the code editor for the first time. I was probably one of the first people ever to see it. I remember playing with it a bit and asking simple Python questions. I knew that day that StackOverflow was dead and my mind was blown.

3) Big oh shit moment earlier this year that I believe for me started with the Opus 4.6 model + Cursor. The results were noticeably better, hallucinated much less, and could solve complex problems with much less intervention. Early 2026 was a turning point for me as an engineer with AI. Throughout 2025, I was still writing the vast majority of my code by hand like I’ve always done; that is not the case in 2026.

Quitschquat yesterday at 11:36 PM

"I" code impressive shit with the LLM, but after the initial push to github, I find I hate myself and I'm deeply miserable with what it produced since it was not mine. My "ah-ha" moment has been that misery.

rinesh today at 5:28 AM

The most recent one for me has been Codex Computer-Use

wseqyrku yesterday at 9:18 PM

After Attention is All You Need I realized if you just really pay attention to what you're doing you can actually get it done.

anon373839 yesterday at 8:42 PM

Mine was when I used Stanford Alpaca, and realized that they had transformed Llama 7B into a credible facsimile of ChatGPT with just $600.

mjd today at 2:52 AM

It was something really silly: I asked Claude to help me think of a snide emoji for every U.S. President.

I hadn't been able to think of one for Zachary Taylor, because, you know, he's Zachary Taylor.

Claude proposed the cherries emoji, because it's said that Taylor the war hero died a ridiculous death from eating cherries and ice milk too greedily on a hot day. It was perfect, just what I had been looking for.

Claude gave me a couple of others, and we workshopped a few more. It was the workshopping that was most striking. I really felt like I was having a conversation with someone else.

https://blog.plover.com//tech/gpt/presidential-emoji.html

gwbas1c yesterday at 8:54 PM

When I don't know how to use a specific API, or how to do a task, I'll often give some high-level instructions to Copilot (Claude's model) in Visual Studio, and then review what it comes up with very, very closely. (Including looking up specs so I can confirm that it did it correctly.)

It's much, much faster and easier than starting from scratch.

nsikorr yesterday at 8:14 PM

Definitely the first NotebookLM podcast I generated.

Legend2440 yesterday at 9:40 PM

MidJourney v3. By today's standards the images were crude and smudgy, but you could tell that it actually understood what objects were and what words visually meant.

I've been working with computers for a long time, and this was the first time in a long time I'd seen software do something genuinely new.

hereme888 yesterday at 8:38 PM

Creating a functional python app with zero programming knowledge, back in the days of GPT 3.5.

That was enough to awaken my teenage hacker spirit.

sct202 yesterday at 7:51 PM

One of our SaaS providers launched an AI-agent-enabled version, and it can follow directions, do tasks, and manipulate data/settings in the software roughly on par with a below-average person. When I used it I had a sinking feeling: tons of teams and people will be redundant as these agents improve and roll out to other software.

eranation today at 1:57 AM

Realising in a recent benchmark that gpt-5-mini gives better results on some tasks than gpt-5.4-mini and even gpt-5 or gpt-5.5

sajithdilshan yesterday at 9:48 PM

For me it was last February or so when I started using Opus.

But today I watched a video from Andrej Karpathy on YouTube on how LLMs work and my illusions got completely shattered. Turns out they are a glorified autocomplete. All the engineering actually happens in the harness

arjie yesterday at 9:01 PM

2 years ago, wrote superfast float -> fixed point string code. That was cool.

Then a while ago, I plugged in everything at the datacenter and one device didn't come up. Plug into the management port, and Claude Code writes a C program to send a particularly crafted packet. Everything comes online.

Beautiful stuff.

dsr_ today at 4:39 AM

I asked Claude to explain how the lyrics of "Birdhouse in Your Soul" by They Might Be Giants should guide investment strategy. It promptly produced five paragraphs of bullshit that read just like a persuasive essay on the Net.

If you don't firmly hold in your mind "this is a bullshit generator", you can get in real trouble fast.

cheevly yesterday at 8:30 PM

Ever since the first Davinci model of GPT-3 I've literally been using LLMs daily. It was an indispensable tool for me from the very beginning and despite 10,000+ hours of usage and research, I still feel like I've barely scratched the surface of what's possible with current genai tech.

banannaise yesterday at 9:28 PM

Every time I review a new PR to my codebase, I go "oh shit, these unit tests are garbage, they've clearly been vibecoded" and tell the contributor to rewrite the unit tests so they do more than just game the coverage metrics.

atleastoptimal yesterday at 9:21 PM

It was interacting with GPT-4 and it produced an original sentence that existed nowhere I could find. I realized that being able to do that was the "nugget" of intelligence that all improvements since could be built on

novaleaf today at 12:02 AM

just yesterday I felt that claude code was being aggressive in its defense, so I led my response with "Spicy Take! Here's why I think the bug is happening...."

Because of sycophancy it took my "Spicy Take" and decided to say basically "Even more than it could, your bug is happening RIGHT NOW"... which was just made up lies for dramatic fit.

Back to talking to Claude like I'm a robot I guess.

erelong today at 1:21 AM

I was never dismissive, it always seemed pretty cool at each step

Maybe in 2024 I was amazed to see it one shot unique snippets of code

jszymborski yesterday at 9:25 PM

There was a viral Medium post about LLMs, and the reveal at the end was that the whole thing was a ChatGPT post. That was my first "wow" moment.

It was on hackernews... anyone know what I'm talking about?

virtualbluesky today at 4:44 AM

Why is it that nobody discusses uploading all the company's IP to service providers that built their service by 'creatively interpreting' IP ownership?

dyauspitr yesterday at 7:39 PM

I was trying to replace my koi pond pump last weekend and the model numbers on it had washed away. I took a picture of it and it immediately narrowed it down to two models but wasn’t sure if it was the 4500 model or the 2500 model. I asked it how I can determine which one it was. It then asked me to measure the length and that the 4500 was 11 inches and the 2500 was 9 inches. Mine was 11. It was cool it was able to reason that out and give me something actionable.

It’s kind of a trivial example but there are multiple instances of this per week with the wide variety of things I do around my property.

wps yesterday at 8:42 PM

Nvidia GauGAN and deep-daze amused me immensely at the age of 14 or so. I've had "a man painting a completely red image" saved for a long time.

It is insane how primitive modern inpainting and txt2image make these two projects look.

iLoveOncall yesterday at 9:47 PM

I'm still waiting for a positive "Oh shit" moment regarding LLMs.

I've had plenty of "Oh shit those people have really lost all ability to think for themselves" moments though.

flysonic10 yesterday at 10:59 PM

There were two:

1) When I was testing one of the early coding agents, I gave it admin keys to a fresh AWS account and it configured everything beyond just building a demo site. That was, "oh shit, tool-use is going to be the killer feature of GenAI."

2) When I was still skeptical of the system as just a more-or-less dumb statistical predictor of the next token/word, I read the argument that even if it is a statistical predictor, the fact that it can reason means the intelligence is necessarily baked into the statistical model somewhere. That was "oh shit, intelligence is actually modeled."

ieie3366 yesterday at 8:45 PM

I'm a terrible cook, but just by using Claude as a tutor I've managed to make 5 different recipes in a row and they all tasted fantastic, restaurant quality.

ramshanker today at 1:41 AM

I can count 2:

Dec 2025: We use commercial 3D modeling software to build refineries. There was no license dashboard in this ancient piece of junk. Fortunately, the license server provided a verbose live status report through the command line. I asked ChatGPT to ingest the logs into a Django web application and generate weekly/monthly/yearly usage dashboards, and it produced the whole backend + frontend in 4 to 5 shots. There were around 10 regexes just in the log parsing batch script. I was totally speechless. Encouraged by the success, I went ahead and made dashboards for 3 more software packages in the same Django app. Released to peers by evening, with feedback incorporated within 2 days to integrate Name, Employee Number, IP Address sync, etc. It’s been live for 5 months and is actively used by all co-admins; even management has it bookmarked to help with department redistribution. Making this thing without AI would have taken well over a month of “learning new stuff”, or paying external consultants too much. Even the head of IT replied that it was awesome. ;)

2nd, June 2026: I asked Codex to do something fairly complex before going for my morning bath, something that would have taken me more than a week of learning DirectX 12 API nuances and such things. 20 minutes later, I returned to the task completed exactly, with code changes in 5 different files. The build completed without any errors. OMG. Free quota over for the whole month! I subscribed by the evening.

sowbug yesterday at 11:01 PM

One concrete and one abstract.

Concrete: Last year I was DIYing a solar-power system for my home. I spent about an hour spitting out a Python tool that took (as inputs) drone photos and JSON and generated several proposed roof layouts for the panels and conduit. The tool helped me identify the exact railing attachment points and route around existing roof obstructions. Professionals already have these tools, and maybe they're available to DIYers, but you know what? It was faster to build my own than to do the product research on the web.

Abstract: This "oh shit" was more of a slow burn than a sudden realization. I see a lot of angst from developers who complain about their LLM agents. Agents write terrible code that barely works. They say things are done when they aren't. They misinterpret feature requests and ignore clear-cut project rules. They make assumptions that would have taken three seconds to research and invalidate. They suddenly quit because we're not paying them enough. And so on.

But you know what? All those complaints apply to humans, too! The industry has been dealing with these problems forever. Many of the same management techniques and software-development processes apply. This is why I discount a certain class of criticism about AI-generated code. If a fault of an LLM applies equally well to human engineers, and the person voicing the criticism hasn't managed a team, then I'd invite that person to wear a management hat for a while. Read some books/blogs, talk to an EM. Maybe this is a skill issue, which matters because we're all managers now.

The "oh shit" for me is that I have yet to hear a criticism that I can't map to one or more actual engineers I've worked with -- eventually successfully -- in my career. Which means that I'm still waiting for a new criticism, and eventually absence of evidence might be evidence of absence. LLMs fit too well into the giant machine of commercial software development for them to be a parlor trick.

autophagian yesterday at 9:52 PM

I think a couple years ago, I asked it to write me a nom parser for some system metrics I wanted to consume, and it one-shot it. Thought “oh”. And here we are.

ChiperSoft today at 1:29 AM

We had a company hackathon in the fall of 2023. One of the teams did a project where they pulled a bunch of expense data out of the DB, shoved it into a prompt, and asked ChatGPT to summarize the expenses and give recommendations. They then treated the output as if it were factual, without validating any of the results, and talked about turning it into a customer product.

That was my oh shit moment. As in "oh shit, they think this random text generator can reason and think."

That was pretty much the writing on the wall for me.

kami23 yesterday at 9:46 PM

Seeing subagents working in Claude last summer, I saw it and told myself my job is going to be different and I can automate the hell out of my workflow

utopcell today at 3:18 AM

Gold medal @ the 2025 International Math Olympiad.
