Wasn't it now (end of 2025) that Dario Amodei said Claude (or LLMs in general) would be doing almost all programming work?
This article is my typical experience with LLM coding. Endless correction and handholding, and manual cleanup of subtle mistakes. With no long-term learning from them.
Kinda makes me livid, the amount of false hype coming out of the mouths of the stewards of these investor-subsidized LLM companies.
But they're amazing Google replacements, and learning tools. And once in a blue moon they ace a coding assignment and delight me.
Edit: 90% of coding work by June to September 2025: https://www.businessinsider.com/anthropic-ceo-ai-90-percent-...
We've lost the capability to build such marvels.
https://knowyourmeme.com/memes/my-father-in-law-is-a-builder...
I would put Claude into a loop and let it make screenshots itself, diffing them against the original screenshot, until it has found the right arrangement at the planets’ starting position (pixel perfect match).
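Roughly what I have in mind, as a minimal sketch rather than a working harness. Images are plain nested lists of RGB tuples so the snippet stays stdlib-only; `render` is a hypothetical callback that, in a real setup, would ask the model for a revised page, load it in a browser, and return the new screenshot. The names `mismatch_ratio` and `iterate_until_match` are made up for illustration.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def mismatch_ratio(a: Image, b: Image) -> float:
    """Return the fraction of pixels that differ between two same-sized images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total


def iterate_until_match(
    target: Image,
    render: Callable[[int], Image],  # hypothetical: produces the next attempt's screenshot
    max_rounds: int = 20,
) -> Tuple[int, float]:
    """Call render(round_number) until its output matches target exactly.

    Returns (rounds_used, final_mismatch_ratio).
    """
    ratio = 1.0
    for round_number in range(1, max_rounds + 1):
        ratio = mismatch_ratio(target, render(round_number))
        if ratio == 0.0:  # pixel perfect match
            return round_number, ratio
    return max_rounds, ratio
```

Feeding the ratio back to the model after each round is the part that matters; whether that converges is exactly what the article puts in doubt.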
Them: AI will take jobs.
The AI: https://chatgpt.com/share/6923df03-7304-8010-bd08-cd335f0ee9...
> there's no other way to do it besides getting Claude to recreate it from a screenshot
And
> I'm an engineering manager
I can't tell if this is an intentional or unintentional satire of the current state of AI mandates from management.
new LLM benchmark just dropped. 'draw an svg of a pelican riding a bicycle browsing spacejam 1996 on 640x480 ie6'.
A comparison with Codex would be good. I haven't done it with Codex, but when working through problems using ChatGPT, it does a great job when given screenshots.
Why not just feed it the actual instructions that create the site - the page source code, the HTML, CSS, JS if any?
I wouldn't call it entirely defeated, it got maybe 90% of the way there. Before LLMs you couldn't get 50% of the way there in an automated way.
> What he produces
I feel like personifying LLMs more than they currently are is a mistake people make (though humans always do this), they're not entities, they don't know anything. If you treat them too human you might eventually fool yourself a little too much.
What if you gave it an image comparison tool that would xor two screenshots to check its work?
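Something like this, as a rough stdlib-only sketch. It assumes both screenshots are already decoded into same-sized grids of RGB tuples (decoding would need an image library, which is not shown), and the function names `xor_images` and `diff_bbox` are invented here.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def xor_images(a: Image, b: Image) -> Image:
    """Channel-wise XOR of two images; identical pixels come out as (0, 0, 0)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    return [
        [tuple(ca ^ cb for ca, cb in zip(pa, pb)) for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def diff_bbox(xored: Image) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (left, top, right, bottom) of all non-zero pixels.

    Right and bottom are exclusive. Returns None when the images matched.
    """
    xs = [x for row in xored for x, pixel in enumerate(row) if any(pixel)]
    ys = [y for y, row in enumerate(xored) if any(any(pixel) for pixel in row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1
```

The bounding box gives the model a concrete region to look at ("everything inside this rectangle is wrong") instead of a vague "it doesn't match".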
> The total payload is under 200KB.
Just out of curiosity, how big was what you considered Claude's best attempt to be?
I have a very weird tangential nit to pick: gendering LLMs. I swear I'm not pushing any sort of gender agenda/discussion that can be had anytime anywhere else in the current age, but to me there is something quintessentially a-gendered about the output of a computer program.
Calling Claude (or GPT-5 or Gemini or my bash terminal for that matter) a "he" seems absurd to the point of hilarity.
In my mind, they've always firmly been "it"s.
My web-dev friend saw the original Space Jam site. I asked him what it would cost to build something like that today. He paused and said:
We can’t. We don’t know how to do it.
Would be interesting to see whether Gemini could crack this problem.
In actual workflows someone would accept a very close reproduction and fix the small issues. Generally I use systems to get close enough to a scaffolding and/or make small incremental improvements and direct its design.
Skill issue
Look at that stupid dog. It's reading a book, but it's really trashy YA. It's not even Shakespeare. Dogs are stupid.
Apropos given Warner Brothers Discovery just sold to Netflix
I'm curious. Did you ask it to use tables and no CSS?
In 1996, we had only CSS1. Ask it to use tables to do this, perhaps.
Honestly, if you had shown this article to me even eighteen months ago, I would have been blown away at how good of a job Claude did.
It's remarkable how high our expectations have been steadily creeping.
This basically boils down to AI being unable to "center a div". I see this very often; AI-generated slop has LOTS of "off by one" kinds of bugs.
maybe ask it to use 1990s table based layout approaches?
Use Claude for Python. That's it. Don't push it for the frontend, it won't do well.
Why not just host a copy from waybackmachine?
Tell Claude to put the screenshot as a centered image with the body having the starry background on repeat. Then define the links as boxes over each icon with an old little tech trick called an image map.
Common at the time before flash took over.
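For anyone who never used one, here is a small sketch that generates that kind of page. It is written as a Python helper only so the output can be checked; the file names, the map name `nav`, and every coordinate are placeholders, not values taken from the real site.

```python
from html import escape
from typing import List, Tuple

# (href, left, top, right, bottom) in screenshot pixels; all values are hypothetical.
Region = Tuple[str, int, int, int, int]


def build_image_map_page(image_src: str, background_src: str, regions: List[Region]) -> str:
    """Return a 1990s-style page: tiled background, centered image, clickable rects."""
    areas = "\n".join(
        f'  <area shape="rect" coords="{left},{top},{right},{bottom}" '
        f'href="{escape(href)}" alt="">'
        for href, left, top, right, bottom in regions
    )
    return (
        "<html>\n"
        f'<body background="{escape(background_src)}">\n'
        f'<center><img src="{escape(image_src)}" usemap="#nav" border="0" alt=""></center>\n'
        '<map name="nav">\n'
        f"{areas}\n"
        "</map>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(build_image_map_page(
        "screenshot.png",  # placeholder file name
        "stars.gif",       # placeholder file name
        [("planet-a.html", 10, 20, 110, 90), ("planet-b.html", 200, 40, 300, 120)],
    ))
```

Each `<area>` is just a rectangle in image coordinates, so once the icon positions are measured from the screenshot there is no layout left to get wrong.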
I wrote a 20,000 line multiplayer battle-arena game in XNA back in 2015 with manually coded physics (so everything is there in the code) and have tried several times with Claude, Gemini, Grok, DeepSeek, and GPT to translate it to JavaScript.
They all fail massively 100% of the time. Even if I break it down into chunks once they get to the chunks that matter the most (i.e. physics, collision detection and resolution, event handling and game logic) they all break down horribly and no amount of prompting back and forth will fix it.
Hmm you note that the problem is the LLM doesn’t have enough image context, but then zoom the image more?
Why not downscale the image and feed it as a second input so that entire planets fit into a patch, and instruct it to use the downsampled image for coarse coordinate estimation?
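A minimal sketch of what I mean, assuming the screenshot is already a grid of RGB tuples and using a plain box filter (any resampling method would do). `downsample` and `to_full_resolution` are names invented for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def downsample(image: Image, factor: int) -> Image:
    """Average each factor x factor block into one pixel.

    Trailing rows and columns that do not fill a whole block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height = len(image) // factor
    width = len(image[0]) // factor if image else 0
    out: Image = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [
                image[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(tuple(sum(p[c] for p in block) // len(block) for c in range(3)))
        out.append(row)
    return out


def to_full_resolution(x: int, y: int, factor: int) -> Tuple[int, int]:
    """Map a coarse pixel back to the center of its block in the original image."""
    return x * factor + factor // 2, y * factor + factor // 2
```

The coarse estimate only has to land within one block of the right answer; the full-resolution image can then be used to refine the position inside that block.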
I keep wondering ... is this a good benchmark? What is a practical use-case for the skills Claude is supposed to present here? And if the author needs that particular website re-created with pixel-perfect accuracy, wouldn't it be simpler to just do it yourself?
Sure, you can argue this is some sort of modern ACID-Test - but the ACID tests checked for real-world use-cases. This feels more like 'I have this one very specific request, the machine doesn't perfectly fulfill it, so the machine is at fault.'. Complaining from a high pedestal.
I'm more surprised at how close Claude got in its reimagined SpaceJam-site.
I personally don't understand why asking these things to do things we know they can't do is supposed to be productive. Maybe for getting around restrictions or fuzzing... I don't see it as an effective benchmark unless it can link directly to the ways the models are being improved, but, to look at random results that sometimes are valid and think more iterations of randomness will eventually give way to control is a maddening perspective to me, but perhaps I need better language to describe this.
"there is no other way to preserve it"
Bullshit. Right click -> view source
Or just press ctrl+s and the browser will also gather all the assets into a folder for you.
The arrogance of thinking that the only way you know how is the only way....
You literally forgot the save feature all browsers have just because you set out to "solve" this using "ai"
Why do I feel like the old man yelling at clouds that programmers refuse to use their brains anymore?
this is just AI brainrot disease
Help, I can't recreate a website with AI! There's no other way, no way I could fix up some HTML code! Believe me, I'm an engineering manager with a computer science degree!
Absolutely disgusting.
Somehow I suspect Claude Code (in an interactive session with trial, error, probing, critiquing, perusing, and all the other benefits you get) would do better. This example seems to assume Claude can do things in "one shot" (even the later attempts all seem to conceal information like it's a homework assignment).
That's not how to successfully use LLMs for coding in my experience. It is however perhaps a good demonstration of Claude's poor spatial reasoning skills. Another good demonstration of this is twitch.tv/ClaudePlaysPokemon, where Claude has been failing to beat Pokémon for months now.
Why involve an LLM in this? Just download the site?
You last-minute cancelled coffee with your friends to work on this? I'm not sure how I would feel if a friend did that to me.
> Note: please help, because I'd like to preserve this website forever and there's no other way to do it besides getting Claude to recreate it from a screenshot.
Why not use wget to mirror the website? Unless you're being sarcastic.
$ wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org
Source: https://superuser.com/questions/970323/using-wget-to-copy-we...
We don't know how to build it anymore