Hacker News

danfritz last Wednesday at 2:18 PM (12 replies)

Every time I see a post like this on HN I try again, and every time I come to the same conclusion. I have never seen an agent pull off something that I could instantly ship. It still ends up being very junior code.

I just tried again and asked Opus to add custom video controls around ReactPlayer. I started in Plan mode, and the plan looked good overall (it used our styling libs, existing components, icons and so on).

I let it execute the plan and, behold, I have controls on the video. So far so good. I then looked at the code and saw multiple issues: overuse of useEffect for trivial things, values stored in useState that should be computed at run time, the time / duration of the video displayed incorrectly, and so on...
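To make the "computed at run time" point concrete, here is a minimal sketch in plain TypeScript. It assumes the player reports played seconds and total duration; the `PlaybackSnapshot` shape and both function names are hypothetical and are not ReactPlayer's API. The idea is that the progress value and the time label are derived from those two numbers on every render, so neither needs its own useState or useEffect.

```typescript
// Hypothetical shape of what the player reports. The field names are
// assumptions for illustration, not ReactPlayer's actual API.
interface PlaybackSnapshot {
  playedSeconds: number;
  durationSeconds: number;
}

// Left-pad a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format seconds as m:ss, or h:mm:ss once the value reaches an hour.
// Invalid input (NaN, Infinity, negatives) falls back to "0:00".
function formatTime(totalSeconds: number): string {
  if (!isFinite(totalSeconds) || totalSeconds < 0) return "0:00";
  const whole = Math.floor(totalSeconds);
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  if (hours > 0) {
    return hours + ":" + pad2(minutes) + ":" + pad2(seconds);
  }
  return minutes + ":" + pad2(seconds);
}

// Everything the controls display is derived from the snapshot.
// Nothing here is stored, so nothing can drift out of sync.
function deriveControlsView(snapshot: PlaybackSnapshot): {
  progressPercent: number;
  label: string;
} {
  const played = snapshot.playedSeconds;
  const duration = snapshot.durationSeconds;
  const progress = duration > 0 ? Math.min(played / duration, 1) : 0;
  return {
    progressPercent: Math.round(progress * 100),
    label: formatTime(played) + " / " + formatTime(duration),
  };
}
```

In a component, something like `deriveControlsView` would be called directly in the render body with whatever the player last reported.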

I asked follow-up questions like "Hide the controls after 2 seconds", and it started introducing more useEffects and state, most of which are not needed (granted, you need one).
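For the auto-hide case, the logic that legitimately needs a timer can be small. The sketch below is one possible shape, not what the agent produced: `createAutoHide` and the `Timers` interface are hypothetical names, and the timer functions are injected so the logic can run without a browser. Each activity call shows the controls and restarts a single timer; when it fires, the controls hide.

```typescript
// Minimal timer interface so the logic can be exercised without a browser.
// This interface is an assumption made for the sketch.
interface Timers {
  set(callback: () => void, delayMs: number): number;
  clear(id: number): void;
}

// Hypothetical helper: shows the controls on activity and hides them
// after delayMs without further activity.
function createAutoHide(
  delayMs: number,
  onVisibleChange: (visible: boolean) => void,
  timers: Timers
) {
  let timerId: number | null = null;
  let visible = false;

  // Only notify when the visibility actually changes.
  function setVisible(next: boolean): void {
    if (visible !== next) {
      visible = next;
      onVisibleChange(next);
    }
  }

  return {
    // Call on mouse move, key press, touch and similar events.
    activity(): void {
      setVisible(true);
      if (timerId !== null) timers.clear(timerId);
      timerId = timers.set(() => {
        timerId = null;
        setVisible(false);
      }, delayMs);
    },
    // Call on cleanup so no timer fires after the component is gone.
    dispose(): void {
      if (timerId !== null) timers.clear(timerId);
      timerId = null;
    },
    isVisible(): boolean {
      return visible;
    },
  };
}
```

In a React component this would plausibly sit behind the one effect that is actually needed: create it with timers backed by `window.setTimeout` and `window.clearTimeout`, forward pointer events to `activity()`, and call `dispose()` in the effect cleanup.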

Cherry on the cake: I asked it to place the slider at the bottom and the other controls above it, and it placed the slider on top...

So I suck at prompting and will start looking for a gardening job I guess...


Replies

thewillowcat last Wednesday at 3:12 PM

These posts are never, never made by someone who is responsible for shipping production code in a large, heavily used application. It's always someone at a director+ level who stopped production coding years ago, if they ever did, and is tired of their engineers trying to explain why something will take more than an hour.

weitendorf last Wednesday at 3:15 PM

Back in the day when you found a solution to your problem on Stack Overflow, you typically had to make some minor changes and perhaps engage in some critical thinking to integrate it into your code base. It was still worth looking for those answers, though, because it was much easier to complete the fix starting from something 90% working than from 0%.

The first few times in your career you found answers that solved your problem but needed non-trivial changes to apply to your code, you might remember that it was a real struggle to complete the fix even starting from 90%. Maybe you thought that, ultimately, that Stack Overflow fix really was more trouble than it was worth. And then the next few times you went looking for answers on Stack Overflow you were better at determining which answers were relevant to your problem and worth using, and better at going from 90% to 100% by applying them.

Still, nobody really uses Stack Overflow anymore: https://blog.pragmaticengineer.com/stack-overflow-is-almost-...

You and most of the rest of us are all actively learning how to use its replacement.

ammut last Wednesday at 2:47 PM

I've spent quite a bit of time with Codex recently and come to the conclusion that you can't simply say "Let's add custom video controls around ReactPlayer." You need to follow up with a set of strict requirements that set expectations and guard rails and spell out what the final product should do (and not do). Even then it may have a few issues, but continuing to prompt with clearly stated problems, either requirements it didn't meet or ones you forgot to include, usually clears it up.

Code that would have taken me a week to write is done in about 10 minutes. It's likely on average better than what I could personally write as a novice-mid level programmer.

enraged_camel last Wednesday at 2:55 PM

I find anecdotes like yours bewildering, because I've been using Opus with Vue.js and it crushes everything I throw at it. The number of corrections I need to make tends to be minimal, and they are mostly cosmetic.

The tasks I give it are not trivial either. Just yesterday I had it create a full-blown WYSIWYG editor for authoring the content we serve through our app. This is something that would have taken me two weeks, give or take. Opus looked at the content definitions on the server, queried the database for examples, then started writing code and finished it in ~15 minutes, and after another 15-20 minutes of further prompting for refinement, it was ready to ship.

gejose last Wednesday at 2:54 PM

I used to run into this quite a bit until I added an explicit instruction in CLAUDE.md to the effect of:

> Be thoughtful when using `useEffect`. Read docs at https://react.dev/learn/you-might-not-need-an-effect to understand if you really need an effect

phn last Wednesday at 5:55 PM

Have you tried Roo Code in "Orchestrator" mode? I find it generally "chews" the tasks I give it and then spoon-feeds them as sub-tasks to "Code" (or other) modes, leaving less room to stray from very focused "bite-sized" changes.

I do need to steer it sometimes, but since it doesn't change a lot at a time, I can usually guide the agent and stop the disaster before it spreads.

A big caveat is that I haven't tried heavy front-end stuff with it, more Django stuff, and I'm pretty happy with the output.

andai last Wednesday at 5:40 PM

I have a vanilla JS project. I find that very small LLMs are able to work on it with no issue (including complete rewrites). But I asked even large LLMs to port it to React and they all consistently failed: basic functionality broken, rapid memory leaks.

So I just stuck with vanilla JS.

n = 1 but React might not be a great thing to test this stuff with. For the man and the machine! I tried and failed to learn React properly like 8 times but I've shipped multiple full stack things in like 5 other languages no problem.

doubleorseven last Wednesday at 3:08 PM

Usually for me, a good plan gets me 90% solid working code. The problems arise when you ask it to change the colors it chose, like light grey text over a white background. This thing still can't see, and that's a huge drawback for those who got used to just prompting away their problems.
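Since the agent can't look at the result, one possible workaround is to give it a number it can check instead. The sketch below computes the WCAG 2.x contrast ratio between two colors; the function names and the example colors are hypothetical, and only six-digit `#rrggbb` input is handled. WCAG AA asks for a ratio of at least 4.5:1 for normal-size text, which light grey on white falls well short of.

```typescript
// Parse "#rrggbb" into [r, g, b] channel values in the range 0..255.
function parseHex(hex: string): [number, number, number] {
  const m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) throw new Error("expected a #rrggbb color, got " + hex);
  return [parseInt(m[1], 16), parseInt(m[2], 16), parseInt(m[3], 16)];
}

// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x formula).
function linearize(channel: number): number {
  const s = channel / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

// WCAG 2.x relative luminance: 0 for black, 1 for white.
function relativeLuminance(hex: string): number {
  const rgb = parseHex(hex);
  return (
    0.2126 * linearize(rgb[0]) +
    0.7152 * linearize(rgb[1]) +
    0.0722 * linearize(rgb[2])
  );
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
// The order of the arguments does not matter.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA threshold for normal-size text.
function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

A check like `passesAA("#cccccc", "#ffffff")` returning false is something an agent can act on in a test run, even though it never sees the page.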

mnky9800n last Wednesday at 2:59 PM

I always assume the person either hasn't used coding agents in a while or it's their first time. Don't get me wrong, I love Claude Code, but my students are still better at getting stuff done that I can just approve and not micromanage. That's what I think everyone is missing from their commentary. You have to micromanage a coding agent. You don't have to micromanage a good student. When you don't need to micromanage anymore at all, that's when the floor falls out and everyone has a team of agents doing whatever they want to make them all billionaires or whatever it is AI is promising to do these days.

hu3 last Wednesday at 4:48 PM

Yep. It sucks. People are delusional. Let's ignore LLMs and carry on...

On a more serious note:

1) Split tasks into smaller tasks just like a human would do

Would you bash your keyboard for an hour, adding all video controls at once before even testing if anything works at all? Ofc not. You would start by adding a slider and test it until you are satisfied. Then move to the next video control. And so on. LLMs are the same. Sometimes they can one-shot many related changes in a single prompt, but the common reality is what you experienced: it works sometimes but the code is suboptimal.

2) Document desirable and undesirable coding patterns in AGENTS.md (or CLAUDE.md)

If you found overuse of useEffect, document it in AGENTS.md so next time the LLM knows your preference.

I have been using LLMs since Sonnet 3.5 for large enterprise projects (1M+ lines of code, 1k+ database tables). I just don't ask it to "draw the rest of the owl", as the saying goes.

jf22 last Wednesday at 2:19 PM

So? Getting a month's worth of junior-level code in an hour is still unbelievable.

animanoir last Wednesday at 2:49 PM

[dead]