Hacker News

The short leash AI coding method for beating Fable

73 points by Riseed yesterday at 7:11 PM | 75 comments

Comments

sothatsit yesterday at 10:13 PM

This “short leash” seems like more of a crutch to me, and a sign of not giving the AI enough detail on the problem to begin with, or not reviewing and iterating on its output.

Hand-holding great models like Fable through implementation is a waste of time, and a waste of Fable. You can have increasingly nuanced discussions with stronger models, and they write a lot better code than they used to. The process of discussing designs and their implementations, questioning things that look weird to you, and actually reading the AI’s responses also helps to find better solutions.

For example, one time I wanted to write a greedy solver for a problem, and in my discussion with Opus on the idea it suggested using an existing MILP library to solve the problem exactly. I’d never even heard of MILP, but my final implementation ended up being better and simpler than what I’d have done alone.
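To illustrate why an exact method can beat a greedy one, here is a minimal sketch on a toy 0/1 knapsack. The comment does not say what the actual problem was, so the items, the capacity, and both function names are hypothetical. Brute-force enumeration stands in for a MILP solver here: it only works at toy sizes, whereas a real solver typically reaches the same optimum without checking every subset.

```python
from itertools import combinations

# Hypothetical toy data: (value, weight) pairs and a weight capacity.
ITEMS = [(60, 10), (100, 20), (120, 30)]
CAPACITY = 50


def greedy_value(items, capacity):
    """Take items in order of best value-per-weight ratio while they still fit."""
    total_value = 0
    remaining = capacity
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def exact_value(items, capacity):
    """Check every subset and keep the best feasible total (exact, but exponential)."""
    best = 0
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            if sum(weight for _, weight in subset) <= capacity:
                best = max(best, sum(value for value, _ in subset))
    return best


print(greedy_value(ITEMS, CAPACITY))  # 160: takes the two best ratios, then nothing else fits
print(exact_value(ITEMS, CAPACITY))   # 220: the two heavier items fill the capacity exactly
```

The greedy pass commits to the best-looking item first and can never undo that choice, which is where the gap comes from.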

nateburke today at 3:11 AM

Seems like a common-sense approach. I appreciate the emphasis on understanding: humans will eventually be held accountable, and blaming Claude for an outage is not going to get Claude fired.

ed_mercer yesterday at 10:41 PM

I feel like OP is still in the year 2025.

> The AI will have gone off the rails multiple times and you will only notice it later when you actually try to use the software.

Except that said AI can now use your software itself and find and fix bugs, not to mention drive new features.

>Your agent might go “off the rails” and start doing something you don’t want it to do

This happens but far less often than it used to, and the case for full autonomous agents is getting stronger, not weaker.

>It is humanly impossible to build your own understanding of a codebase

This again feels outdated. I think we're moving towards humans no longer needing to understand a codebase, and letting AI drive it.

heohk today at 3:09 AM

They can generate stuff outside their training by consuming and regurgitating documentation. Thunkign

jonplackett yesterday at 9:52 PM

I thought this was how everyone who can actually code uses AI for anything that’s actually important.

Am I wrong? Are you guys just YOLOing everything these days?

moezd yesterday at 10:10 PM

LLMs are still next-token predictors. Just because you can give them vaguer instructions and they still find the right steps to follow doesn't mean they're intelligent. It means you're speaking the same language as the harness they trained your model on.

And that has a limit. If you are stuck at PoC level or simple apps, you have no idea how limited the current models still are. There you really need to break tasks down, not just trust a token predictor to list steps that sound good. There has to be a human in the loop somewhere, because by the time you start skipping permissions, the best case is that you hit the jackpot. More likely, you get a suboptimal solution and wasted tokens. And what's genuinely still terrifying is when the model ignores instructions and does some stupid nonsense, ruining your day. It really is as sharp as a CNC machine. It's not not useful, but it could be dangerous, so maybe don't try to carve wood with a monster machine, or park your Ferrari in that cramped neighbourhood if you don't know how to parallel park.

afro88 yesterday at 10:36 PM

Maybe I'm too optimistic, but given appropriate skills and references (not just for writing but also reviewing) and intelligent use of subagents for isolated reviews and checks, you can lengthen the leash a bit.

But you still need to properly review plans and PRs to keep a good mental model of the codebase. This effectively limits the number of tasks being done in parallel to maybe 2-3. Though you'll be mentally exhausted and probably start to make mistakes or take shortcuts in reviews yourself.

sscaryterry yesterday at 7:24 PM

There really wasn't much substance to this article.

fny yesterday at 11:05 PM

AI is a junior to mid-level engineer. If you treat it as such, you get the best of both vibe coding and rigorous engineering without all this paranoia.

Since the very beginning I've run Claude from an isolated VM in yolo mode. This is just like giving an engineer their own laptop. Claude works on a feature up to a PR-worthy point. I review the diff, just like I would with another engineer, massage it into the right shape, and move on.

Inexperienced engineers make the same mistakes described. I've even seen rm -rf, albeit not from root! I would have lost my mind micromanaging someone with all permissions denied.

giancarlostoro yesterday at 11:51 PM

Here I thought this was about Fable the video game, then I remembered Anthropic's model got named Fable. It's going to be painful to google one of my favorite game series, just like googling "Rust server" does not give you Rust programming results, but Rust the video game results. I wish Google had fixed this problem long ago; it seems like something trivial for them to fix.

codyswann today at 1:46 AM

Nothing I haven’t read 1,000 times before.

rybosworld today at 2:19 AM

I'm convinced that even if/when ASI is achieved we will still have mediocre engineers writing blog posts about how they have uncovered the secrets to using these tools "effectively".

WhitneyLand yesterday at 10:23 PM

This post seems like some decent advice mixed in with a lot of overconfidence and unverifiable claims.

“expert developers whose skills have reached the point where they outclass any and all “frontier AI models” in their area of expertise”

Are any developers saying they outclass any and all frontier models? I’d say at best it’s mixed at this point. The best developers still do certain things better, but not even close to all things.

“The problem is that even code written and/or reviewed by Fable 5, will stink”

I’m skeptical. Example prompt and output please.

steezeburger yesterday at 10:12 PM

I find it hard to stay engaged doing this. I do get good results, but it's just hard to not get distracted when it's doing the work.

bonsai_spool yesterday at 9:35 PM

I'm curious whether Opus 4.8 or similar can attain Mythos level through good system prompting and steering. You would expect this to work if it's true that the strength of Mythos is its unwillingness to quit before it gets a desired outcome.

codemog today at 2:39 AM

Why not just write the interfaces yourself and let the AI do the implementation at that point?

YuechenLi yesterday at 11:38 PM

I mean, the key is to stop trying to one-shot everything. The main problem I found with LLM code is that they always try to take the shortest possible path to the solution, so a lot of the time Codex would write code that meets the requirements of the prompt but misses something that causes it to not work in the non-ideal scenario.

The solution for that is pretty easy too, it's just iteration: you describe the exact problem you have with the code and why it is not running correctly and ask them to provide a narrow fix that addresses the bug. It's not that complicated.

CamperBob2 today at 1:37 AM

FTA: Contrary to marketing statements made by certain CEOs, these models are not able to think beyond their training data.

The sheer cognitive dissonance needed to say something like that at a time when AI is delivering novel math proofs is... well, not actually impressive. Mostly, it's just sad.

Some part of him must know such a statement is not true, or more properly, that it's meaningless. But he says it anyway, because he thinks it makes an impression of insight and erudition on the listener.

If you think what it does is brilliant, you're not ready (to use AI.)

At some point in one's journey to engineering enlightenment, one recognizes how rarely "brilliance" is actually called for, and indeed how counterproductive such self-judged "brilliance" often turns out to be in the long run.

Clearly the author is still striving to reach this particular stage.

kissgyorgy yesterday at 9:56 PM

This is probably slower than writing the code yourself. Doesn't make sense to me. Using an agent without YOLO mode is not worth it.

What I'd rather do is tightly control the output with skills you've written yourself, prompts, plans, etc., and get as close as possible to the outcome you would write yourself.

8note yesterday at 11:01 PM

... fable on the restart seems to be more like opus and very turn limited?

if you want to beat it, give it more turns before it has to "wrap up a session"

hungryhobbit yesterday at 10:02 PM

I <3 how everyone and their brother feels qualified to write advice to hundreds? thousands? of other developers about AI ... based on a couple months of experience as a personal user.

I mean, it's like writing a book about how to use React or Django or some other major software ... after you used it for one project for a month!

Authors: I know this is the Internet, and I know bloggers blog about whatever pops into their head ... but if you are going to act like an authority, how about you learn more than the average reader before you start telling them authoritatively what to do?

roshandxt yesterday at 11:10 PM

[flagged]

cws_ai_buddy yesterday at 9:56 PM

[flagged]

avereveard yesterday at 9:52 PM

Seems hella inefficient.

A better method starts with realizing that everything every program does is data transformation and/or movement.

Then you ask the LLM to subdivide the data into a tree along the domain model, classifying nodes as streaming or storing.

Then, for each node, you discuss the best data structure with the AI.

Then you ask for an interface that fully encapsulates the structure, where every mutation can only go from a valid state to a valid state and nothing else is allowed to touch the state.

And that's mostly it: just connect all the interfaces until the input goes to a monitor, to storage, to an API, or wherever the destination is.
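As a minimal sketch of what an interface that only permits valid-to-valid transitions might look like: the `Account` class, its methods, and the "balance never goes negative" invariant are hypothetical and chosen only for illustration. Python does not enforce privacy, so the leading underscore on `_balance` is a convention rather than a guarantee.

```python
class Account:
    """Hypothetical node whose invariant is: the balance is never negative."""

    def __init__(self, opening_balance: int = 0) -> None:
        if opening_balance < 0:
            raise ValueError("opening balance must be non-negative")
        self._balance = opening_balance  # private by convention only

    @property
    def balance(self) -> int:
        """Read-only view of the state; there is deliberately no setter."""
        return self._balance

    def deposit(self, amount: int) -> None:
        """Valid state -> valid state: adding a positive amount keeps balance >= 0."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def withdraw(self, amount: int) -> None:
        """Reject any withdrawal that would break the invariant; state is untouched on failure."""
        if amount <= 0:
            raise ValueError("withdrawal must be positive")
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount


account = Account(100)
account.withdraw(30)
print(account.balance)  # 70

try:
    account.withdraw(100)  # would make the balance negative
except ValueError as error:
    print(f"rejected: {error}")
print(account.balance)  # still 70
```

Because every check runs before the state changes, a rejected call leaves the object exactly as it was, so callers never observe an invalid intermediate state.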
