Hacker News

beshrkayali · yesterday at 4:52 PM

> long contexts are still expensive and can also introduce additional noise (if there is a lot of irrelevant info)

I think spec-driven generation is the antithesis of chat-style coding for this reason. With tools like Claude Code, you are the one tracking what was already built, what interfaces exist, and why something was generated a certain way.

I built Ossature[1] around the opposite model. You write specs describing behavior, it audits them for gaps and contradictions before any code is written, then it produces a build-plan TOML where each task declares exactly which spec sections and upstream files it needs. The LLM never sees more than that, and there is no accumulated conversation history to drift from. Every prompt and response is saved to disk, so traceability is built in rather than something you reconstruct by scrolling back through a chat. I used it over the last couple of days to build a CHIP-8 emulator entirely from specs[2]. I have some more example projects on GitHub[3].

1: https://github.com/ossature/ossature

2: https://github.com/beshrkayali/chomp8

3: https://github.com/ossature/ossature-examples
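To make the build-plan idea concrete, a task entry might look something like this. This is a hypothetical sketch of the concept, not Ossature's actual schema; every field name here is made up for illustration:

```toml
# Hypothetical build-plan entry (illustrative field names, not
# Ossature's real format): each task names only the spec sections
# and upstream files the LLM is allowed to see.
[[task]]
id = "display"
spec_sections = ["graphics.md#framebuffer", "graphics.md#sprites"]
upstream_files = ["src/memory.zig"]
output = "src/display.zig"
```

The point is that the prompt for each task is assembled from exactly these declared inputs, so context stays small and reproducible.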


Replies

gburgett · today at 12:07 AM

Totally agreed! I've had good success using Claude Code with Cucumber, where I start with the spec and have Claude iterate on the code. How does Ossature compare to that approach?

hansonkd · yesterday at 10:27 PM

I've been thinking a lot about this lately. It seems like what is missing with most coding agents is a central source of truth. Before, the truth of what the company was building was distributed across people: everyone had context about what they had done and what others were doing, and alignment came from that.

Now the coding agent starts fresh each time, and it's up to you to remember what you asked it and provide the feedback loop.

Instead of chat -> code, I think chat -> spec and then spec -> code is much more the future.

The spec -> code phase should be independent of any human. If the spec is unclear, the agent should ask the human to clarify the spec, then use the spec to generate the code.

What happens today is that something is unclear, there is a loop where the agent starts to uncover some broader understanding, but then it is lost by the next chat. And the human also doesn't learn why their request was unclear. "Memories" and agent files are all duct tape over this problem.
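The two-phase flow described above can be sketched in a few lines. This is a toy illustration, not any real agent framework: `llm_audit_spec` and `llm_generate_code` are hypothetical stand-ins, stubbed here so the sketch runs.

```python
def llm_audit_spec(spec: str) -> list[str]:
    # Stand-in auditor: keeps asking until the spec covers error handling.
    return [] if "errors:" in spec else ["How should errors be handled?"]

def llm_generate_code(spec: str) -> str:
    # Stand-in generator: the real one would see only this spec.
    return f"# generated from spec ({len(spec)} chars)"

def build_from_spec(spec: str, ask_human) -> str:
    while questions := llm_audit_spec(spec):
        # Unclear? Clarify the spec itself, not a chat transcript,
        # so the resolution survives into every future generation.
        for q in questions:
            spec += "\n" + ask_human(q)
    # Generation starts fresh from the spec alone: no accumulated
    # conversation history to drift from.
    return llm_generate_code(spec)

code = build_from_spec("add two numbers", lambda q: "errors: raise ValueError")
```

The key property is that the human's answers land in the spec, so the next run needs no memory of the chat that produced them.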

comboy · yesterday at 6:07 PM

Hey, you seem to have a similar view on this. I know ideas are cheap, but hear me out:

You talk with agent A, and it only modifies the spec. You still chat and can say "make it prettier", but that agent only ever edits the spec. The spec could also separate "explicit" from "inferred".

And of course agent B, which builds, only sees the spec.

Users can actually care about diffs generated by agent A again, because nobody wants to verify diffs on agent-generated code full of repetition and created by search and replace. I believe if somebody implements this right, it will be the way things are done.

And of course, with better models the spec can be used to actually meaningfully improve the product.

Long story short: what the industry currently misses, and what you seem to understand, is that intent is sacred. It should always be stored, preferably verbatim, and always with relevant context ("yes exactly" is obviously not enough). The current generation of LLMs can already handle all that. It would mean roughly 2-3x the cost, but it seems so much worth it (and in the long run the cost could likely drop below 1x, given typical workflows and repetition).
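The explicit/inferred split could be as simple as two labeled sections in the spec that agent A maintains. A minimal sketch, with made-up names, of what that data structure might look like:

```python
from dataclasses import dataclass, field

@dataclass
class Spec:
    # Illustrative structure only: the user's words are stored verbatim,
    # and agent A's elaborations are kept alongside, never in place of them.
    explicit: list[str] = field(default_factory=list)
    inferred: list[str] = field(default_factory=list)

    def record(self, user_request: str, elaboration: str) -> None:
        # Intent is sacred: store the request verbatim, with the
        # interpretation next to it so either can be reviewed later.
        self.explicit.append(user_request)
        self.inferred.append(elaboration)

    def render(self) -> str:
        # Agent B only ever sees this rendered spec, never the chat.
        return ("## Explicit\n" + "\n".join(self.explicit)
                + "\n## Inferred\n" + "\n".join(self.inferred))
```

Diffs on `render()` output are what the user reviews: small, declarative, and free of generated-code noise.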

Yokohiii · yesterday at 5:23 PM

I like it a lot. I find the chat-driven workflow very tiring, and a lot of information gets lost in translation until LLMs just refuse to be useful.

How does human intervention work out? Do you use a mix of spec and audit editing to get into the ready-to-generate state? How high is the success/error rate when you generate from tasks to code: do LLMs forget or mess things up, or does it feel better?

The spec-driven approach is potentially better for writing things from scratch; do you have any plans for existing code?

4b11b4 · today at 12:04 AM

Nice, but it can't be only text-based.

peterm4 · yesterday at 5:40 PM

This looks great, and I’ve bookmarked to give it a go.

Any reason you’ve opted for custom markdown formats with the @ syntax rather than using something like frontmatter?

Very conscious that this would prevent any markdown rendering on GitHub etc.

straydusk · yesterday at 11:42 PM

This is basically what Augment Intent is.

dboreham · yesterday at 6:54 PM

Waterfall!
