On first principles it would seem that the "harness" is a myth. Surely a model like Opus 4.6/Codex 5.3, which can reason about complex functions and data flows across many files, isn't going to trip up over the top-level function signatures it needs to call?
I see a lot of evidence to the contrary though. Anyone know what the underlying issue here is?
If you agree that current LLMs (Transformers) are naturally very sensitive to their context/prompt, then try asking an agent for a "raw harness dump" ("because I need to understand how to better present my skills and tools in the harness"), and you will likely see for yourself how the harness impacts model behavior.
Humans have a demonstrated ability to program computers by flipping switches on the front panel.
Like a good programming language, a good harness offers a better affordance for getting stuff done.
Even if we put correctness aside, tooling that saves time and tokens is going to be very valuable.
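To make the time/token point concrete, here is a minimal sketch (my own illustration, not how any particular agent actually does it) of a harness-side tool that hands the model a compact outline of a Python file instead of its full source:

    import ast

    def outline_python_file(path: str) -> str:
        """Return a compact outline (classes and function signatures) of a
        Python file, so the model can decide what to read in full instead of
        paying tokens for the whole source up front."""
        with open(path, "r", encoding="utf-8") as f:
            tree = ast.parse(f.read())

        lines = []
        for node in ast.walk(tree):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                args = ", ".join(a.arg for a in node.args.args)
                lines.append(f"def {node.name}({args})  # line {node.lineno}")
            elif isinstance(node, ast.ClassDef):
                lines.append(f"class {node.name}  # line {node.lineno}")
        return "\n".join(lines)

    if __name__ == "__main__":
        # "some_module.py" is a placeholder path for the sketch.
        print(outline_python_file("some_module.py"))

Outlining a few hundred files this way costs a fraction of what dumping their contents would, and the model can still ask to read anything in full.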
Isn't 'the harness' essentially just prompting?
It's completely understandable that prompting in better/more efficient ways would produce different results.
The models' generalized "understanding" and "reasoning" are the real myth, which is what makes us take a step back and offload the process to deterministic computing and harnesses.
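For what it's worth, a harness is usually more than prompt text: it's the tool definitions plus a deterministic loop that executes tool calls and feeds results back into the context. Here is a rough sketch of that loop; the call_model stub and the tool names are placeholders of mine, not any real SDK:

    import json
    import os

    # Hypothetical stand-in for the real model API; an actual harness would call
    # an LLM endpoint here and get back either prose or a structured tool-call request.
    def call_model(messages: list[dict]) -> dict:
        return {"type": "text", "content": "(model reply would go here)"}

    # Deterministic tools the harness exposes. The model never executes these
    # itself; it only asks for them by name, and the loop below runs them.
    TOOLS = {
        "list_dir": lambda args: "\n".join(os.listdir(args.get("path", "."))),
        "read_file": lambda args: open(args["path"], encoding="utf-8").read(),
    }

    def agent_loop(task: str, max_steps: int = 10) -> str:
        # The system prompt and tool descriptions are the "prompting" half of the harness...
        messages = [
            {"role": "system",
             "content": "You are a coding agent. Available tools: " + ", ".join(TOOLS)},
            {"role": "user", "content": task},
        ]
        for _ in range(max_steps):
            reply = call_model(messages)
            if reply["type"] == "tool_call":
                # ...and this half is plain deterministic computing: run the tool,
                # append its output to the context, and go around again.
                output = TOOLS[reply["name"]](reply.get("arguments", {}))
                messages.append({"role": "tool", "content": json.dumps({"output": output})})
            else:
                return reply["content"]
        return "step limit reached"

    if __name__ == "__main__":
        print(agent_loop("Summarize what this repo does."))

The prompting half decides what the model sees; the loop half is exactly the deterministic offloading described above.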
How hard is it for you to assemble a piece of IKEA furniture without an Allen wrench, a screwdriver, and clear instructions, vs. with those three?