akie • yesterday at 8:23 AM • 2 replies

The reason is that the post you link to is overly simplistic. Simon's experiment only works because there is a pre-existing language agnostic testing framework of 9000 tests that the agent can hold itself accountable to. Additionally, there is a pre-existing API design that it can reuse or reappropriate.

These two preconditions don't generally apply to software projects. Most of the time there are vague, underspecified, frequently changing requirements, no test suite, and no API design.

If all projects came with 9000 pre-existing tests and a fleshed-out API, then sure, the article you linked to could be correct. But that's not really the case.

Replies

jillesvangurp • yesterday at 9:14 AM

If you start with some working software, you could have an LLM generate a lot of tests for the existing functionality, ensure they pass against the existing software, and check that they give excellent test coverage. Generating tests and specifications from existing software is relatively easy. It's very tedious to do manually, but LLMs excel at that type of job.

Once you have that, you port the tests over to a new language and generate an implementation that passes all of them. You might want to review the tests, but it's a good approach. It will likely result in bug-for-bug compatible software.

Where it gets interesting is figuring out what to do with all the bugs you might find along the way.
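
To make the "generate tests from existing behavior" step concrete, here is a minimal sketch of a characterization (golden master) harness. Everything in it is an assumption for illustration: `legacy_slugify` and `new_slugify` are hypothetical stand-ins for the old code and the port, and the recorded cases go through JSON only to show that they could be stored in a language-neutral form.

```python
import json
from typing import Any, Callable, Iterable


def legacy_slugify(title: str) -> str:
    """Hypothetical stand-in for the existing, working implementation."""
    return "-".join(title.lower().split())


def new_slugify(title: str) -> str:
    """Hypothetical port; it handles surrounding whitespace differently."""
    return title.lower().replace(" ", "-")


def run_case(func: Callable[..., Any], args: list) -> dict:
    """Call func and capture either its result or the exception type name."""
    try:
        return {"result": func(*args)}
    except Exception as exc:  # record failures too, since they are behavior
        return {"error": type(exc).__name__}


def record_golden(func: Callable[..., Any], inputs: Iterable[tuple]) -> list:
    """Record what the existing code actually does, bugs included."""
    return [{"args": list(args), "expected": run_case(func, list(args))}
            for args in inputs]


def check_against_golden(func: Callable[..., Any], cases: list) -> list:
    """Return the recorded cases that the candidate does not reproduce."""
    failures = []
    for case in cases:
        got = run_case(func, case["args"])
        if got != case["expected"]:
            failures.append({"args": case["args"],
                             "expected": case["expected"],
                             "got": got})
    return failures


inputs = [("Hello World",), ("  padded  ",), ("already-slugged",)]

# Round-trip through JSON to mimic saving the cases to a file that a test
# runner in another language could read.
golden = json.loads(json.dumps(record_golden(legacy_slugify, inputs)))

failures = check_against_golden(new_slugify, golden)
for failure in failures:
    print(failure)
```

Each reported failure is a place where the port diverges from the recorded behavior. Whether that divergence is a regression in the port or a bug in the original that you would rather not carry over is exactly the judgment call mentioned above.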

baq • yesterday at 10:09 AM

> pre-existing language agnostic testing framework of 9000 tests

if there exists a language specific test harness, you can ask the LLMs to port it before porting the project itself.

if it doesn't, you can ask the LLM to build one first, for the original project, according to specs.

if there are no specs, you can ask the LLM to write the specs according to the available docs.

if there are no docs, you can ask the LLM to write them.

if all the above sounds ridiculous, I agree. it's also effective - go try it.

(if there is no source, you can attempt to decompile the binaries. this is hard, but LLMs can use Ghidra, too. this is probably unreasonable and ineffective today, though.)


alt Hacker News
