As others have pointed out, humans train on existing codebases as well, and then use that knowledge to build clean-room implementations.
No, they don't. One team meticulously documents and specs out what the original code does, and then a completely independent team, one that has never seen the original source code, implements it.
Otherwise it's not clean-room, it's plagiarism.
What they don't do is read the product they're clean-rooming. That's kinda disqualifying. Impossible to know if the GCC source is in 4.6's training set but it would be kinda weird if it wasn't.
Not the same.
I have read nowhere near as much code (or anything else) as Claude has had to read to get to where it is.
And I can write an optimizing compiler that isn't slower than GCC -O0.
If that's what clean room means to you, then I do know AI can definitely replace you, as even ChatGPT is better than that.
(prompt: what does a clean room implementation mean?)
From ChatGPT without login BTW!
> A clean room implementation is a way of building something (usually software) without copying or being influenced by the original implementation, so you avoid copyright or IP issues.
> The core idea is separation.
> Here’s how it usually works:
> The basic setup
> Two teams (or two roles):
> - Specification team (the “dirty room”)
>   - Looks at the original product, code, or behavior
>   - Documents what it does, not how it does it
>   - Produces specs, interfaces, test cases, and behavior descriptions
> - Implementation team (the “clean room”)
>   - Never sees the original code
>   - Only reads the specs
>   - Writes a brand-new implementation from scratch
> Because the clean team never touches the original code, their work is considered independently created, even if the behavior matches.
> Why people do this
> - Reverse-engineering legally
> - Avoid copyright infringement
> - Reimplement proprietary systems
> - Create open-source replacements
> - Build compatible software (file formats, APIs, protocols)
I really am starting to think we have achieved AGI: Average (G)Human Intelligence.
LMAO
That’s the opposite of clean-room. The whole point of clean-room design is that you have your software written by people who have not looked into the competing, existing implementation, to prevent any claim of plagiarism.
“Typically, a clean-room design is done by having someone examine the system to be reimplemented and having this person write a specification. This specification is then reviewed by a lawyer to ensure that no copyrighted material is included. The specification is then implemented by a team with no connection to the original examiners.”