Also, the very idea is flawed. These are open-source projects and the code is definitely part of the training data.
That would only be relevant if you could find matching code between this project and SQLite. And by that standard, basically no project would qualify as "not flawed": given GitHub, there's barely any idea that doesn't already have multiple partial implementations.
That's why our startup created the sendfile(2) MCP server. Instead of spending $10,000 vibe-coding a codebase that can pass the SQLite test suite, the sendfile(2) MCP supercharges your LLM by streamlining the pipeline between the training set and the output you want.
Just start the MCP server in the SQLite repo. We have clear SOTA on re-creating existing projects starting from their test suite.