Was this done by manually reviewing commit messages? I think it would be interesting/useful to have a tool that uses some basic heuristics about LLM-generated code to detect code blobs even when they are not explicitly called out in a commit message.
Apparently, though not very carefully. The "particularly large LLM generated code churn" in the ram library, for example, is the LLM being used to simply git-revert a change that was not originally done by an LLM.
When I was reading this I thought of writing a quick and dirty CLI tool that checks commit co-authors. It wouldn't be perfect, but it would pick off a good chunk of the low-hanging fruit.
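A rough sketch of what such a co-author check could look like, in Python. It only reads `Co-authored-by:` trailers from `git log` output, so it finds nothing when the author did not disclose anything. The marker list (`AI_MARKERS`) and the function names are made up for illustration, not taken from any existing tool, and substring matching will produce false positives (a human named "Cursor", for example).

```python
import re
import subprocess
import sys

# Hypothetical marker list: substrings assumed to show up in the
# co-author trailers some AI tools add. Adjust to taste.
AI_MARKERS = ("claude", "copilot", "chatgpt", "cursor")

# Matches "Co-authored-by: <value>" at the start of a line, any casing.
TRAILER_RE = re.compile(r"^co-authored-by:[ \t]*(.+)$", re.IGNORECASE | re.MULTILINE)


def ai_coauthors(message, markers=AI_MARKERS):
    """Return co-author trailer values that contain any marker substring."""
    hits = []
    for value in TRAILER_RE.findall(message):
        value = value.strip()
        if any(marker in value.lower() for marker in markers):
            hits.append(value)
    return hits


def scan_repo(path="."):
    """Yield (commit_hash, matching_coauthors) for commits in the repo at path."""
    # %H is the commit hash and %B the raw message; %x00 and %x01 emit
    # NUL and SOH bytes, used here as field and record separators.
    out = subprocess.run(
        ["git", "-C", path, "log", "--format=%H%x00%B%x01"],
        capture_output=True, text=True, check=True,
    ).stdout
    for record in out.split("\x01"):
        record = record.strip()
        if not record:
            continue
        commit, _, body = record.partition("\x00")
        hits = ai_coauthors(body)
        if hits:
            yield commit, hits


if __name__ == "__main__":
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    for commit, hits in scan_repo(repo):
        print(commit[:12], "; ".join(hits))
```

Saved under a name of your choosing and run with a repository path as its only argument, it prints one line per flagged commit. A revert or a trivial diff still gets flagged if the trailer is there, so the hits need a manual look.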
Just like with writing, any kind of AI detection is going to be inaccurate to the point of snake oil.
LLM detection in writing is basically today's polygraph-test pseudoscience. There was a blog post a while ago where someone fed classic literature into one of these detectors and it was flagged as probably AI.
The diff of the linked commit in git is completely trivial; clearly it just got tagged because of the sign-off in the commit message: https://github.com/git/git/commit/d7971544fe17378f44f4998301...
I would be surprised if there is no LLM-assisted code in there prior to this commit; this is just the first one where the author chose to disclose it.