What exactly did they train? Copilot is powered by Claude, Gemini, or ChatGPT these days.
Did they train autocomplete? I mean, the code is open source, so anyone can scrape it and train on it too. I'm kind of glad they did train on it, because otherwise we'd still be stuck with Apple-level AI models right now.
The whole reason we have so many models, including open-weight models, that are all competitive with each other is that the data is free and anyone can train on it. If the goal was to monetize the source code, I guess the authors shouldn't have made it open source.
Yeah, have to agree here. GitHub Copilot itself doesn't have any first-party models; it uses the frontier models. So they didn't "train" using public repos, but they probably allowed (or didn't prevent) the frontier labs from pulling the repos along with the rest of the internet when creating their models.
> "GitHub Copilot is powered by generative AI models developed by GitHub, OpenAI, and Microsoft. It has been trained on natural language text and source code from publicly available sources, including code in public repositories on GitHub."
https://azure.microsoft.com/en-us/products/github/copilot#fa...