> However, given just the weights, we don't have the source
This is incorrect, given the definitions in the license.
> (Apache 2.0) "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
(emphasis mine)
In LLMs, the weights are the preferred form of making modifications. Weights are not compiled from something else. You start with the weights (randomly initialised) and at every step of training you adjust the weights. That is not akin to compilation, for many reasons (both theoretical and practical).
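To make that concrete, here is a toy sketch (a two-parameter linear model, obviously not an LLM; `train`, `creator_data` and `user_data` are made-up names for illustration). It assumes plain SGD, and is only meant to show that the original training run and a later modification by someone else are the same operation applied to the same artifact:

```python
import random


def train(weights, data, steps=200, lr=0.05):
    """Plain SGD on a 1-D linear model y = w*x + b; returns the updated weights."""
    w, b = weights
    for _ in range(steps):
        for x, y in data:
            err = (w * x + b) - y
            # The weights are adjusted directly at every step;
            # they are not generated from some other artifact.
            w -= lr * err * x
            b -= lr * err
    return (w, b)


random.seed(0)

# Original creator: start from randomly initialised weights, train on their data.
init = (random.uniform(-1, 1), random.uniform(-1, 1))
creator_data = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
released = train(init, creator_data)

# Downstream user: start from the released weights and run the same
# procedure on their own (hypothetical) data.
user_data = [(x, 3.0 * x - 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
finetuned = train(released, user_data)
```

The only difference between the two calls is the starting point and the data, which is the part a license doesn't hand you.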
In general, licenses do not give you rights over the "know-how" or "processes" by which the licensed parts were created. What you get is the ability to inspect, modify, and redistribute the work as you see fit. Most importantly, you modify the work the same way the creators do (hence "preferred form"), just not with the same data. For example, you can modify the source of Chrome all you want, just not with the know-how and knowledge of a Google engineer; the license cannot offer that.
This is also covered in the EU AI Act, btw.
> General-purpose AI models released under free and open-source licences should be considered to ensure high levels of transparency and openness if their parameters, including the weights, the information on the model architecture, and the information on model usage are made publicly available. The licence should be considered to be free and open-source also when it allows users to run, copy, distribute, study, change and improve software and data, including models under the condition that the original provider of the model is credited, the identical or comparable terms of distribution are respected.
> In LLMs, the weights are the preferred form of making modifications.
No, they aren't. We happen to be able to do things to modify the weights, sure, but why would any lab ever train something from scratch if editing weights were preferred?