It is absolutely fine to distill the IP of everyone else, but you'd be violating the TOS to distill ours :)
Is there a technical term for this phenomenon? Ladder pulling?
https://blog.google/innovation-and-ai/technology/safety-secu...
Would be nice if people published the prompts, thoughts, and responses of the LLMs together with the code, in order to fight against these restrictions... instead of just publishing the final result and talking vaguely about how they prompted the LLM in a Hacker News comment or Twitter thread.
If LLMs are the new compilers, those are the actual source code.
Fine for me. Not for thee
It's utterly bonkers. Hopefully the model weights get leaked. Then we can claim it's public domain or, at the very least, distill it and then release it for free.
Bad for society
It takes billions in infrastructure investment and a highly paid, top-notch team for R&D and operations, not just a bunch of torrents of pirated books. Besides, the best model developers are not necessarily the ones pirating the most.
It's funny that Google, Meta, TikTok, OnlyFans, PornHub, and many other lucrative businesses never open-source their core business software, and people don't hold them to that moral standard, simply because we don't need to pay for the service (it's paid for by ads, actually). To me, that is the hypocrisy.
Yep. Demand open-source-approved licenses for LLM weights.
The Chinese Apache 2.0 models might be censored, but at least they can’t sue you in the US for finding the censorship line.
OTOH, the US models are definitely censored, per TFA, and they’re making vague legal threats against anyone that encounters the censored edge of the model.