> I've been looking for a copy-left "source available" license that allows me to distribute code openly but has a clause that says "if you would like to use these sources to train an LLM, please contact me and we'll work something out". I haven't yet found that.
Personally, I want a viral (GPL-style) license that explicitly prohibits use of code for LLM training/tuning purposes — with the asterisk that while current law might view LLM training as fair use, this may not be the case forever, and blatant disregard of the terms of the license should make it easier for me to sue offenders in the future.
Alternatively, this could be expressed as: the output of any LLM trained on this code must retain this license.