I think you are being ridiculous. Tampering with an LLM's pretraining is a difficult undertaking. There is plenty of evidence that training a model to toe the party line leaves it less capable than it would otherwise be.
It's not very subtle manipulation either; ask Qwen whether Taiwan is a part of China in German and in English, and only the English answer will be party-approved.