Anthropic wanted to put those restrictions in the contract. OpenAI said they'd just trust their own "guardrails" in the training, so they don't need them in the contract. (I'm not sure I believe "guardrails" can prevent mass surveillance of civilians?)
Very gracious of OpenAI to say Anthropic should not be designated a supply chain risk, after sniping their $200 million contract by being willing to contractually let the government do whatever it likes, without restrictions.
>(I'm not sure I believe "guardrails" can prevent mass surveillance of civilians?)
Right, wouldn't they need a moderation layer that could, for example, fire if it analyzed & labeled too many banal English conversations?
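To make the idea concrete, here's a minimal sketch of what such a tripwire might look like: count how often a client's requests get labeled as ordinary conversation analysis and fire once a rate threshold is crossed in a sliding window. The class name, thresholds, and labeling interface are all invented for illustration; no real moderation API is implied.

```python
import time
from collections import deque

class SurveillanceTripwire:
    """Hypothetical moderation-layer check: fires when one client has
    too many 'banal conversation' analyses inside a sliding window."""

    def __init__(self, max_banal=100, window_seconds=3600):
        self.max_banal = max_banal
        self.window = window_seconds
        self.events = deque()  # timestamps of banal-conversation analyses

    def record(self, is_banal, now=None):
        """Record one analysis request; return True if the tripwire fires."""
        now = time.monotonic() if now is None else now
        if is_banal:
            self.events.append(now)
        # drop events that have aged out of the sliding window
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) > self.max_banal
```

Of course, this only illustrates the mechanism; a determined operator could evade any fixed threshold by sharding requests across accounts, which is part of why a guardrail like this seems like a weak substitute for a contractual restriction.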
They really gave training credit for guardrails? I mean, it could perhaps reject prompts about designing social credit systems sometimes, but I can't imagine realistic mitigations to mass domestic surveillance generally.
It cannot really oversee this. If you can decompose a problem into individual steps that are not, in themselves, against the agent's alignment, the aggregate can still accomplish what no single step would be allowed to.