Hacker News

jrochkind1 · today at 5:46 AM · 2 replies

Anthropic wanted to put those restrictions in the contract. OpenAI said they'll just trust their own "guardrails" in the training; they don't need them in the contract. (I'm not sure I believe "guardrails" can prevent mass surveillance of civilians?)

Very gracious of OpenAI to say Anthropic should not be designated a supply chain risk after sniping their $200 million contract by being willing to contractually let the government do whatever they like without restrictions.


Replies

lostnground · today at 8:41 AM

Guardrails can't really oversee this. If you can decompose a problem into individual steps that are not, in themselves, against the agent's alignment, it's certainly possible for the aggregate to accomplish what the alignment was meant to prevent.

Barbing · today at 8:29 AM

>(I'm not sure I believe "guardrails" can prevent mass surveillance of civilians?)

Right, wouldn't they need a moderation layer that could, for example, fire if it analyzed & labeled too many banal English conversations?
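A minimal sketch of the kind of tripwire described above, assuming a classifier upstream has already labeled each request; every name, label, and threshold here is invented for illustration:

```python
# Hypothetical moderation tripwire: count requests classified as analyzing
# ordinary civilian conversations, and refuse once a threshold is crossed.
# The label string and threshold are assumptions, not any real API.
from dataclasses import dataclass


@dataclass
class SurveillanceTripwire:
    threshold: int = 100  # max flagged analyses before the tripwire fires
    flagged: int = 0      # running count of flagged requests

    def allow(self, request_label: str) -> bool:
        """Return True while the request may proceed, False once fired."""
        if request_label == "banal_civilian_conversation":
            self.flagged += 1
        return self.flagged <= self.threshold


tripwire = SurveillanceTripwire(threshold=2)
print(tripwire.allow("banal_civilian_conversation"))  # True
print(tripwire.allow("banal_civilian_conversation"))  # True
print(tripwire.allow("banal_civilian_conversation"))  # False
```

Even granting such a layer, it only works if the upstream classifier reliably recognizes surveillance-shaped traffic, which is the hard part.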

They really gave credit for training-time guardrails? I mean, the model could perhaps reject prompts about designing social credit systems sometimes, but I can't imagine realistic mitigations to mass domestic surveillance in general.