How true is this? How does a regulated industry confirm the model itself wasn't trained with malicious intent?
Why would it matter if the model is trained with malicious intent? It's a pure function. The harness controls security policies.
Why would it matter if the model is trained with malicious intent? It's a pure function. The harness controls security policies.