> and an unwillingness to write a test case demonstrating the security angle of it.
If the model can't be transparent and tries to hide things from me, then it's a completely useless and untrustworthy tool.
Refusing to write tests is not even remotely a valid solution.
The valid solution is for these labs to understand that: the model is MY agent, not theirs. It should respect my prompts and not refuse.
Hardware supply needs to catch and prices drop so we can all move to local, open weight models. Clearly the hosted options cannot be trusted.