Hi there,
To be fair, even humans don't achieve 100% accuracy. I don't think this is about a system simply asking an AI whether something is right or wrong. The "judge" isn't another AI flipping a coin; it's a code validator based on mathematical formulas or pre-established rules.
For example, if the agent makes a money transfer, the judge queries the database and verifies that the recorded amount is exact. This is where we merge AI intelligence with the security of traditional, "old school" code. Getting this close to 100% accuracy is already a huge deal. It's like having three people review an invoice instead of just one: it makes it much harder for an error to slip through.
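To make the transfer example concrete, here is a minimal sketch of what such a judge could look like. The table name `transfers`, the column `amount_cents`, and the function `judge_transfer` are all made up for illustration, and I'm using an in-memory SQLite database only so the snippet runs on its own. The point is that the check is plain deterministic code, not a second AI opinion.

```python
import sqlite3


def judge_transfer(conn, transfer_id, expected_cents):
    """Return True only if the stored amount equals the intended amount.

    Amounts are integer cents (an assumption of this sketch) so the
    comparison is exact and free of floating-point rounding.
    """
    row = conn.execute(
        "SELECT amount_cents FROM transfers WHERE id = ?", (transfer_id,)
    ).fetchone()
    if row is None:
        # No such transfer was recorded: the judge rejects it.
        return False
    return row[0] == expected_cents


# Hypothetical setup: pretend the agent just recorded a 500.00 transfer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transfers (id INTEGER PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO transfers (id, amount_cents) VALUES (1, 50000)")

print(judge_transfer(conn, 1, 50000))  # amount matches
print(judge_transfer(conn, 1, 50001))  # off by one cent
```

The first call passes and the second fails, even though the two amounts differ by a single cent. That exactness is what the rule-based judge adds on top of the agent.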
Regarding the cost, sure, the AI might cost a bit more because of all these extra validations. But if spending one dollar in tokens saves a company from losing five hundred dollars to an accounting error, the system has already paid for itself. It's an investment, not a cost. Plus, this tighter level of control helps prevent not just errors, but also internal fraud and external irregularities. It's a layer of oversight that pays off.
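To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The function names and the idea of an error probability are my own assumptions. Only the one dollar and five hundred dollar figures come from the example above.

```python
def net_benefit(validation_cost, error_probability, loss_per_error):
    """Expected savings from catching the error, minus what the check costs."""
    return error_probability * loss_per_error - validation_cost


def break_even_probability(validation_cost, loss_per_error):
    """Smallest error probability at which the check pays for itself."""
    return validation_cost / loss_per_error


# If the error would certainly have happened, one dollar of tokens
# protects the full five hundred dollars.
print(net_benefit(1.0, 1.0, 500.0))

# Share of transactions that would need to contain such an error
# for the one-dollar check to break even.
print(break_even_probability(1.0, 500.0))
```

Under these assumed numbers, the validation breaks even once roughly one transaction in five hundred would otherwise contain an error of that size.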
Best regards