Isn't the issue simply a matter of not using the right tool? When the stakes are high and you need to check details, the right tools are grounded AI solutions like nouswise and notebooklm, not the general-purpose chatbots that almost everyone knows might hallucinate. I also believe this use case is low-hanging fruit for automating a lot of manual work, but it comes with new requirements, such as transparency to help verify the responses.
> Isn't the issue simply a matter of not using the right tool?
What suggests this judge was not using the very best chatbot?