I wouldn’t be so sure about the LLM not helping. The LLM doesn’t need to know about the edge cases itself. Instead, you’d be relying on other client implementations already handling those edge cases, and on the LLM finding that handling in their codebases. Those other implementations have probably been through similar test cycles, so using an LLM to compare them against yours isn’t a bad option.