In the scenario I'm hypothesizing, why would anyone need to "check" or "test" its work? Which chess players are checking to make sure Stockfish made the "right" move? What makes a move "right" is that Stockfish made it.
There are clear win conditions in chess. There are not for most software engineering tasks. If you don't get this, it's probably a safe bet that you're not an engineer.
Your post sent me down a rabbit hole reading about the history of computers playing chess. Notable to me is that, as far back as the 1950s, AI advocates were claiming that a computer would be able to beat the best human chess players within 10 years. It was so long ago they had to clarify they were talking about digital computers.
Today I learned that AI advocates being overly optimistic about AI's trajectory is actually not a new phenomenon - it's been happening for more than twice my lifetime.