This is incredibly funny to me, because that's (almost) exactly how some of our other tests work :D
(Almost: we have a full-featured command shell and just use that for testing)
Shell bindings: https://github.com/FRRouting/frr/blob/master/tests/ospf6d/te...
Input: https://github.com/FRRouting/frr/blob/master/tests/ospf6d/te...
Expected output: https://github.com/FRRouting/frr/blob/master/tests/ospf6d/te...
Absolutely agree this is a great option to have for some kinds of tests.
One reason to have a razor-thin custom command shell (perhaps obeying similar conventions across data structures) is that parsing its input can be so fast and consistent that you can also use it for benchmarking/perf regression testing with a workload generator. You might also find & record "anomalous" workloads that way, or write a little "translator" (some might say "compiler") from a log to such a workload, etc.
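To make the idea concrete, here is a minimal sketch of such a shell wrapped around an integer-keyed table. The command names (`put`, `get`, `del`, `len`), the output format, and the helper names `run_shell` and `replay` are all made up for illustration; they are not taken from FRR or any other project.

```python
import io

def run_shell(lines, out):
    """Replay one command per line against an integer-keyed dict."""
    table = {}
    for line in lines:
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        op, args = parts[0], [int(a) for a in parts[1:]]
        if op == "put":
            table[args[0]] = args[1]
        elif op == "get":
            # "-" marks a missing key (hypothetical convention)
            print(table.get(args[0], "-"), file=out)
        elif op == "del":
            # prints 1 if the key existed, 0 otherwise
            print(int(table.pop(args[0], None) is not None), file=out)
        elif op == "len":
            print(len(table), file=out)
        else:
            raise ValueError(f"unknown command: {op}")

def replay(script):
    """Run a whole script and return everything the shell printed."""
    out = io.StringIO()
    run_shell(script.splitlines(), out)
    return out.getvalue()

# An input script and its expected output, as they might sit in two files.
SCRIPT = """\
put 1 10
put 2 20
get 1
del 2
get 2
len
"""
EXPECTED = "10\n1\n-\n1\n"

if __name__ == "__main__":
    assert replay(SCRIPT) == EXPECTED
    print("ok")
```

The same `run_shell` loop can be fed by a workload generator or by a script translated from a log, which is what makes it reusable for perf runs as well as correctness tests.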
I have done this for data structures even as fast as integer-keyed hash tables (though in that hyper-fast case you might need to try to measure & subtract off parser-loop/IO dispatch overhead and/or use statistical testing to check whether two timings are "actually even different at all", perhaps along the lines of https://github.com/c-blake/bu/blob/main/doc/tim.md).
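A rough sketch of the "measure & subtract" part, assuming a hypothetical `nop` command that goes through the same parse and dispatch path but never touches the table. Taking the minimum over a few repeats is just a crude noise filter chosen for this sketch; it is not the method of the linked tool and it is no substitute for real statistical testing.

```python
import time

def run(lines, table):
    """Parse and dispatch each line; 'nop' pays parsing cost only."""
    for line in lines:
        parts = line.split()
        op, key = parts[0], int(parts[1])
        if op == "put":
            table[key] = key
        elif op == "nop":
            pass  # parse + dispatch, no table work

def best_time(lines, repeats=5):
    """Smallest wall-clock time over several runs on a fresh table."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        run(lines, {})
        best = min(best, time.perf_counter() - t0)
    return best

def net_cost_per_op(n=100_000):
    """Estimated seconds per 'put' after subtracting the 'nop' baseline."""
    work = [f"put {i}" for i in range(n)]
    base = [f"nop {i}" for i in range(n)]  # same line count and key widths
    return (best_time(work) - best_time(base)) / n

if __name__ == "__main__":
    print(f"net cost per put: {net_cost_per_op() * 1e9:.1f} ns")
```

For something as fast as a hash-table insert, the difference can be small enough that it disappears into noise (or even comes out negative), which is exactly where the statistical check matters.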