this is just advocating for a harness, which has been the focus (along with evals) for at least the last three months by pretty much anyone working with agents professionally or seriously