I would love to see someone run a longitudinal study of the incident/error rate of a canary container in prod that is managed by Claude. Basically, a control group (human-managed) versus an experimental group (Claude-managed), to show who does better: the humans or the AI.