Same way you’d do it without AI. Record sample data, test against it, generate more data, test IRL, record more data, and loop until it’s good enough.
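If it helps to see the loop as code, here's a rough sketch of the record-and-replay part. Everything in it is made up for illustration: the function names (`append_case`, `load_cases`, `replay`), the JSON-lines file layout, and the assumption that your system's output is a single number you can compare with a tolerance. Swap in whatever comparison fits your actual outputs.

```python
import json


def append_case(path, case_input, expected):
    """Record one observed case (input plus the result you saw) as a JSON line."""
    with open(path, "a") as f:
        f.write(json.dumps({"input": case_input, "expected": expected}) + "\n")


def load_cases(path):
    """Load every recorded case from the JSON-lines file."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]


def replay(cases, fn, tolerance=1e-6):
    """Run fn over the recorded inputs and return the cases that disagree.

    Assumes numeric outputs (an illustrative simplification).
    """
    failures = []
    for case in cases:
        actual = fn(case["input"])
        if abs(actual - case["expected"]) > tolerance:
            failures.append((case, actual))
    return failures


def pass_rate(cases, fn, tolerance=1e-6):
    """Fraction of recorded cases that fn reproduces; 1.0 if there are none."""
    if not cases:
        return 1.0
    return 1 - len(replay(cases, fn, tolerance)) / len(cases)
```

The loop is then: replay the recorded cases, fix whatever fails, try it for real, call `append_case` for anything surprising you see out there, and go again until `pass_rate` is at a level you can live with.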