"Did the vehicle just crash?" has a short feedback loop, which makes it very amenable to RL. "Did this product strategy tank our earnings/reputation/compliance/etc.?" can have a much longer feedback loop that is harder to do RL on.
But maybe not that much longer; METR's task-length improvements are still straight lines on log graphs.
The AI has read all the business books, blogs and stories.
Unless your CEO is Steve Jobs, it's hard to imagine the AI being much worse than your average pointy-haired boss.