In my experience you have to fight to get agents to write tests. It can be done, but they don't do it on their own. I've yet to figure out how to get any agent to use TDD, that is, write a test and then verify it fails. Once in a while I can get it to write one test that way, but it then writes far more code to make it pass than the test justifies, so important edge cases still go uncovered.
I have a TDD flow working as part of how I structure and then complete tasks. There are separate tasks for writing the tests and for implementing. The agent that implements is told to pick up only the first available task, which will be the “write tests” task, and it reliably does so. I just needed to add instructions on how it should mark the tests as skipped, because they were otherwise conflicting with the quality gates.
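Roughly, the shape of that setup looks like the sketch below. It is only an illustration under assumptions: the `Task` fields, the `next_task` helper, and the `slugify` feature with its `myapp.text` module are hypothetical names, and `unittest.skip` stands in for whatever skip marker your test runner uses.

```python
import unittest
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    kind: str  # "write_tests" or "implement" (hypothetical labels)
    done: bool = False


def next_task(tasks: list[Task]) -> Optional[Task]:
    """Return the first task that is not done, in list order."""
    for task in tasks:
        if not task.done:
            return task
    return None


class SlugifyTests(unittest.TestCase):
    # The implementation task has not run yet, so the test is marked as
    # skipped instead of being left failing. A skipped test does not
    # count as a failure, so a "no failing tests" gate stays green.
    @unittest.skip("TDD: waiting on the 'implement slugify' task")
    def test_spaces_become_hyphens(self):
        from myapp.text import slugify  # hypothetical module, not yet written
        self.assertEqual(slugify("Hello World"), "hello-world")


def run_suite() -> unittest.TestResult:
    """Run the skipped test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The implementing task then removes the skip marker as its first step, so the test goes red before any code is written to make it pass.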
Instead of fighting agents to write tests, what if the testing agent is the product itself? That's the idea behind Autonoma (https://github.com/autonoma-ai/autonoma), AI agents that do E2E testing by exploring your app like a real user.