You need to have the AI write an increasingly detailed design and plan for what to code, assess the plan and revise it incrementally, then have it write the code as planned and assess the code. You're essentially guiding the "thinking" the AI would have to perform anyway. Yes, it takes more time and effort (though you could stop at a high-level plan and still do better than not planning at all), but it's way better than one-shotted vibe code.
The problem is those plans become huge. Now I have to review a huge plan and the comparatively short code change.
This works, but it still lacks most of the context from previous tasks, and getting the AI to take that into account isn’t trivial.
TDD is a great way to start the plan: stub out the things it needs to achieve as tests, with E2E tests being the most important. You still need to read through the tests so the AI can't cheat, but the codebase will be much better off with them than without them.