Unfortunately, in our experience, tests don’t scale as well as code. First of all, static tests are very brittle: you rely on selectors, you need wait times, and you can’t really test much dynamic content (think AI chats/interactions). Then there’s all the infrastructure around them: solving captchas, handling auth, handling email OTP (each of our agents has access to its own inbox), and handling video recording and screenshots. So with the traditional testing approach, you end up mocking a lot of services. I highly recommend giving it a try!