If anyone is wondering what the test results look like, here is an example from my site: https://pub-1fbd8591bf7a40cea36fa130fb2ba6bc.r2.dev/playwrig...
I run these in a CI/CD pipeline, comparing each run against the previous commit, and upload the results to R2. A few problems:
- Playwright regularly fails with timeouts. The failures are flaky, and good luck figuring out what went wrong.
- You can do a matrix test (chrome/firefox/etc.) (mobile/tablet/etc.), but the problem is that you'll need to run these tests in parallel. The bare functional minimum is a 16 GB VPS with 4 vCPUs. For my test suite, it already takes 20 minutes. If you want a larger matrix and have more pages, you'll be looking at 64 GB with a dozen or so vCPUs. That's hundreds of dollars a month...
- If you have an animation, it's a struggle to filter it out.
- As far as I know, there is no "version slider" where you can go commit by commit and see how things changed.
- Playwright takes images and videos. These consume a lot of storage: gigabytes of data for just a few commits.
- All of the managed solutions (like BrowserStack) cost hundreds of dollars.
Overall, I think it's great, though it's a bit cumbersome to set everything up so that it works flawlessly and doesn't break every now and then. You can also do full flows (signup -> signin -> do action -> etc. -> success/failure), which can test more than the UI.
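For anyone trying to picture the matrix and the artifact-size problem, here is a rough sketch rather than my actual setup. It expands browsers x viewports into Playwright-style project entries and keeps screenshots, videos and traces only for failures. The helper `buildMatrix`, the viewport sizes and the worker count are made up for illustration. The option names (`workers`, `retries`, `screenshot`, `video`, `trace`, `animations`) are Playwright config options, and the resulting object is shaped like what you would pass to `defineConfig()` in `playwright.config.ts`.

```typescript
// Plain-object sketch of a Playwright-style config; no Playwright import needed.
type BrowserName = "chromium" | "firefox" | "webkit";
type Viewport = { label: string; width: number; height: number };
type MatrixProject = {
  name: string;
  use: { browserName: BrowserName; viewport: { width: number; height: number } };
};

// Illustrative values only: pick the browsers and sizes that matter to you.
const matrixBrowsers: BrowserName[] = ["chromium", "firefox", "webkit"];
const matrixViewports: Viewport[] = [
  { label: "mobile", width: 390, height: 844 },
  { label: "tablet", width: 820, height: 1180 },
  { label: "desktop", width: 1280, height: 720 },
];

// Hypothetical helper: one project per (browser, viewport) pair.
function buildMatrix(browsers: BrowserName[], viewports: Viewport[]): MatrixProject[] {
  const projects: MatrixProject[] = [];
  for (const browserName of browsers) {
    for (const vp of viewports) {
      projects.push({
        name: `${browserName}-${vp.label}`,
        use: { browserName, viewport: { width: vp.width, height: vp.height } },
      });
    }
  }
  return projects;
}

const matrixConfig = {
  workers: 2, // cap parallelism to what the VPS can handle (assumed value)
  retries: 1, // one retry to absorb the occasional timeout
  use: {
    screenshot: "only-on-failure", // keep artifacts only when something broke
    video: "retain-on-failure",
    trace: "on-first-retry",
  },
  // Asks Playwright to stop CSS animations and transitions before comparing
  // screenshots; other moving content may still need masking.
  expect: { toHaveScreenshot: { animations: "disabled" } },
  projects: buildMatrix(matrixBrowsers, matrixViewports),
};
```

Three browsers times three viewports is already nine projects, which is why the hardware bill grows so fast once you add pages.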
Thanks for the example of a Playwright report page. I agree that getting browser tests (not even just visual tests) to work reliably is considerable work. I built out a suite at work for a rather complex web application and it certainly looks easier than it is. A couple of notes:
- I disagree that you need a powerful VPS to run these tests. We run our suite once a day at midnight instead of on every commit. You still get most of the benefit for much cheaper this way.
- We used BrowserStack initially but stopped due to flakiness. The key to getting a stable suite was to run tests against a local nginx image serving the web app and wiremock serving the API. This way you have short, predictable latency and can really isolate what you're trying to test.
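To make the WireMock part a bit more concrete, here is a minimal sketch of building a stub mapping with a short fixed delay, which is where the predictable latency comes from. This is not our actual code: the helper `buildStub`, the `/api/user` endpoint, the payload and the 50 ms default are all made up. The field names (`request.method`, `request.urlPath`, `response.jsonBody`, `response.fixedDelayMilliseconds`) follow WireMock's JSON stub format.

```typescript
// Shape of a WireMock stub mapping, limited to the fields used here.
type StubMapping = {
  request: { method: string; urlPath: string };
  response: {
    status: number;
    headers: { [name: string]: string };
    jsonBody: unknown;
    fixedDelayMilliseconds: number;
  };
};

// Hypothetical helper: a 200 JSON response with a fixed, short delay.
function buildStub(
  method: string,
  urlPath: string,
  jsonBody: unknown,
  delayMs: number = 50
): StubMapping {
  return {
    request: { method, urlPath },
    response: {
      status: 200,
      headers: { "Content-Type": "application/json" },
      jsonBody,
      fixedDelayMilliseconds: delayMs,
    },
  };
}

// Made-up endpoint and payload, for illustration only.
const userStub = buildStub("GET", "/api/user", { id: 1, name: "Test User" });

// This string is what would be POSTed to WireMock's /__admin/mappings
// endpoint, or saved as a file in its mappings directory.
const userStubJson = JSON.stringify(userStub, null, 2);
```

With every API response stubbed like this, a timeout in the suite points at the front end or the test itself rather than at a slow backend.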