To be more specific: say I can write an e2e test against an actual pre-prod environment, or I can invest significant development effort and ongoing maintenance in stub responses so that the test can run pre-submit against a partial system. How much is "shifting left" worth versus investing in speeding up the deployment pipeline, fast flag rollouts, and monitoring?
Nobody I've worked with has ever been able to quantify the ROI of elaborate fake test environments, but somebody made an OKR out of it, so there you go. Far be it from us to follow the actual research done on modern software delivery... http://dora.dev
Shift-left comes from the world before everyone was selling services. In that world shipping service packs was the way to fix problems discovered after a release, and releases were years apart.
From a QA perspective, I greatly regret that the world of infrequent releases is mostly gone. A few kinds of products still hold onto the old strategy, but it's a dying art.
I see the world of services with DevOps, push-on-green, etc. as a kind of fast food of software development: a way of doing things that lets you borrow from your future self by promising to improve quality eventually, while charging for that future improved quality today.
There are products where speeding up the rollout is a bad idea. Anything that requires high reliability is in that category, because highly reliable systems need to accumulate mileage before being released. In storage products, for example, it's typical to have systems run for a few months before they are "cleared for release." Of course, it's possible to continue development during this time, but it's a dangerous time for development: at any moment the system can be sent back to the developers, who would then have to incorporate their more recent changes into the patched system when they restart the development process. In other words, a lot of development effort can potentially be wasted between the time the system is sent out to QA and the actual release. To amortize this waste, it's better to release less frequently. It's also better to hand the system to QA already well tested, as this minimizes the back-and-forth between QA and development -- and that's the problem shift-left was intended to solve.
NB. Here's another, perhaps novel, thought for the "push on green" people. It was once considered a bad idea for QA to be aware of implementation details. Testing was seen as an experiment, with QA as the test subjects. This also meant that exposing QA to the internal details of the system, or the rationale that went into building it, would "spoil" the results. In such a world, letting QA see a half-baked system would be equivalent to exposing them to the details of the system's implementation, thus undermining their testing effort. QA were supposed to receive the technical documentation for the system, work from it, and try to use the system as documented.
No argument, but there can be limitations on how much you can speed up the deployment pipeline. In particular, the article is about integrated circuit development (actually about systems made of many ICs), where a “production deployment” takes months and many, many millions of dollars, and there’s not much you can do about it.
I heard a story decades ago about a software team that got a new member transferred in from the IC design department. The new engineer checked in essentially zero bugs. The manager asked what the secret was, and the new engineer said “wait, we’re allowed to have bugs?”
In fact, in my experience, these elaborate test environments and procedures cripple products.
I'm firmly of the opinion that if a test can't be run completely locally, then it shouldn't be run. These test environments can be super fragile. They often rely on a symphony of teams keeping everything in a good state all the time. But what happens more often than not is that one team somewhere deploys a broken version of their software to the test environment (because of course they do) in order to run their fleet of e2e tests. That invariably blows up the rest of the org depending on that broken software, and heaven help you if the person who deployed it did so at 5pm and then left on vacation.
This rippling failure mode happens because it's easier to write e2e tests which depend on a functional environment than it is to write and maintain mock services and mock data. Yet the mock services and data are precisely what you need to ensure someone doesn't screw up the test environment in the first place.
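To make the contrast concrete, here's a minimal sketch of what "mock services and mock data" can look like, assuming a hypothetical inventory-service dependency (InventoryClient, StubInventory, and can_fulfil are all made-up names): the code under test is written against an interface, and the test injects a local stub with canned data, so nothing depends on a shared environment being healthy.

    # Hypothetical sketch: business logic written against an interface,
    # with a local stub standing in for the real inventory service.

    class InventoryClient:
        """A production implementation would call the real service over the network."""
        def get_stock_level(self, sku: str) -> int:
            raise NotImplementedError

    class StubInventory(InventoryClient):
        """Test double: canned data, no network, runs entirely on the developer's machine."""
        def __init__(self, stock: dict[str, int]):
            self._stock = stock

        def get_stock_level(self, sku: str) -> int:
            return self._stock.get(sku, 0)

    def can_fulfil(sku: str, quantity: int, inventory: InventoryClient) -> bool:
        """The code under test depends on the interface, not on a live environment."""
        return inventory.get_stock_level(sku) >= quantity

    def test_can_fulfil():
        inventory = StubInventory({"widget-1": 5})
        assert can_fulfil("widget-1", 3, inventory)
        assert not can_fulfil("widget-1", 9, inventory)

The cost being argued about in this thread is exactly the cost of keeping stubs like StubInventory faithful to the real service's behavior, versus the cost of keeping a shared pre-prod environment healthy enough for e2e tests to run against it.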