Hacker News

eschatology, yesterday at 9:49 AM (2 replies)

Hmm

I don’t see the state file as a complete downside. It is very simple and very easy to understand. It makes it easy to tell or predict what terraform will do given the current state and desired state.

Its simplicity makes troubleshooting easier: the state files are easy to read and manipulate or repair in the event of a drift, mismatch, or botched provider update.
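
As a rough illustration of how readable the state is, here is a minimal sketch that lists resource addresses from a state file. It assumes the version 4 JSON layout of `terraform.tfstate` (a top-level `resources` list whose entries carry `mode`, `type`, `name`, and optionally `module`). The helper name `list_addresses` and the sample data are made up for this example.

```python
import json

# Hypothetical, trimmed-down sample in the shape of a version 4 state file.
SAMPLE_STATE = """
{
  "version": 4,
  "serial": 12,
  "resources": [
    {"mode": "managed", "type": "aws_s3_bucket", "name": "logs",
     "instances": [{"attributes": {"bucket": "example-logs"}}]},
    {"mode": "data", "type": "aws_caller_identity", "name": "current",
     "instances": [{"attributes": {}}]},
    {"module": "module.net", "mode": "managed", "type": "aws_vpc", "name": "main",
     "instances": [{"attributes": {"cidr_block": "10.0.0.0/16"}}]}
  ]
}
"""


def list_addresses(state: dict) -> list[str]:
    """Return Terraform-style addresses for every resource in the state."""
    addresses = []
    for res in state.get("resources", []):
        addr = f'{res["type"]}.{res["name"]}'
        # Data sources are addressed with a "data." prefix.
        if res.get("mode") == "data":
            addr = "data." + addr
        # Resources inside a module carry the module path in front.
        if "module" in res:
            addr = f'{res["module"]}.{addr}'
        addresses.append(addr)
    return addresses


state = json.loads(SAMPLE_STATE)
addresses = list_addresses(state)
print(addresses)
```

In day-to-day use, `terraform state list` and `terraform show -json` give the same view without opening the file by hand.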

With the solution proposed it feels like the state becomes a black box I shouldn’t put my hands in. I wonder how the troubleshooting scenarios change with it.

Personally, I haven’t run into the scaling issue described; at any given time there is usually only one entity working with the state file. We do use terragrunt for larger systems but it is manageable. ~1000 engineer org.


Replies

mdaniel, yesterday at 2:43 PM

> easy to tell or predict what terraform will do

"Predict" is the operative word there, because Terraform is so disconnected from the underlying provider's mental model that it is the expression "no plan survives first contact with the enemy" made manifest.

Now, I am one million percent open to the pushback that "well, that's a provider's problem," but I also can't easily tell whether providers are operating within the bounds of TF's mental model, or whether literally every provider ever is just that lazy.

lawnchair, yesterday at 11:07 AM

You are right that the simplicity of the state file is a strength and we do not want to lose that. One of our goals with Stategraph is to make state just as easy to inspect through both the command line and the UI.

Not every Terraform setup runs into scaling pain. The trouble tends to show up in larger repos with thousands of resources where teams share big chunks of infra. That is where global locks and full refreshes become a bottleneck and where we think graph semantics help.
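
To make the contrast with a global lock concrete, here is a toy sketch of the general idea of scoping conflicts to a subgraph. It is only an illustration under assumed names (`DEPENDS_ON`, `closure`, `can_run_concurrently` are all hypothetical) and not a description of how Stategraph is actually implemented.

```python
# Toy dependency graph: resource -> resources it depends on.
# All resource names here are hypothetical.
DEPENDS_ON = {
    "vpc": [],
    "subnet": ["vpc"],
    "db": ["subnet"],
    "bucket": [],
    "cdn": ["bucket"],
}


def closure(targets, graph):
    """Return the targets plus everything they transitively depend on."""
    seen = set()
    stack = list(targets)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def can_run_concurrently(change_a, change_b, graph):
    """Two changes conflict only if their dependency closures overlap."""
    return closure(change_a, graph).isdisjoint(closure(change_b, graph))


db_vs_cdn = can_run_concurrently(["db"], ["cdn"], DEPENDS_ON)
db_vs_subnet = can_run_concurrently(["db"], ["subnet"], DEPENDS_ON)
print(db_vs_cdn, db_vs_subnet)
```

With a single global lock, both pairs of changes would have to run one after the other. With subgraph-scoped locking, only the second pair (which shares `subnet` and `vpc`) would need to wait.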
