
The economics of software teams: Why most engineering orgs are flying blind

410 points, by kiyanwang, last Monday at 5:45 AM, 284 comments

Comments

jillesvangurp, last Monday at 7:57 AM

If you want to understand the economics, I recommend watching some of Don Reinertsen's videos on Lean 2.0. He goes deep on a few concepts that turn out to be quite intuitive.

Cost of delay: calculating the cost of delaying by a few weeks in terms of lost revenue (you aren't shipping whatever you are building), lost lifetime value of the product (your feature won't deliver value forever), and extra staffing cost. You can slap a number on it. It doesn't have to be a very accurate number. But it will make you mindful that you are delaying the moment revenue is made, and carrying team cost, at the expense of other stuff on your backlog.
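Slapping a number on it can be as simple as this back-of-envelope sketch (all figures are made-up illustrations, not from the videos or the comment):

```python
# Rough cost of delaying a launch: revenue you don't collect during
# the delay, plus team cost you keep paying while not shipping.
# Weekly figures are illustrative assumptions.
def cost_of_delay(weekly_revenue, weekly_team_cost, weeks_delayed):
    lost_revenue = weekly_revenue * weeks_delayed
    burn = weekly_team_cost * weeks_delayed
    return lost_revenue + burn

# Delaying a $50k/week feature by 4 weeks with a $30k/week team:
print(cost_of_delay(50_000, 30_000, 4))  # 320000
```

The point isn't precision; even a crude number like this lets you compare the delay against whatever else is on the backlog.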

Option value: treating a feature you add to your software as having a non-linear payoff. It costs you n when it doesn't work out and might deliver 10n in value if it does. Lean 1.0 would have you stay focused and toss out the option for that potential 10x payoff. But if you do a bit of math, there is probably a lot of low-hanging fruit worth picking because it has a low cost and a potentially high payoff. In the same way, variability is a good thing because it gives you the option to do something with it later. A little bit of overengineering can buy you a lot of option value, whereas tunnel vision, doing only what was asked, might opt you out of all that extra value.
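The "bit of math" here is just an expected-value check. A hedged sketch with made-up numbers:

```python
# Expected value of an option-like feature: it costs n if it fails
# and pays roughly 10*n if it succeeds. Probabilities and amounts
# are illustrative assumptions, not real data.
def expected_value(cost, payoff, p_success):
    return p_success * payoff - (1 - p_success) * cost

n = 10_000
# Even with only a 20% chance of success the bet has positive
# expected value, which is why cheap options are worth picking up:
print(expected_value(cost=n, payoff=10 * n, p_success=0.2))  # 12000.0
```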

A bad estimate is better than no estimate: even if you are off by 3x, at least you'll have a number you can learn from and adapt over time. Getting wildly varying estimates from different people means you have very different ideas about what is being estimated. Do your estimates in units of time, because that allows you to slap a dollar value on that time and do some cost calculations. How many product owners do you know who actually do that, or even know how to?
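Putting a dollar value on a time estimate is one multiplication; the loaded weekly rate below is an assumed figure for illustration:

```python
# Convert a time estimate to money using a fully loaded team rate
# (salary plus overhead). The rate is an illustrative assumption.
def estimate_cost(person_weeks, loaded_weekly_rate=4_000):
    return person_weeks * loaded_weekly_rate

# Even an estimate that's off by 3x bounds the spend usefully:
low, high = estimate_cost(4), estimate_cost(12)
print(low, high)  # 16000 48000
```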

Don't run teams at 100% capacity. Work piles up in queues and causes delays when teams are pushed hard. The more work you pile on the worse it gets. Worse, teams start cutting corners and take on technical debt in order to clear the queue faster. Any manufacturing plant manager knows not to plan for more than 90% capacity. It doesn't work. You just end up with a lot of unfinished work blocking other work. Most software managers will happily go to 110%. This causes more issues than it solves. Whenever you hear some manager talking about crunch time, they've messed up their planning.

Stretching a team like that just causes cycle times to increase. Also, see cost of delay: queues aren't actually free. If you have a lot of work in progress with interdependencies, any issue will derail your plans and cause costly delays. Viewed that way, it's actually very risky. If you've ever been on a team that seemingly doesn't get anything done anymore, this might be what is going on.
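The queueing effect described above shows up in even the simplest model, an M/M/1 queue, where expected time in system is 1/(mu - lambda) and blows up as utilization approaches 100%. A sketch, not Reinertsen's exact numbers:

```python
# Expected time in system for an M/M/1 queue: 1 / (mu - lambda),
# with utilization rho = lambda / mu. Delay grows non-linearly as
# rho approaches 1, which is why planning above ~90% capacity backfires.
def time_in_system(service_rate, utilization):
    arrival_rate = service_rate * utilization
    return 1.0 / (service_rate - arrival_rate)

mu = 1.0  # assume the team finishes one work item per day on average
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"{rho:.0%} utilization -> {time_in_system(mu, rho):.0f} days")
# 50% -> 2 days, 90% -> 10 days, 99% -> 100 days
```

Going from 50% to 90% load multiplies the wait by 5x, and the last few percent of "efficiency" cost an order of magnitude more delay.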

I like this back of the envelope math; it's hard to argue with.

I used to be a salaried software engineer at a big multinational. None of us had any notion of cost. We were doing stuff we were paid to do. It probably cost millions, and most decision making did not have a $ value attached. I've since been at a few startups: one where we got funded and subsequently ran out of money without ever bringing in meaningful revenue, and another that I helped bootstrap, where I'm getting paid (a little) out of the revenue we make. There's a very direct connection between what I do and money coming in.

carlm42, last Monday at 11:52 AM

Measuring a platform team's productivity in pure "hours saved" misses a huge point: reliability. If your platform prevents even one outage every month, how much business value and capital are saved? That analysis is utterly absent from this article. It also seems to focus on "LLMs make code cheap," which is a half-truth: LLMs make (so far) easy or messy code cheap. I'd bet that there, too, the analysis of reliability/stability is missing from the author's perspective.

rockemsockem, last Monday at 3:43 PM

Wow that article made a hard right turn about halfway through.

"Most organizations improperly account for engineering teams and incorrectly consider both code and team growth to be assets when in fact they increase complexity..... but LLMs can fix all of this"

Wtf?

Measuring things that actually matter is a great way to improve clarity on a team; you can probably just stop reading this article at the halfway point.

EDIT:

Specifically, this paragraph is insane:

"The obvious objection is that code produced at that speed becomes unmanageable, a liability in itself. That is a reasonable concern, but it largely applies when agents produce code that humans then maintain. Agentic platforms are being iterated upon quickly, and for established patterns and non-business-critical code, which is the majority of what most engineering organizations actually maintain, detailed human familiarity with the codebase matters less than it once did. A messy codebase is still cheaper to send ten agents through than to staff a team around. And even if the agents need ten days to reason through an unfamiliar system, that is still faster and cheaper than most development teams operating today. The liability argument holds in a human-to-human or agent-to-human world. In an agent-to-agent world, it largely dissolves."

tgdn, last Monday at 7:24 AM

I get "This site can’t be reached"

EastLondonCoder, last Monday at 11:01 AM

I think the article makes some great points; however, this part is not even wrong:

"The obvious objection is that code produced at that speed becomes unmanageable, a liability in itself. That is a reasonable concern, but it largely applies when agents produce code that humans then maintain. Agentic platforms are being iterated upon quickly, and for established patterns and non-business-critical code, which is the majority of what most engineering organizations actually maintain, detailed human familiarity with the codebase matters less than it once did. A messy codebase is still cheaper to send ten agents through than to staff a team around. And even if the agents need ten days to reason through an unfamiliar system, that is still faster and cheaper than most development teams operating today. The liability argument holds in a human-to-human or agent-to-human world. In an agent-to-agent world, it largely dissolves."

LLMs are not conscious, which means that left to their own devices they will drift. I think the single most important issue when working with LLMs is that they produce text without a layer that is aware of what's actually being written. That state can be present in humans as well, for example in sleepwalking.

Everyone who's tried to complete vibe coding a somewhat larger project knows that you only get to a certain level of complexity before the model stops being able to reason about the code effectively. It starts to guess why something is not working and cannot get out of that state until guided by a human.

That is not a new state of affairs in the field; I believe all programmers have at some point in their career come across code written by developers who needed to get past a hard deadline, with the result being a codebase that cannot effectively be modified.

I think a certain subset of programming projects could possibly be vibe coded, in the sense that the code can be merged without human understanding. But they have to be very straightforward CRUD apps. In almost everything else you will get stopped by slop.

I suspect that the future of our profession will shift from writing code to reading code and applying continuous judgement on architecture while working together with LLMs. It's also worth keeping in mind that you cannot assign responsibility to an LLM, and most human organizations require that to work.

thornewolf, last Monday at 10:59 PM

more ai articles on my front page :(

danpalmer, last Monday at 7:41 AM

> even if the agents need ten days to reason through an unfamiliar system, that is still faster and cheaper than most development teams operating today

Citation needed. A human engineer can grok a lot in 10 days, and an agent can spend a lot of tokens in 10 days.

bsenftner, last Monday at 11:42 AM

Yet another essay completely missing the point, and an audience that misses it as well. All these organizations fly blind because nowhere in any technology or science education is there any emphasis on effective communication: conveying understanding, resolving disagreements through analysis and the best of both perspectives. None of these critical communication skills are taught to the very people who most need them. It's a wonder our civilization functions at all.

lynx97, last Monday at 7:12 AM

Using ‘blind’ to mean ‘ignorant’ is like using any disability label as a synonym for ‘bad’—it turns a real condition into an insult.
