zahlman • yesterday at 5:38 PM

> These things weren't clear to me the first 10 years, but after 30 years, I've made those conclusions. I hope other old-timers can chime in on this.

'Sup, it's me, the "new dev". Except I, too, have been at it for decades, and I get more and more attached to short functions year over year. (You are correct about composition and about isolating state mutations. But short functions are tools that help me to do those things. Of course, it helps a ton to have functions as first-class objects. Function pointers are criminally underused in C codebases from what I've seen. They can be used for much more than just reinventing C++ vtables.)

People put numbers on their advice because they don't trust the audience to have good taste, or to have a sense of the scale they have in mind. Of course that has the downside that metrics become targets. When I see a number in this kind of advice, I kinda take it in two passes: understand what kind of limit is proposed (Over or under? What is being limited?) and then go back and consider the numeric ballpark the author has in mind. Because, yes, 70 lines of Python is not the same as 70 lines of C.

But I can scarcely even fathom ten lines in a Python function that I write nowadays. And I'm rather skeptical that "LOC needed to represent a coherent idea" scales linearly with "LOC needed to make a whole program work".

> Now you have 12 functions, scattered over multiple packages, and the order of things is all confused, you have to debug through to see where it goes. They're used exactly once, and they're only used as part of a long process. You've just increased the cognitive load of dealing with your product by a factor of 12. It's downright malignant.

Well, no, that isn't what happens at all.

First off, the files where new functions get moved, if they get moved at all, are almost certainly going to be in the same "package" (whatever that means in the programming language in use). The idea that it might be hard to find the implementation code for something not in the current file is pretty close to being a problem unique to C and C++. And I'm pretty sure modern IDEs have no problem dealing with that anyway.

Second, it absolutely does not "increase the cognitive load by a factor of 12". In my extensive experience, the cognitive load is decreased significantly. Because now the functions have names; the steps in the process are labelled. Because now you can consider them in isolation — the code for the adjacent steps is far easier to ignore.

Why would you "have to debug through to see where it goes"? Again, the functions have names. If the process really is purely sequential, then the original function now reads like a series of function calls, each naming a step in the sequence. It's now directly telling you what the code does and how. And it's also directly telling you "where it goes": to the function that was called, and back.
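
To make that concrete, here is a minimal Python sketch of a purely sequential process split into named steps. Every name in it (`parse_records`, `drop_failing`, `format_report`, `build_report`) and the 50-point threshold are made up for illustration; it is not taken from any real codebase.

```python
# Hypothetical example: all names and the threshold are invented for illustration.

def parse_records(text):
    """Split raw text into (name, score) pairs, skipping blank lines."""
    records = []
    for line in text.splitlines():
        if line.strip():
            name, score = line.split(",")
            records.append((name.strip(), int(score)))
    return records


def drop_failing(records, threshold=50):
    """Keep only the records whose score meets the threshold."""
    return [(name, score) for name, score in records if score >= threshold]


def format_report(records):
    """Render one 'name: score' line per record, highest score first."""
    ranked = sorted(records, key=lambda record: record[1], reverse=True)
    return "\n".join(f"{name}: {score}" for name, score in ranked)


def build_report(text):
    # The "long process" now reads as a short list of named steps.
    records = parse_records(text)
    passing = drop_failing(records)
    return format_report(passing)
```

Reading `build_report` alone is enough to see what happens and in what order; each helper can be opened, or ignored, on its own.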

You also no longer have to read comments interspersed into a longer code flow, or infer logical groupings into steps. You can consider each step in isolation. The grouping is already done for you; that's the point. And if you aren't debugging a problem, that implies the code currently works. Therefore, you don't need to go over the details all at once. You are free to dig in at any point that tickles your curiosity, or not. You don't have to filter through anything you aren't interested in.

(Notice how in the three paragraphs above, I give one-sentence descriptions in the first paragraph of individual advantages, and then dedicate a separate paragraph to expanding on each? That is precisely the same idea of "using short functions", applied to natural language. A single, long paragraph would have been fewer total words, but harder to read and understand, and less coherent.)

All of that said, you don't really debug code primarily by single-stepping through long functions, do you? I find problems by binary search (approximately, guided by intuition) with breakpoints and/or logging. And when the steps are factored out into helper functions, it becomes easier to find natural breakpoints in the "main" function and suss out the culprit.
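
One rough way to picture those "natural breakpoints", again as a hypothetical Python sketch: because functions are first-class objects, the steps can sit in a list, and the driver can log the intermediate value after each one. The step names, the `STEPS` list, and the `"pipeline"` logger name are all assumptions made for this example.

```python
import logging

# Hypothetical example: the steps and the logger name are invented for illustration.
logger = logging.getLogger("pipeline")


def strip_whitespace(items):
    """Trim leading and trailing whitespace from every string."""
    return [s.strip() for s in items]


def drop_empty(items):
    """Discard strings that are empty after trimming."""
    return [s for s in items if s]


def to_ints(items):
    """Convert each remaining string to an integer."""
    return [int(s) for s in items]


# Functions are ordinary objects, so the sequence of steps is just data.
STEPS = [strip_whitespace, drop_empty, to_ints]


def run(items, steps=STEPS):
    for step in steps:
        items = step(items)
        # One log line per step boundary: a natural place to look, or to
        # set a breakpoint, when narrowing down where a value goes wrong.
        logger.debug("after %s: %r", step.__name__, items)
    return items
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, each step boundary shows up in the log, so the first step whose output looks wrong can be found without single-stepping through the details of the others.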

Shorter functions absolutely do have the properties described in the quote. Almost definitionally so. Nobody really groks code on the level of dozens of individual statements. We know brains don't work like that (https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...).
