chrismorgan, yesterday at 6:36 AM

I feel like indentation is a really useful structural signal that has been hijacked, in C-family languages, by unnecessarily strict conventions and most recently by autoformatters, to correspond exclusively to language structure, when it could be used for semantic structure as well (or occasionally instead).

Much of the value of this block pattern is that it makes the scope of the intermediate variables clear, so that you have no doubt that you don’t need to keep them in mind outside that scope.
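
For anyone who hasn't met the block pattern, here is a minimal sketch of it in Rust. The function `count_settings` and the comment-stripping rule are made up for illustration, and it uses only the standard library rather than the crates in the article's example:

```rust
/// Count the non-empty lines that remain once `//` comments are stripped.
/// (Hypothetical helper, purely to illustrate the block pattern.)
fn count_settings(raw: &str) -> usize {
    // The block is an expression: only `settings` escapes it.
    let settings: Vec<String> = {
        let without_comments = raw
            .lines()
            .map(|line| line.split("//").next().unwrap_or(""));
        let trimmed = without_comments.map(str::trim);
        trimmed
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect()
    };
    // `without_comments` and `trimmed` are out of scope from here on.
    settings.len()
}

fn main() {
    let raw = "a = 1 // first\n// only a comment\n\nb = 2\n";
    println!("{}", count_settings(raw)); // prints 2
}
```

The intermediates cannot be referred to after the closing brace, so the compiler enforces what the reader would otherwise have to trust.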

But it’s also about logical grouping of concepts. And that you can achieve with simple ad hoc indentation:

  fn foo(cfg_file: &str) -> anyhow::Result<()> {
      // Load the configuration from the file.
          // Cached regular expression for stripping comments.
          static STRIP_COMMENTS: LazyLock<Regex> = LazyLock::new(|| {
              RegexBuilder::new(r"//.*").multi_line(true).build().expect("regex build failed")
          });

          // Load the raw bytes of the file.
          let raw_data = fs::read(cfg_file)?;

          // Convert to a string to the regex can work on it.
          let data_string = String::from_utf8(&raw_data)?;

          // Strip out all comments.
          let stripped_data = STRIP_COMMENTS.replace(&config_string, "");

          // Parse as JSON.
          let config = serde_json::from_str(&stripped_data)?;

      // Do some work based on this data.
          send_http_request(&config.url1)?;
          send_http_request(&config.url2)?;
          send_http_request(&config.url3)?;

      Ok(())
  }

(Aside: that code is dreadful. None of the inner-level comments is useful; they should all be deleted (one of them is even misleading). .multi_line(true) does nothing here (it only changes the meanings of ^ and $; see also .dot_matches_new_line(true)). There is no binding config_string (it was named data_string). String::from_utf8 doesn’t take a reference. fs::read_to_string should have been used instead of fs::read + String::from_utf8. Regex::replace_all was presumably intended.)

It might seem odd if you’re not used to it, but I’ve been finding it useful for grouping, especially in languages that aren’t expression-oriented. Tooling may be able to make it foldable, too.

I’ve been making a lightweight markup language for the last few years, and its structure (meaning things like heading levels, lists, &c.) has over time become almost entirely indentation-based. I find it really nice. (AsciiDoc is violently flat. reStructuredText is mostly indented but not with headings. Markdown is mostly flat with painfully bad and footgunny rules around indentation.)

—⁂—

A related issue. You frequently end up with multiple levels of indentation where you really only want one. A simple case I wrote yesterday in Svelte and was bothered by:

  $effect(() => {
      if (loaded) {
          … lots of code …
      }
  });

In some ancient code styles it might have been written like this instead:

  $effect(() => { if (loaded) {
      … lots of code …
  } });

Not the prettiest due to the extra mandatory curlies, but it’s fine, and the structure is reasonable. In Rust it’s nicer:

  effect(|| if loaded {
      … lots of code …
  });

But rustfmt would insist on returning it to this disappointment:

  effect(|| {
      if loaded {
          // … lots of code …
      }
  });

Perhaps the biggest reason for normalising indentation and brace practice was bugs like the “goto fail” one. I think there’s a different path: make the curly braces mandatory (like Rust does), and have tooling check that matching braces are at the same level of indentation. Then the problem can’t occur. Once that’s taken care of, I really see no reason not to write things more compactly when you decide it is nicer, which I find it is quite frequently, compared with what things like rustfmt produce.
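
A rough sketch of what such a check could look like, in Rust. This is not an existing tool: `check_braces` and `Mismatch` are hypothetical names, and the scan is deliberately naive (it does not skip braces inside strings, character literals or comments, and it counts each leading space or tab as one column). The rule is only that the line holding a `}` must be indented the same as the line holding its matching `{`:

```rust
/// A brace pair whose closing line is indented differently from its opening line.
#[derive(Debug, PartialEq)]
struct Mismatch {
    open_line: usize,  // 1-based line of the `{`
    close_line: usize, // 1-based line of the matching `}`
}

/// Number of leading spaces or tabs on a line (each counts as one column).
fn indent_of(line: &str) -> usize {
    line.chars().take_while(|c| *c == ' ' || *c == '\t').count()
}

/// Report every `{`/`}` pair whose two lines have different indentation.
/// Naive on purpose: braces in strings and comments are not skipped,
/// and an unmatched `}` is silently ignored.
fn check_braces(source: &str) -> Vec<Mismatch> {
    let mut open: Vec<(usize, usize)> = Vec::new(); // (line number, indent)
    let mut mismatches = Vec::new();
    for (i, line) in source.lines().enumerate() {
        let indent = indent_of(line);
        for c in line.chars() {
            match c {
                '{' => open.push((i + 1, indent)),
                '}' => {
                    if let Some((open_line, open_indent)) = open.pop() {
                        if open_indent != indent {
                            mismatches.push(Mismatch { open_line, close_line: i + 1 });
                        }
                    }
                }
                _ => {}
            }
        }
    }
    mismatches
}

fn main() {
    // The compact style passes: `{` and `}` both sit on lines at indent 0.
    let compact = "effect(|| if loaded {\n    work();\n});\n";
    // A closing brace that drifts to the wrong level is reported.
    let misleading = "if ready {\n    first();\n    }\n    second();\n";
    println!("{:?}", check_braces(compact)); // prints []
    println!("{:?}", check_braces(misleading)); // prints [Mismatch { open_line: 1, close_line: 3 }]
}
```

Because the rule compares lines rather than counting nesting depth, two braces opened on one line and closed on one line (as in the `$effect(() => { if (loaded) {` form) also pass.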

I would like to see people experiment with indentation a bit more.

—⁂—

One related concept from Microsoft: regions. They are cleanest in C♯, where `#region …` / `#endregion` directives mark spans that IDEs can use for code folding or outlining or whatever.