There are some situations with tricky lifetime issues that are almost impossible to write without this pattern. Trying to break code out into functions would force you to name all the types (not even possible for closures) or use generics (which can lead to difficulties specifying all required trait bounds), and `drop()` on its own is of no use since it doesn't affect the lexical lifetimes.
More significantly, the new variables x and y in the block are dropped at the end of the block rather than at the end of the function. This can be significant if:
- Drop does something, like close a file or release a lock, or
- x and y aren't Send and/or Sync, and you have an await point in the function or are doing multi-threaded stuff
This is why you should almost always use std::sync::Mutex rather than tokio::sync::Mutex. The guard you get from locking std's Mutex (std::sync::MutexGuard) isn't Send, so the compiler will complain if you hold it across an await in a future that has to be Send (for example, one passed to tokio::spawn). Usually you don't want mutexes held across an await.
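A minimal sketch of the scoping point, using only std so it runs without an async runtime. The function name `bump` and the counter are made up for illustration; the idea is that the guard lives only inside the inner block, so the lock is already released by the time the code after the block runs (which is where an `.await` would sit in async code).

```rust
use std::sync::Mutex;

/// Hypothetical helper: adds `amount` to the shared counter and reports
/// whether the lock was free again right after the inner block ended.
fn bump(counter: &Mutex<u32>, amount: u32) -> (u32, bool) {
    let new_value = {
        // The guard is a local of this block only.
        let mut guard = counter.lock().unwrap();
        *guard += amount;
        *guard // copy the value out; the guard itself stays in the block
    }; // guard is dropped here, so the lock is released

    // In async code, an `.await` placed here would not hold the guard.
    let lock_is_free = counter.try_lock().is_ok();
    (new_value, lock_is_free)
}

fn main() {
    let counter = Mutex::new(40);
    let (value, free) = bump(&counter, 2);
    println!("value = {value}, lock free after block = {free}");
}
```

Calling `drop(guard)` explicitly would release the lock too, but the block makes it impossible to touch the guard afterwards by accident.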
Our codebase is full of this pattern and I love it. Every time I get to clean up temporaries and expose an immutable variable outside of the setup, it makes me way too happy.
A lot of the time it looks like this:
let config = {
    let config = get_config_bytes();
    let mut config = Config::from(config);
    config.do_something_mut();
    config.do_another_mut();
    config
};

Blocks being expressions is one of the features of the Rust language I really love (and yes I know it's not something Rust invented, but it's still not in many other popular languages).
That last example is probably my biggest use of it because I hate having variables being unnecessarily mutable.
You can also de-mut-ify a variable by simply shadowing it with an immutable version of itself:
let mut data = foo(); data.mutate(); let data = data;
May be preferable for short snippets where adding braces, the yielded expression, and indentation is more noise than it's worth.
For those who might not have seen it, you can use this to make a `while` act like a `do-while` loop by putting the entire body in the boolean clause (and then putting an empty block for the actual body):
// double the value of `x` until it's at least 10
while { x = x * 2; x < 10 } {}
This isn't something that often will end up being more readable compared to another way to express it (e.g. an unconditional `loop` with a manual `break`, or refactoring the body into a separate function to be called once before entering the loop), but it's a fun trick to show people sometimes.

I love that this is part of the syntax.
I typically use closures to do this in other languages, but the syntax is always so cumbersome. You get the "dog balls" that Douglas Crockford always called them:
```
const config = (() => {
  const raw_data = ...
  ...
  return compiled;
})();
const result = config.whatever;
// carry on
return result;
```
Really wish blocks were expressions in more languages.
Block expression https://doc.rust-lang.org/reference/expressions/block-expr.h...
Also in Kotlin, Scala, and Nim.
Not mentioned in the article but kinda neat: you can label such a block and break out of it, too! The break takes an argument that becomes the value of the block that is broken out of.
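For anyone who hasn't seen it, a small sketch of what that looks like. The function and the label name `'found` are invented for the example; labeled block breaks need Rust 1.65 or later.

```rust
/// Hypothetical example: returns the first even number in `values`, doubled,
/// or -1 if there is none.
fn first_even_doubled(values: &[i32]) -> i32 {
    let result = 'found: {
        for &v in values {
            if v % 2 == 0 {
                // Leave the labeled block early; `v * 2` becomes its value.
                break 'found v * 2;
            }
        }
        // Reached only if no `break 'found` ran.
        -1
    };
    result
}

fn main() {
    println!("{}", first_even_doubled(&[1, 3, 4, 6]));
    println!("{}", first_even_doubled(&[1, 3]));
}
```

It reads a bit like an early `return`, except that it only exits the block, so the rest of the function still runs.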
GCC adds similar syntax as an extension to C: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html
It's used all throughout the Linux kernel and useful for macros.
This seems like a great way to group semantically-related statements, reduce variable leakage, and reduce the potential to silently introduce additional dependencies on variables. Seems lighter weight (especially from a cognitive load perspective) than lambdas. Appropriate for when there is a single user of the block -- avoids polluting the namespace with additional functions. Can be easily turned into a separate function once there are multiple users.
From the article:
Here’s a little idiom that I haven’t really seen discussed
anywhere, that I think makes Rust code much cleaner and
more robust.
I don’t know if there’s an actual name for this idiom; I’m
calling it the “block pattern” for lack of a better word.
This idiom has been discussed and codified in various languages for many years. For example, Scala has supported the same thusly:
val foo: Int = {
  val one = 1
  val two = 2
  one + two
}
Java (the language) has also supported[0] similar semantics.

Good to see Rust supports this technique as well.
0 - https://docs.oracle.com/javase/tutorial/java/javaOO/initial....
This is also somewhat common in C++ with immediately-invoked lambdas
This is one of those natural consequences of "everything is an expression" languages that I really like! I like more explicit syntax like Zig's labelled blocks, but any of these are cool.
Try this out, you can actually (technically) assign a variable to `continue` like:
let x = continue;
Funnily enough, one of the few things that are definitely always a statement are `let` statements! Except, you also have `let` expressions, which are technically different, so I guess that's not really a difference at all.
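A place where `continue` as an expression shows up in everyday code is a `match` arm: it has the never type, so it can sit where a value is expected. A small sketch (the function `sum_valid` is made up for illustration):

```rust
/// Hypothetical example: sums every entry that parses as an integer and
/// skips the rest.
fn sum_valid(inputs: &[&str]) -> i64 {
    let mut total = 0;
    for s in inputs {
        // `continue` has type `!`, which coerces to i64, so both arms type-check.
        let n: i64 = match s.trim().parse() {
            Ok(n) => n,
            Err(_) => continue,
        };
        total += n;
    }
    total
}

fn main() {
    println!("{}", sum_valid(&["1", "x", " 2 "]));
}
```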
I often employ this pattern in Ruby using `.tap` or a `begin` block.
It barely adds any functionality but it's useful for readability because of the same reasons in the OP.
It helps because I've been bitten by code that did this:
setup_a = some_stuff
setup_b = some_more_stuff
i_think_this_is_setup = even_more_stuff
the_thing = run_setup(setup_a, setup_b, i_think_this_is_setup)
That's all fine until later on, probably in some obscure loop, `i_think_this_is_setup` is used without you noticing.

Instead doing something like this tells the reader that it will be used again:
i_think_this_is_setup = even_more_stuff
the_thing = begin
  setup_a = some_stuff
  setup_b = some_more_stuff
  run_setup(setup_a, setup_b, i_think_this_is_setup)
end
I now don't mentally have to keep track of what `setup_a` or `setup_b` are anymore and, since the writer made a conscious effort not to put it in the block, you will take an extra look for it in the outer scope.

We do this via run in TS:
export const run = <T>(f: () => T): T => {
  return f();
};

This is Rust's OCaml roots showing :)
> This is why I generally avoid C’s “bottom-up” strategy for organizing code.
I think the author misunderstood something....
When would you use the block pattern vs creating a new function?
Reminds me of Brian Will's OOP rant video from 2016. He advocates exactly for this pattern: https://www.youtube.com/watch?v=QM1iUe6IofM&t=2235s
The first example given is not at all convincing. It is clear as day that loading the config file should be a separate function of its own. Coupling sending HTTP requests with it makes no sense.
The second example "erasure of mutability" makes more sense. But this effectively makes it a Rust-specific pattern.
In Rust everything is an expression, yes.
I use this all the time. It's features like these that sell Rust for me honestly; even if you wrapped your whole program in `unsafe` it would still be a massively better language than C++ or C.
I feel like indentation is a really useful structural signal that has been hijacked, in C-family languages, by unnecessarily strict conventions and most recently by autoformatters, to correspond exclusively to language structure, when it could be used for semantic structure as well (or occasionally instead).
Much of the value of this block pattern is that it makes the scope of the intermediate variables clear, so that you have no doubt that you don’t need to keep them in mind outside that scope.
But it’s also about logical grouping of concepts. And that you can achieve with simple ad hoc indentation:
fn foo(cfg_file: &str) -> anyhow::Result<()> {
    // Load the configuration from the file.
        // Cached regular expression for stripping comments.
        static STRIP_COMMENTS: LazyLock<Regex> = LazyLock::new(|| {
            RegexBuilder::new(r"//.*").multi_line(true).build().expect("regex build failed")
        });
        // Load the raw bytes of the file.
        let raw_data = fs::read(cfg_file)?;
        // Convert to a string so the regex can work on it.
        let data_string = String::from_utf8(&raw_data)?;
        // Strip out all comments.
        let stripped_data = STRIP_COMMENTS.replace(&config_string, "");
        // Parse as JSON.
        let config = serde_json::from_str(&stripped_data)?;

    // Do some work based on this data.
    send_http_request(&config.url1)?;
    send_http_request(&config.url2)?;
    send_http_request(&config.url3)?;
    Ok(())
}
(Aside: that code is dreadful. None of the inner-level comments are useful; they should be deleted (one of them is even misleading). .multi_line(true) does nothing here (it only changes the meanings of ^ and $; see also .dot_matches_new_line(true)). There is no binding config_string (it was named data_string). String::from_utf8 doesn’t take a reference. fs::read_to_string should have been used instead of fs::read + String::from_utf8. Regex::replace_all was presumably intended.)

It might seem odd if you’re not used to it, but I’ve been finding it useful for grouping, especially in languages that aren’t expression-oriented. Tooling may be able to make it foldable, too.
I’ve been making a lightweight markup language for the last few years, and its structure (meaning things like heading levels, lists, &c.) has over time become almost entirely indentation-based. I find it really nice. (AsciiDoc is violently flat. reStructuredText is mostly indented but not with headings. Markdown is mostly flat with painfully bad and footgunny rules around indentation.)
—⁂—
A related issue. You frequently end up with multiple levels of indentation where you really only want one. A simple case I wrote yesterday in Svelte and was bothered by:
$effect(() => {
    if (loaded) {
        … lots of code …
    }
});
In some ancient code styles it might have been written like this instead:
$effect(() => { if (loaded) {
    … lots of code …
} });
Not the prettiest due to the extra mandatory curlies, but it’s fine, and the structure reasonable. In Rust it’s nicer:
effect(|| if loaded {
    … lots of code …
});
But rustfmt would insist on returning it to this disappointment:
effect(|| {
    if loaded {
        // … lots of code …
    }
});
Perhaps the biggest reason around normalising indentation and brace practice was bugs like the “goto fail” one. I think there’s a different path: make the curly braces mandatory (like Rust does), and have tooling check that matching braces are at the same level of indentation. Then the problem can’t occur. Once that’s taken care of, I really see no reason not to write things more compactly, when you decide it is nicer, which I find quite frequently compared with things like rustfmt.

I would like to see people experiment with indentation a bit more.
—⁂—
One related concept from Microsoft: regions. Cleanest in C♯, `#region …` / `#endregion` pragmas which can introduce code folding or outlining or whatever in IDEs.
Scala has this too, it's extremely useful
I think the technique is important to have in your vocabulary, but I think the examples given are a weak sell.
In the example given, I would have preferred to extract to a method: what if I want to load the config from somewhere else? And perhaps the specifics of stripping comments could themselves have been extracted to a more semantically apt post-processing method.
I see the argument that when extracted to a function, that you don’t need to go hunting for it. But if we look at the example with the block, I still see a bunch of detail about how to load the config, and then several lines using it. What’s more important in that context: the specifics of loading the config, or the specifics of how requests are formed using the loaded config?
The fact that you need to explain what’s happening with comments is a smell. Properly named variables and methods would obviate the need for the comments and would introduce semantic meaning thru names.
I think blocks are useful when you are referencing a lot of local variables and also have fairly localized meaning within the method. For example, you can write a block to capture a bunch of values for logging context; then you can call that block in every log line to get a logging context based on current method state. It totally beats extracting a logging context method that consumes many variables and is unlikely to be reused outside of the calling method, and yet you get delayed evaluation and single point of definition for it.
So yes to the pattern, but needs a better example.
This is a great addition to the best patterns and practices in Rust. Worth noting and using. In JavaScript there's the proposal of "do expressions" which accomplish the same.
Obligatory use: it’s a block I guess
Voluntary use: I know this one. It’s a pattern now.
I have one better: the try block pattern.
https://doc.rust-lang.org/beta/unstable-book/language-featur...
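Since `try` blocks are still unstable, a common stand-in on stable Rust is an immediately invoked closure, so that `?` exits the closure rather than the whole function. A rough sketch, with `add_or_zero` being a made-up example and not anything from the article:

```rust
use std::num::ParseIntError;

/// Hypothetical example: parses two integers and adds them; any parse
/// failure falls back to 0 instead of propagating out of this function.
fn add_or_zero(a: &str, b: &str) -> i32 {
    // Immediately invoked closure: `?` returns from the closure only.
    let sum = (|| -> Result<i32, ParseIntError> {
        let x: i32 = a.parse()?;
        let y: i32 = b.parse()?;
        Ok(x + y)
    })();
    sum.unwrap_or(0)
}

fn main() {
    println!("{}", add_or_zero("2", "3"));
    println!("{}", add_or_zero("2", "x"));
}
```

The explicit return type on the closure is there because the compiler otherwise may not be able to work out the error type that `?` converts into.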