Stop Forwarding Errors, Start Designing Them

86 points • by andylokandy • last Sunday at 7:02 PM • 51 comments

Comments

Unreadable due to lag when scrolling. How do you even manage that? Stutters happen on other pages but this was just a delay that was extremely annoying.

Rygian • last Sunday at 8:20 PM

Sorry for the small digression. It's on topic.

Just a few minutes ago, while copying 63 GB worth of pics and videos from my phone to my laptop, KDE forwarded me the error "File <hard to retain name.jpg> could not be opened. Retry, Ignore, Ignore all, Cancel".

This was around file 7000 out of 15000. The file transfer stopped until I made a choice.

As a user, what am I supposed to do with such a popup?

It seems like a very good example of "Error Handling Without Purpose" as the article describes, but at user level.

Except that here, the audience is "a plain user who just dragged a folder to make a copy" and none of the four options (or even the act of stopping the file transfer until an answer is chosen) is actually meaningful for the user.

The "Putting It Together" for this scenario should look like: a non-modal section populates with "file <hard to retain name.jpg> failed due to reason; at the end of the file transfer you'll get a list with all the files that failed, and you'll have an option to retry them, navigate to their source position to double-check, and/or ignore".
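A minimal Rust sketch of that non-blocking flow, assuming a hypothetical `copy_one` callback standing in for the real copy routine. The names `FailedFile`, `TransferReport`, and `copy_all` are made up for illustration and are not taken from KDE or the article. Failures are recorded and the loop keeps going; the report is only shown at the end.

```rust
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};

/// One file that could not be copied, with the reason kept for the final report.
#[derive(Debug)]
struct FailedFile {
    path: PathBuf,
    reason: io::Error,
}

/// Outcome of the whole transfer. Building it never pauses the copy loop.
#[derive(Debug, Default)]
struct TransferReport {
    copied: usize,
    failed: Vec<FailedFile>,
}

impl fmt::Display for TransferReport {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "{} copied, {} failed", self.copied, self.failed.len())?;
        for item in &self.failed {
            writeln!(f, "  {}: {}", item.path.display(), item.reason)?;
        }
        Ok(())
    }
}

/// Runs `copy_one` (hypothetical) over every file and records failures
/// instead of stopping to ask the user about each one.
fn copy_all<F>(files: &[PathBuf], mut copy_one: F) -> TransferReport
where
    F: FnMut(&Path) -> io::Result<()>,
{
    let mut report = TransferReport::default();
    for file in files {
        match copy_one(file) {
            Ok(()) => report.copied += 1,
            Err(reason) => report.failed.push(FailedFile {
                path: file.clone(),
                reason,
            }),
        }
    }
    report
}
```

The list in `report.failed` is what a "retry these" or "show me where they are" action would operate on once the transfer is done.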

bccdee • last Sunday at 10:46 PM

I'm not sure I like how they're trying to dynamically cast to an error type.

  Err(report) => {
      // For machines: find and handle the structured error
      if let Some(err) = find_error::<StorageError>(&report) {
          if err.status == ErrorStatus::Temporary {
              return queue_for_retry(report);
          }
          return Err(map_to_http_status(err.kind));
      }

They get it right elsewhere when they describe errors for machines as being "flat and actionable." `StorageError` is that, but the outer `Err(report)` is not. You shouldn't be guessing which types of error you might run into; you should be exhaustively enumerating them.

I'd rather have something like this:

  struct Exn<T> {
      trace: Trace,
      err: T,
  }
  
  impl<T> Exn<T> {
      #[track_caller]
      fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
          Exn {
              trace: self.trace.add_context(Location::caller(), msg),
              err: self.err.into(),
          }
      }
  }

That way your `err` field is always a structured error, but you still get a context trace. With a bit more tweaking, you can make the trace tree-shaped rather than linear, too, if you want.

I think actionable error types need to be exhaustively matchable, at least for any Rust error that you expect a machine to be handling. Details a human is interested in can be preserved at each layer by the trace, while details the machine cares about will be pruned and reinterpreted at every layer, so the machine-readable info is kept flat, relevant, and matchable.
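For readers who want to try the idea, here is one way the `Exn<T>` fragment above could be filled in so that it compiles. This is a sketch under assumptions: the `Trace` type, the `new` constructor, and the `StorageError`/`ApiError` enums with their `From` mapping are invented for illustration and are not the API of the `exn` crate.

```rust
use std::panic::Location;

/// Human-facing context: a linear list of (call site, message) frames.
#[derive(Debug, Default)]
struct Trace {
    frames: Vec<(&'static Location<'static>, String)>,
}

impl Trace {
    fn add_context(mut self, at: &'static Location<'static>, msg: String) -> Trace {
        self.frames.push((at, msg));
        self
    }
}

#[derive(Debug)]
struct Exn<T> {
    trace: Trace,
    err: T,
}

impl<T> Exn<T> {
    fn new(err: T) -> Exn<T> {
        Exn { trace: Trace::default(), err }
    }

    /// Adds a frame for the human and converts the machine-facing error
    /// into the next layer's type, so it stays flat and matchable.
    #[track_caller]
    fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
        Exn {
            trace: self.trace.add_context(Location::caller(), msg),
            err: self.err.into(),
        }
    }
}

// Hypothetical error types for two layers.
#[derive(Debug, PartialEq)]
enum StorageError {
    NotFound,
    Timeout,
}

#[derive(Debug, PartialEq)]
enum ApiError {
    Missing,
    RetryLater,
}

impl From<StorageError> for ApiError {
    fn from(e: StorageError) -> ApiError {
        match e {
            StorageError::NotFound => ApiError::Missing,
            StorageError::Timeout => ApiError::RetryLater,
        }
    }
}

/// Stand-in storage call that always times out.
fn read_blob(_key: &str) -> Result<Vec<u8>, Exn<StorageError>> {
    Err(Exn::new(StorageError::Timeout))
}

fn get_avatar(user: &str) -> Result<Vec<u8>, Exn<ApiError>> {
    read_blob(user).map_err(|e| e.wrap(format!("loading avatar for {user}")))
}
```

A caller can then `match` exhaustively on `err.err` (here `ApiError`) without any downcasting, while `err.trace` still carries the per-layer messages for whoever reads the logs.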

dvogel • last Sunday at 9:03 PM

> But as a standard library abstraction, it’s too opinionated. It categorically excludes cases where sources form a tree: a validation error with multiple field failures, a timeout with partial results. These scenarios exist, and the standard trait offers no way to represent them.

This seems akin to complaining that the CPU core has only one instruction pointer. There is nothing preventing a struct implementing `Error` from aggregating other errors (such as validation results) and still exposing them via the `Error` trait. The fact of the matter is that the call stack is linear, so the interior node in the tree the author wants still needs to provide the aggregate error reporting that reflects the call stack that was lost with the various returns. Nothing about that error type implementing `Error` prevents it from also implementing another error reporting trait that reflects the aggregate errors in all of the underlying richness with which they were collected.
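A small sketch of what that could look like, assuming made-up `FieldError` and `ValidationError` types. The aggregate implements `std::error::Error` as usual and exposes the full set of failures through a separate accessor (an inherent method here, though a dedicated trait would work the same way). Which failure to return from `source()` is an arbitrary choice in this sketch.

```rust
use std::error::Error;
use std::fmt;

/// A single failed field (hypothetical type).
#[derive(Debug)]
struct FieldError {
    field: &'static str,
    problem: &'static str,
}

impl fmt::Display for FieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.field, self.problem)
    }
}

impl Error for FieldError {}

/// An aggregate error holding several field failures.
#[derive(Debug)]
struct ValidationError {
    failures: Vec<FieldError>,
}

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} field(s) failed validation", self.failures.len())
    }
}

impl Error for ValidationError {
    /// The standard trait allows one source; this sketch returns the first failure.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.failures.first().map(|e| e as &(dyn Error + 'static))
    }
}

impl ValidationError {
    /// The richer view: every failure, for callers that know this type.
    fn failures(&self) -> &[FieldError] {
        &self.failures
    }
}
```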

oncallthrow • last Sunday at 9:05 PM

This is interestingly somewhere where Go really shines, in my experience. Go has no requirement to wrap (or, indeed, even handle at all) errors; yet, despite this, Go codebases I've worked in almost always perform error handling properly (wrapping at each layer of the call stack, so it's easy to identify where an error occurred).

spion • yesterday at 1:15 AM

Great article. Really advances the thinking on error handling. Rust already has a head start compared to most other languages with Result, expect and anyhow (well, color_eyre and tracing), but there was indeed a missing piece tying together error handling "actionability" with "better than stack trace" context for the programmer.

With regard to context for the programmer, I still think ultimately tracing and color_eyre (see https://docs.rs/color-eyre/latest/color_eyre/) form a good-enough pair for service style applications, with tracing providing the missing additional context. But it's nice to see a simpler approach to actionability.

Sytten • last Sunday at 9:02 PM

Exn looks very interesting, but to be actionable we need a compatibility layer with thiserror and anyhow, since most people are using them right now. Moving the goalposts a little, what we mostly need is a core Rust solution; otherwise your error handling stops at the first library you use that doesn't use exn.

vaylian • last Sunday at 7:30 PM

I've been thinking about Rust errors as well. We see all these nice tutorials that explain how you can match on an Err and then handle it. But I haven't seen this being done in practise. Most errors are reported directly to the user. There don't seem to be any attempts to automatically handle them.

The cause for an error can be upstream or downstream. If a function fails, because the network is down, then this is a downstream error. The user has not done anything wrong (unless they also are responsible for the network infrastructure). In that case a retry after a few moments might be the right approach. However, if the user provides bad function arguments, then the user needs to be informed, that it's them who need to make corrections. However, it is not always clear if that is the case. If a user requests a non-existing file, then there might be different reasons why the file does not exist (yet).
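One way to make that distinction machine-actionable, sketched in Rust. Everything here is hypothetical: the `Origin` and `FetchError` types and the `with_retry` helper are invented for illustration. Downstream failures are retried, upstream ones are returned to the caller immediately, and the ambiguous "file not found" case is deliberately left unclassified.

```rust
/// Where the fault lies, in the comment's terms.
#[derive(Debug, PartialEq)]
enum Origin {
    /// The caller supplied something wrong and must correct it.
    Upstream,
    /// A dependency (network, disk) failed; retrying may help.
    Downstream,
}

#[derive(Debug, PartialEq)]
enum FetchError {
    BadArgument(String),
    NetworkDown,
    NotFound,
}

impl FetchError {
    /// `None` means the origin cannot be determined from the error alone.
    fn origin(&self) -> Option<Origin> {
        match self {
            FetchError::BadArgument(_) => Some(Origin::Upstream),
            FetchError::NetworkDown => Some(Origin::Downstream),
            FetchError::NotFound => None,
        }
    }
}

/// Retries `op` only for downstream errors, up to `max_attempts` calls in total.
/// A real implementation would wait between attempts; that is omitted here.
fn with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, FetchError>,
) -> Result<T, FetchError> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if e.origin() == Some(Origin::Downstream) && attempt < max_attempts => {
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
```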

jiehong • last Sunday at 10:37 PM

I suppose Java exceptions have the same issues, albeit with automatic stack traces, obviously:

- the ? operator is replaced either by runtime exceptions, which every function propagates implicitly if you don’t catch them, or by simply declaring the thrown exception in the signature

- message can be overloaded for humans

- the exception type itself is the structured data, but in practice it seldom contains structured data and most logic depends on the exception type.

Make of this what you will, but I didn’t say it’s great.

croemer • last Sunday at 10:14 PM

Be warned: LLM writing. Lots of negative parallelisms.

Thaxll • last Sunday at 9:31 PM

Looks very similar to what Upspin (Go) errors look like:

https://github.com/upspin/upspin/blob/master/errors/errors.g...

    type Error struct {
        // Path is the Upspin path name of the item being accessed.
        Path upspin.PathName
        // User is the Upspin name of the user attempting the operation.
        User upspin.UserName
        // Op is the operation being performed, usually the name of the method
        // being invoked (Get, Put, etc.). It should not contain an at sign @.
        Op Op
        // Kind is the class of error, such as permission failure,
        // or "Other" if its class is unknown or irrelevant.
        Kind Kind
        // The underlying error that triggered this one, if any.
        Err error

        // Stack information; used only when the 'debug' build tag is set.
        stack
    }

larusso • last Sunday at 10:56 PM

Error handling in Rust is my number one frustration. I rewrote my errors multiple times. I used error_chain, which looked good on paper but was just as broken as thiserror and anyhow. The missing piece is the fact that no one really defines how to write good and meaningful error types for the different audiences. Even the article described some cases that are highly implementation specific. I will take a look at this other crate the author showed, though. The thiserror crate makes it too easy to just forward errors with the #[from] / #[source] attributes. I played around with a helper crate that tries to add a context method to each generated error type. But this is optional as well, and it also adds tons of overhead.

nchagnet • last Sunday at 10:18 PM

I really like the pattern presented in the article. I find myself guilty of designing errors which are useful to me, but maybe not to my user (which tbh in my area is always a bit of a nebulous entity). I really like the idea of separating those two intents, and to make explicit the possible action.

fozem • last Sunday at 8:19 PM

Good overview on Rust error handling.

I like errors that are unique and trivially greppable in a codebase. They should be stack efficient and word sized. Maybe a new calling convention where a register is reserved for error code and another register is a pointer to the source location string that is stored in a data segment.

The FP fanboy side of me likes the idea of algebraic effects and ADTs but not at the expense of stack efficiency.

atrooo • yesterday at 4:54 AM

As good as the argument is, and the crate may be, I feel like I’ve been lied to when I realize I’m reading an AI generated blog post as is obvious by the end of this one.

bheadmaster • last Sunday at 7:28 PM

Many Rust programmers despise Go's "if err != nil" pattern, but that pattern actually forces you to think about errors and "design" them to give meaningful messages, either by wrapping them (if the underlying error is expected to provide useful information), or by creating one from scratch.

It may be easier to just add the "?" operator everywhere (and we are lazy and will mostly do what is easier), but it often leads to the problem explained in the article.
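To make the contrast concrete, here is a small Rust sketch using only the standard library. The `ConfigError` type and both function names are hypothetical. The first function forwards the `io::Error` untouched with `?`; the second wraps it with the one detail the reader of the message actually needs, the path.

```rust
use std::fmt;
use std::fs;
use std::io;

/// A designed error: says what was being attempted and keeps the cause.
#[derive(Debug)]
struct ConfigError {
    path: String,
    source: io::Error,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "cannot read config file {}: {}", self.path, self.source)
    }
}

impl std::error::Error for ConfigError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.source)
    }
}

/// Forwarding: the caller gets a bare OS error with no mention of the file.
fn load_forwarded(path: &str) -> Result<String, io::Error> {
    let text = fs::read_to_string(path)?;
    Ok(text)
}

/// Designing: the same failure, wrapped with context at this layer.
fn load_designed(path: &str) -> Result<String, ConfigError> {
    fs::read_to_string(path).map_err(|source| ConfigError {
        path: path.to_string(),
        source,
    })
}
```

The second version costs one `map_err` per call site, which is roughly the effort Go's explicit `if err != nil` nudges people into spending.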
