Hacker News

seanwilson today at 4:06 PM (13 replies)

Maybe I'm missing something and I'm glad this idea resonates, but it feels like sometime after Java got popular and dynamic languages got a lot of mindshare, a large chunk of the collective programming community forgot why strong static type checking was invented and are now having to rediscover this.

In most strong statically typed languages, you wouldn't often pass strings and generic dictionaries around. You'd naturally gravitate towards parsing/transforming raw data into typed data structures that have guaranteed properties instead to avoid writing defensive code everywhere e.g. a Date object that would throw an exception in the constructor if the string given didn't validate as a date (Edit: Changed this from email because email validation is a can of worms as an example). So there, "parse, don't validate" is the norm and not a tip/idea that would need to gain traction.
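A minimal Python sketch of the `Date` idea above, for illustration only. `ParsedDate` and `days_until` are made-up names, and accepting only ISO 8601 strings is an assumption of the sketch, not something the comment specifies:

```python
from datetime import date


class ParsedDate:
    """Hypothetical wrapper: construction fails unless the string is a valid ISO date."""

    def __init__(self, raw: str) -> None:
        # date.fromisoformat raises ValueError for strings that are not real
        # calendar dates (for example "2023-02-30"), so no instance can hold one.
        self._value = date.fromisoformat(raw)

    @property
    def value(self) -> date:
        return self._value


def days_until(deadline: ParsedDate, today: ParsedDate) -> int:
    # No defensive re-checking here: both arguments were parsed at the boundary.
    return (deadline.value - today.value).days
```

Code that receives a `ParsedDate` can rely on it holding a real date; the only place a malformed string can fail is the constructor call at the edge of the system.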


Replies

pjerem today at 4:12 PM

> In most strong statically typed languages, you wouldn't often pass strings and generic dictionaries around.

In 99% of the projects I've worked on in my professional life, anything that comes from human input is manipulated as a string, and most of the time it stays that way through all of the application layers (with more or less checking along the way).

On your precise example, I can even say that I have never seen anything like an "Email object".

chriswarbo today at 5:21 PM

> You'd naturally gravitate towards parsing/transforming raw data into typed data structures that have guaranteed properties instead to avoid writing defensive code everywhere e.g. a Date object that would throw an exception in the constructor if the string given didn't validate as a date

It's tricky because `class` conflates a lot of semantically-distinct ideas.

Some people might be making `Date` objects to avoid writing defensive code everywhere (since classes are types), but...

Other people might be making `Date` objects so they can keep all their date-related code in one place (since classes are modules/namespaces, and in Java classes even correspond to files).

Other people might be making `Date` objects so they can override the implementation (since classes are jump tables).

Other people might be making `Date` objects so they can overload a method for different sorts of inputs (since classes are tags).

I think the pragmatics of where code lives, and how the execution branches, probably have a larger impact on such decisions than safety concerns. After all, the most popular way to "avoid writing defensive code everywhere" is to.... write unsafe, brittle code :-(

masklinn today at 5:10 PM

> it feels like sometime after Java got popular [...] a large chunk of the collective programming community forgot why strong static type checking was invented and are now having to rediscover this.

I think you have a very rose-tinted view of the past: while on the academic side static types were intended for proofs, on the industrial side they were for efficiency. C didn't get static types in order to prove your code was correct (and it's really not great at doing that); it got static types so you could account for memory and optimise it.

Java didn't help either: when every type has to be a separate file, the cost of individual types is humongous, even more so when every field then needs two methods.

> In most strong statically typed languages, you wouldn't often pass strings and generic dictionaries around.

In most strong statically typed languages you would not, but in most statically typed codebases you would. Just look at the Windows interfaces. In fact, while Simonyi's original "Apps Hungarian" had dim echoes of static types, that got completely washed out in "Systems Hungarian", which was used widely in C++, which is already a statically typed language.

munificent today at 6:36 PM

> You'd naturally gravitate towards parsing/transforming raw data into typed data structures that have guaranteed properties instead to avoid writing defensive code everywhere e.g.

There's nothing natural about this. It's not like we're born knowing good object-oriented design. It's a pattern that has to be learned, and the linked article is one of the well-known pieces that helped a lot of people understand this idea.

bcrosby95 today at 4:11 PM

In my experience that's pretty rare. Most people pass around string phone numbers instead of a phonenumber class.

Java makes it a pain though, so most code ends up primitive obsessed. Other languages make it easier, but unless the language and company have a strong culture around this, they still usually end up primitive obsessed.

css_apologist today at 4:43 PM

This is an idea that is not ON or OFF

You can get ever so gradually stricter with your types, which means that the operations you perform on a narrow type are even more solid

It is also 100% possible to do in dynamic languages, it's a cultural thing

noelwelsh today at 4:57 PM

2 out of 3 problematic bugs I've had in the last two years or so were in statically typed languages where previous developers didn't use the type system effectively.

One bug was in a system that had an Email type but didn't actually enforce the invariants of emails. The one that caused the problem was it didn't enforce case insensitive comparisons. Trivial to fix, but it was encased in layers of stuff that made tracking it down difficult.
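As a rough illustration of what enforcing that invariant could look like, here is a Python sketch. The `Email` class below is hypothetical, its syntax check is deliberately shallow, and treating the whole address as case-insensitive is an assumption carried over from the comment rather than a statement about the email standards:

```python
class Email:
    """Hypothetical email type whose comparisons are always case-insensitive."""

    def __init__(self, raw: str) -> None:
        local, sep, domain = raw.strip().partition("@")
        # Deliberately shallow check; real email validation is far messier.
        if not sep or not local or not domain:
            raise ValueError(f"not an email address: {raw!r}")
        # Normalise once, at construction, so equality and hashing
        # can never disagree about case later on.
        self._normalized = f"{local}@{domain}".casefold()

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Email):
            return NotImplemented
        return self._normalized == other._normalized

    def __hash__(self) -> int:
        return hash(self._normalized)

    def __str__(self) -> str:
        return self._normalized
```

Because the normalisation lives in the constructor, no caller has to remember to lowercase before comparing or using the value as a dictionary key.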

The other was a home grown ORM that used the same optional / maybe type to represent both "leave this column as the default" and "set this column to null". It should be obvious how this could go wrong. Easy to fix but it fucked up some production data.
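One way to keep those two meanings apart is to give each its own type instead of overloading a single optional. The Python sketch below is only an illustration: `UseDefault`, `SetTo`, and `insert_columns` are invented names, and the real ORM's API is not described in the comment:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple, Union


@dataclass(frozen=True)
class UseDefault:
    """Leave the column out of the statement so the database default applies."""


@dataclass(frozen=True)
class SetTo:
    """Write this value; None here means an explicit NULL."""

    value: Any


ColumnUpdate = Union[UseDefault, SetTo]


def insert_columns(updates: Dict[str, ColumnUpdate]) -> Tuple[List[str], List[Any]]:
    """Return the column names and parameters an INSERT should mention."""
    columns: List[str] = []
    params: List[Any] = []
    for name, update in updates.items():
        if isinstance(update, UseDefault):
            continue  # omit the column entirely; the default takes over
        columns.append(name)
        params.append(update.value)  # may be None, meaning "set to NULL"
    return columns, params
```

With this shape, `SetTo(None)` and `UseDefault()` cannot be confused, because they are different types rather than two readings of the same empty value.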

Both of these are failures to apply "parse, don't validate". The former didn't enforce the invariants of the type it had supposedly parsed the data into. The latter didn't differentiate between two different parse results.

Archelaos today at 4:41 PM

Strong static type checking is helpful when implementing the methodology described in this article, but it is beside its focus. You still need to use the most restrictive type. For example, uint, instead of int, when you want to exclude negative values; a non-empty list type, if your list should not be empty; etc.

When the type is more complex, specific constraints should be used. For a real-life example: I designed a type for the occupancy of a hotel booking application. The number of occupants of a room must be positive, and a child must be accompanied by at least one adult. My type Occupants has a constructor Occupants(int adults, int children) that verifies that condition on construction (and also some maximum values).
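A Python sketch of such a type might look like the following. The limits `MAX_ADULTS` and `MAX_CHILDREN` are invented for the example, since the actual maximum values are not given, and the `total` property is an extra added for illustration:

```python
from dataclasses import dataclass

# Hypothetical limits; the original maximum values are not stated.
MAX_ADULTS = 4
MAX_CHILDREN = 4


@dataclass(frozen=True)
class Occupants:
    adults: int
    children: int

    def __post_init__(self) -> None:
        # Runs on every construction, so an invalid Occupants cannot exist.
        if self.adults < 0 or self.children < 0:
            raise ValueError("counts cannot be negative")
        if self.adults + self.children < 1:
            raise ValueError("a room needs at least one occupant")
        if self.children > 0 and self.adults < 1:
            raise ValueError("children must be accompanied by at least one adult")
        if self.adults > MAX_ADULTS or self.children > MAX_CHILDREN:
            raise ValueError("too many occupants")

    @property
    def total(self) -> int:
        return self.adults + self.children
```

The rule spans two fields at once, which is why it sits in the type itself: neither `adults` nor `children` can be checked in isolation.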

jackpirate today at 4:51 PM

> Edit: Changed this from email because email validation is a can of worms as an example

Email honestly seems much more straightforward than dates... Sweden had a Feb 30 in 1712, and there are all sorts of date ranges that never existed in most countries (e.g. the American colonies skipped September 3-13 in 1752).

conartist6 today at 4:29 PM

I think you're quite right that the idea of "parse don't validate" is (or can be) quite closely tied to OO-style programming.

Essentially the article says that each data type should have a single location in code where it is constructed, which is a very class-based way of thinking. If your Java class only has a constructor and getters, then you're already home free.

Also for the method to be efficient you need to be able to know where an object was constructed. Fortunately class instances already track this information.

brooke2k today at 4:57 PM

this is very much a nitpick, but I wouldn't call throwing an exception in the constructor a good use of static typing. sure, it's using a separate type, but the guarantees are enforced at runtime

yakshaving_jgt today at 4:11 PM

It's a design choice more than anything. Haskell's type safety is opt-in — the programmer has to actually choose to properly leverage the type system and design their program this way.

wat10000 today at 4:24 PM

I'm not sure, maybe a little bit. My own journey started with BASIC and then C-like languages in the 80s, dabbling in other languages along the way, doing some Python, and then transitioning to more statically typed modern languages in the past 10 years or so.

C-like languages have this a little bit, in that you'll probably make a struct/class from whatever you're looking at and pass it around rather than a dictionary. But dates are probably just stored as untyped numbers with an implicit meaning, and optionals are a foreign concept (although implicit in pointers).

Now, I know that this stuff has been around for decades, but it wasn't something I'd actually use until relatively recently. I suspect that's true of a lot of other people too. It's not that we forgot why strong static type checking was invented, it's that we never really knew, or just didn't have a language we could work in that had it.