> Then there was the problem of fragility: any syntax problems with your XHTML and your users would get a blank screen
I don't call that fragile, I call that well-founded. It has always perturbed me that, when encountering an error, HTML parsers will guess what they think you meant instead of throwing it back. I don't want my parser to do guesswork with potentially undefined behavior. I don't want my mistakes to be obscured so they can later come back to bite me - I want to be called out on issues loud and clear before my users see them.
Perhaps it made sense in the context of manually authored markup written with minimal effort, so I can see why the choice was made. These days it's yet another reason why the web is a precarious pile of sticks. HTML freely lets you put a broken, oddly-shaped stick right in the middle and topple the whole stack.
The people turning the web from a handcrafted document sharing system into the world's premier application platform should have made XHTML win.
My favorite is how this interacts with the oh-so-fun mistake many people make of writing `<div/>` and thinking they are doing it right. In HTML the trailing slash is simply ignored, so the div never self-closes and everything after it gets parsed as its children.
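To see it in action, here's a quick Python sketch. It assumes the third-party html5lib package, which implements the HTML5 parsing algorithm:

```python
# Sketch: in XML/XHTML, <div/> is a complete empty element. In HTML the
# stray "/" is a parse error that gets silently ignored, so the div stays
# open and swallows whatever comes after it.
import xml.etree.ElementTree as ET
import html5lib  # third-party: pip install html5lib

tree = html5lib.parse("<div/><p>hello</p>", treebuilder="etree",
                      namespaceHTMLElements=False)
print(ET.tostring(tree.find("body"), encoding="unicode"))
# <body><div><p>hello</p></div></body>  (the <p> ended up inside the div)
```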
It's hilarious that browsers will use loosey-goosey parsing on the HTML of a web page but strictly interpret the JPEG its img tag refers to.
Why? Why do car rentals typically not require cash up front, while hotel stays universally do? The economics are similar. Sometimes it's simple path dependency: something is a certain way because it's always been that way, and it would be too expensive to change it now.
At least browsers all use the same loosey-goosey HTML parsing now. It was hell on earth when each browser had its own error recovery strategy. In a sense, there is no longer any such thing as an invalid HTML document: every code point sequence has a canonical interpretation now.
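Misnested formatting tags are the classic case. They used to be a browser-by-browser lottery; HTML5's "adoption agency algorithm" now pins down one answer. A sketch, again leaning on html5lib as a stand-in for any spec-compliant parser:

```python
# Sketch: HTML5 defines exactly one recovery for misnested tags, so every
# conforming parser builds the same tree from this broken input.
import xml.etree.ElementTree as ET
import html5lib  # third-party: pip install html5lib

broken = "<b>one<i>two</b>three</i>"
body = html5lib.parse(broken, treebuilder="etree",
                      namespaceHTMLElements=False).find("body")
print(ET.tostring(body, encoding="unicode"))
# <body><b>one<i>two</i></b><i>three</i></body>  (same in every browser)
```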
What about evolving standards in a system that must handle clients and servers implementing anything from tomorrow's features to ones from ten years ago? Shouldn't failures be as graceful as possible?
The problem is that you are collapsing two users with very different needs into a single one.
1. If you are authoring an XHTML file, yes, you want the renderer to be as picky as possible and yell loud and clear if you make a mistake. This helps you produce a valid, well-formed document.
2. If you are an end user reading an XHTML file, it's not your file, it's not your fault if there are bugs, and there's jack shit you can do to fix it. You just want the browser to do its best to show you something reasonable so you can read the page and get on with your life.
XHTML optimizes for 1 at the expense of 2. HTML5 optimizes for 2 at the expense of 1.
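You can watch the two philosophies side by side with nothing but the Python standard library; the broken snippet here is just an illustration:

```python
# Sketch: the same broken markup is a hard error for an XML (XHTML) parser
# but is quietly accepted by a lenient HTML parser.
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

broken = "<p>unclosed <b>bold</p>"

# User 1, the author: XHTML-style parsing fails loudly, at the exact spot.
try:
    ET.fromstring(broken)
except ET.ParseError as err:
    print("XML parser:", err)  # "mismatched tag", with a line and column

# User 2, the reader: HTML-style parsing shrugs and keeps what it can.
class Quiet(HTMLParser):
    def handle_data(self, data):
        print("HTML parser kept:", repr(data))

Quiet().feed(broken)  # no exception; the text still comes through
```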
It’s Postel’s law at the end of the day: “be conservative in what you do, be liberal in what you accept from others”. As a site owner I want my site to fail loudly and quickly before a user sees a problem; as a user I never want to see a problem.
ePub is in a nice place: the number of documents to check for errors is reasonable, and the resulting artefact is designed to be shipped and never (or rarely) amended. That means we can shift the balance towards strict parsing. But for a web site of thousands (or millions) of documents that are amended regularly, the balance shifts back to loose parsing as the best way of meeting user needs.
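In that spirit, here's a hypothetical build-time gate (the file layout is invented for the example) that lets a site owner be strict on their own side while browsers stay lenient for readers:

```python
# Sketch: validate every document with a strict parser before it ships,
# so mistakes surface in the build, not in front of users.
import sys
import pathlib
import xml.etree.ElementTree as ET

failed = False
for page in pathlib.Path("site").rglob("*.xhtml"):  # hypothetical layout
    try:
        ET.parse(page)  # any well-formedness error raises ParseError
    except ET.ParseError as err:
        print(f"{page}: {err}")
        failed = True

sys.exit(1 if failed else 0)  # fail loudly, before a user ever sees it
```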