Hacker News

tannhaeuser, yesterday at 3:45 PM

Guess what, you're not required to open <html>, <head>, or <body> either. It all follows from SGML tag inference rules, and those rules aren't that difficult to understand. What makes them appear magical is WHATWG's verbose, ad-hoc presentation of the parsing algorithm, which explicitly lists, e.g., the elements that close their parents. Those lists were originally captured from SGML but went unmaintained as new elements were added. This had already started to happen in the very first revision after Ian Hickson's initial procedural description of HTML parsing ([1]).
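
To make that concrete, here's a minimal sketch of a complete document that leans on tag omission. The title and text are placeholders of mine; the point is only that the parser infers <html>, <head>, and <body>, and that each new <li> or <p> closes the previous one:

```html
<!DOCTYPE html>
<title>Tag omission</title>
<!-- The parser infers <html>, <head>, and <body> around this content. -->
<ul>
  <li>First item
  <li>Second item
</ul>
<p>A paragraph
<p>Another paragraph
```

Open it in a browser and inspect the DOM: all three inferred elements are there, exactly as if they had been typed out.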

I also wish people would stop calling every element-specific behavior of HTML parsers "liberal" and "tag-soup"-like. Yes, WHATWG HTML does define error recovery rules, and HTML did introduce historic blunders to accommodate inline CSS and inline JS, but almost always what's being complained about is just SGML empty elements (aka HTML void elements) or tag omission (as described above), by folks not doing their homework.

[1]: https://sgmljs.sgml.net/docs/html5.html#tag-omission (see also XML Prague 2017 proceedings pp. 101ff)


Replies

modeless, yesterday at 5:57 PM

HTML becomes pretty delightful for prototyping when you embrace this. You can open up an empty file and start typing tags with zero boilerplate. Drop in a script tag and forget about getElementById(); every id attribute already defines a JavaScript variable name directly, so go to town. Today the specs guarantee consistent behavior, so this doesn't introduce compatibility issues like it did in the bad old days of IE6. You can make surprisingly powerful stuff in a single-file application with no fluff.
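
A rough sketch of what I mean, with made-up ids (`increment` and `total` are just names picked for this example). It assumes neither id collides with an existing global, since a real variable or built-in window property of the same name would win:

```html
<!DOCTYPE html>
<title>Counter</title>
<button id=increment>+1</button>
<output id=total>0</output>
<script>
  // `increment` and `total` resolve to the elements with those ids
  // through named access on the window object; no getElementById() needed.
  increment.onclick = () => {
    total.value = Number(total.value) + 1;
  };
</script>
```

The script sits after the elements so they already exist when it runs.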

I just wish browsers weren't so anal about making you load things from http://localhost instead of file:// directly. Someone ought to look into fixing the security issues of file:// URLs so browsers can relax about that.

skobes, yesterday at 7:27 PM

Omitting <body> can lead to weird surprises. I once had some JavaScript mysteriously breaking because document.body was null during inline execution.

Since then I always write <body> explicitly even though it is optional.
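
One way this can happen (a sketch, not necessarily the exact page I hit): with <body> omitted, a script placed before any body content is still parsed as part of the head, so no body element exists yet when it runs.

```html
<!DOCTYPE html>
<title>Demo</title>
<script>
  // Still inside the implied head: the body element has not been created.
  console.log(document.body); // null
</script>
<p>First body content; the implied body starts here.
<script>
  // By now the parser has inserted the body element.
  console.log(document.body !== null); // true
</script>
```

Putting an explicit <body> tag before the first script makes the body element exist by the time that script executes.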