Serious question: why would you ever want to not close tags? It saves a couple of keystrokes, but we have snippets in our editors, so the amount of typing is the same. Closed tags let editors like Vim and automated tools handle the source code more easily; e.g. I can type `dit` in Vim to delete the contents of a tag, something that's only possible because the tag's content is clearly delimited. It also makes parsing HTML easier because there are fewer syntax rules.
I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have. HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML. Even though I don't have to, I strive to adhere to the stricter rules of closing all tags, closing self-closing tags and only using lower-case tag names.
Why did Markdown become popular when we already had HTML? Because Markdown is much easier to write by hand in a simple text editor.
Original SGML was actually closer to Markdown. It had various options to shorten and simplify the syntax, making it easy to write and edit by hand, while still having an unambiguous structure.
The verbose and explicit structure of XHTML makes it easier for tools to process, but more tedious for humans.
> I never understood why the more strict rules of XML for HTML never took off.
Because of the vast quantity of legacy HTML content, largely.
> HTML 5 was an opportunity to make a clear cut between legacy HTML and the future of HTML.
The WHATWG living standard (which the W3C took snapshots of, modified, and published as HTML 5, 5.1, etc., to pretend it was still relevant in HTML, before finally giving up on that entirely) was a direct result of the failure of XHTML and of the idea of a clear cut between legacy HTML and the future of HTML. It was a reaction against the “clear cut” approach, based on experience, not an opportunity to repeat its mistakes. (Instead of a clear break, HTML incorporated the “more strict rules of XML” via the XML serialization of HTML; for the applications where that approach offers value, it is available and supported, it has an object model 100% compatible with the more common form, and the two are maintained together rather than competing.)
Because I want my hand-written HTML to look more like markdown-style languages. If I close those tags it adds visual noise and makes the text harder to read.
Besides, at this point technologies like tree-sitter make editor integration a moot point: once tree-sitter knows how to parse it, the editor does too.
A lot of HTML tags never have a body, so it makes no sense to close them. XML has self-closing tag syntax but it wasn't always handled well by browsers.
A p or li tag, at least when used and nested properly, logically ends where either the next one begins or the enclosing block ends. Closing li also creates the opportunity for nonsensical content inside of a list but not in any list item. Of course all of these corner cases are now well specified because people did close their tags sometimes.
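To make the "ends where the next one begins" rule concrete, here is a rough Python sketch. It is a deliberately simplified model, not the real HTML parsing algorithm (which has many more cases), and `OutlineParser` and `outline` are made-up names for illustration. It only assumes that a new `<p>` or `<li>` closes an open one of the same name, and that closing a parent closes whatever is still open inside it.

```python
from html.parser import HTMLParser

# Simplified assumption: only <p> and <li> are implicitly closed by a sibling.
IMPLICIT_SIBLINGS = {"p", "li"}


class OutlineParser(HTMLParser):
    """Hypothetical helper: records an indented outline of the element tree."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements
        self.outline = []  # one indented tag name per element seen

    def handle_starttag(self, tag, attrs):
        # A new <p>/<li> implicitly ends the previous one at the same level.
        if tag in IMPLICIT_SIBLINGS and self.stack and self.stack[-1] == tag:
            self.stack.pop()
        self.outline.append("  " * len(self.stack) + tag)
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Closing an element also ends anything still open inside it.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def outline(markup):
    parser = OutlineParser()
    parser.feed(markup)
    parser.close()
    return parser.outline


unclosed = "<ul><li>one<li>two</ul><p>first<p>second"
closed = "<ul><li>one</li><li>two</li></ul><p>first</p><p>second</p>"
print(outline(unclosed))
print(outline(closed) == outline(unclosed))
```

Under those two assumptions, the version with and without closing tags yields the same outline, which is the point being made about p and li.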
I would argue the stricter rules did take off: most people always close <p>, and it's pretty common to see <img/> over <img>, especially from people who write a lot of React.
But.
The future of HTML will forever contain content that was first hand-typed in Notepad++ in 2001 or created in WordPress in 2008. It's the right move for the browser to stay forgiving, even if you have rules in your personal style guide.
> I learned HTML quite late, when HTML 5 was already all the rage, and I never understood why the more strict rules of XML for HTML never took off. They seem so much saner than whatever soup of special rules and exceptions we currently have.
XHTML came out at a time when Internet Explorer, the most popular browser, was essentially frozen apart from security fixes because Microsoft knew that if the web took off as a viable application platform it would threaten Windows' dominance. XHTML 1.0 Transitional was essentially HTML 4.01 except that, when it was served as XML and wasn't well-formed, XML's error handling required the browser to stop and display a "parsing error" page (yellow in Firefox) rather than the content. This meant that any "working" XHTML site might not display because the page author didn't test in your browser. It also meant that any XHTML site might break at any time because a content writer used a noncompliant browser like IE 6 to write an article, or because the developers missed an edge case that caused invalid syntax.
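For anyone who hasn't run into that all-or-nothing behaviour: any XML parser shows it, not just browsers. A minimal sketch using Python's standard `xml.etree.ElementTree` (the `is_well_formed` helper is just a name I made up for the example):

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False on any syntax error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # An XML parser gives up on the first error instead of recovering.
        return False


print(is_well_formed("<p>Hello<br/>world</p>"))  # True
print(is_well_formed("<p>Hello<br>world</p>"))   # False: <br> is never closed
print(is_well_formed("<p>Fish & chips</p>"))     # False: bare ampersand
```

One unescaped ampersand in a comment field or article body is enough to fail the whole document, which is the kind of edge case being described.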
XHTML 2.0 was a far more radical design. Because IE 6 was frozen, XHTML 2.0 was written with the expectation that no current web browser would implement it, and instead was a ground-up redesign of the web written "the right way" that would eventually entirely replace all existing web browsers. For example, forms were gone, frames were gone, and all presentational elements like <b> and <i> were gone in favor of semantic elements like <strong> and <samp> that made it possible for a page to be reasoned about automatically by a program. This required different processing from existing HTML and XHTML documents, but there was no way to differentiate between "old" and "new" documents, meaning no thought was given to adding XHTML 2.0 support to browsers that supported existing web technologies. Even by the mid-2000s, asking everyone to restart the web from scratch was obviously unrealistic compared to incrementally improving it. See here for a good overview of XHTML 2.0's failure from a web browser implementor's perspective: https://dbaron.org/log/20090707-ex-html
This really does feel like a job for autocomplete and/or generative AI tools.
Imagine if you were authoring and/or editing prose directly in HTML, as opposed to using some CMS. You're using your writing brain, not your coding brain. You don't want to think about code.
It's still a little annoying to put <p> before each paragraph, but not that much. By contrast, once you start adding closing tags, you're much closer to computer code.
I'm not sure if that makes sense but it's the way I think about it.
In the case of <br/> and <img/>, browsers will never use any content inside the tag, so using a closing tag doesn't make sense. The slash makes it much clearer though, so leaving it out is silly.
> why would you ever want to not close tags?
Because browsers close some tags automatically. And if your closing tag is wrong, it'll generate an empty element instead of being ignored, without even emitting a warning in the developer console. So by closing tags you risk introducing very subtle DOM bugs.
If you want to close tags, make sure that your build or test pipeline strictly validates the HTML it produces.
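As a rough idea of what such a check could look like, here is a sketch on top of Python's standard `html.parser`. It is not a real validator: `StrictTagChecker` and `check` are hypothetical names, and the rule it enforces (every non-void tag is closed, in order) is my own stricter-than-HTML assumption for a "close everything" style guide.

```python
from html.parser import HTMLParser

# Void elements never have content, so they are never pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class StrictTagChecker(HTMLParser):
    """Hypothetical lint helper: every non-void tag must be closed, in order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open non-void elements
        self.errors = []  # human-readable problems found so far

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # <br/> style: treated as opened and closed at once, nothing to track.
        pass

    def handle_endtag(self, tag):
        line, _ = self.getpos()
        if tag in VOID:
            self.errors.append(f"line {line}: </{tag}> on a void element")
        elif self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"line {line}: unexpected </{tag}>")


def check(markup):
    """Return a list of problems; an empty list means the markup passed."""
    checker = StrictTagChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors + [f"unclosed <{tag}>" for tag in checker.stack]


print(check("<p>ok</p><br/>"))
print(check("<p>a</p></p>"))
```

A stray `</p>` like the one in the second call is exactly the case that silently becomes an empty element in the browser, so failing the build on a non-empty result catches it before it ships.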
> I never understood why the more strict rules of XML for HTML never took off
Internet Explorer failing to support XHTML at all (which also forced everyone to serve XHTML with the HTML media type and avoid incompatible syntaxes like self-closing <script />), Firefox at first failing to support progressive rendering of XHTML, a dearth of tooling to emit well-formed XHTML (remember, those were the days of PHP emitting markup by string concatenation) and the resulting fear of pages entirely failing to render (the so-called Yellow Screen of Death), and a side helping of the WHATWG cartel^W organization declaring XHTML "obsolete". It probably didn't help that XHTML did not offer any new features over tag-soup HTML syntax.
I think most of those are actually no longer relevant, so I still kind of hope that XHTML could have a resurgence, and that the tag-soup syntax could be finally discarded. It's long overdue.