Circa '99 a high fraction (50%-ish) of the HTML in the field was invalid, so anyone making a new web browser had to parse invalid HTML the same way Netscape did. That was one more reason we didn't get competitive web browsers.
HTML 5 specified exactly how "invalid" HTML is parsed, so now there is no such thing as invalid HTML. XHTML was one of those things that never quite worked:
> there is no such thing as invalid HTML
There is. Some things are still considered invalid, like nesting form elements.
(This doesn't take away from your argument though, since you were focusing on the parsing aspect.)
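
To make the parsing vs. validity distinction concrete, here is a rough sketch of a check for the nested-form case. It uses Python's standard-library `html.parser`, which is a simple tokenizer and not an HTML5 spec parser, so it only looks at the tag stream. The names `NestedFormChecker` and `nested_forms` are made up for illustration, and this is nowhere near a real conformance checker.

```python
from html.parser import HTMLParser


class NestedFormChecker(HTMLParser):
    """Hypothetical checker: records each <form> opened inside another <form>."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <form> elements currently open
        self.errors = []  # (line, column) of each nested <form> start tag

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            if self.depth > 0:
                # Still parseable, but not conforming markup.
                self.errors.append(self.getpos())
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "form" and self.depth > 0:
            self.depth -= 1


def nested_forms(markup):
    """Return the positions of <form> start tags that sit inside an open <form>."""
    checker = NestedFormChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


if __name__ == "__main__":
    print(nested_forms("<form><form></form></form>"))   # one position reported
    print(nested_forms("<form></form><form></form>"))   # nothing reported
```

As far as I understand the HTML5 parsing algorithm, a browser handed the first input still builds a well-defined tree (it treats the inner start tag as a parse error and ignores it), which is exactly why "how it parses" and "whether it is valid" are separate questions.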