That's a huge resource cost though, and simply unnecessary. We should be building semantically valid HTML from the beginning rather than leaning on a GPU cluster to infer function from the entire HTML, CSS, and JS on the page (or from a screenshot, which requires image parsing by a word predictor).