This thing is very impressive.
The problem it solves is efficiently calculating the height of some wrapped text on a web page, without actually rendering that text to the page first (very expensive).
It does that by pre-calculating the width/height of individual segments - think words - and caching those. Then it reimplements, in custom code, the full algorithm browsers use to wrap those segments into lines.
This is absurdly hard because of the many different types of wrapping and characters (hyphenation, emoji, Chinese, etc) that need to be taken into account - plus the fact that different browsers (in particular Safari) have slight differences in their rendering algorithms.
The resulting library is tested against real browsers using a wide variety of long text documents; see https://github.com/chenglou/pretext/tree/main/corpora and https://github.com/chenglou/pretext/blob/main/pages/accuracy...
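To make the measure-then-cache-then-wrap idea concrete, here is a rough sketch of my own. It is not pretext's actual API or algorithm: every name in it (`cachedMeasure`, `countLines`, `textHeight`, the fixed 8px-per-character measurer) is hypothetical, and it only handles greedy wrapping of space-separated words, with none of the hyphenation, emoji, CJK, or per-browser handling that makes the real problem hard. In a browser you would presumably swap the stand-in measurer for something like canvas `measureText`.

```typescript
// Hypothetical sketch: these names are illustrative, not pretext's API.
type Measure = (segment: string) => number;

// Wrap a measurer so each distinct segment is measured only once.
function cachedMeasure(measure: Measure): Measure {
  const cache = new Map<string, number>();
  return (segment: string): number => {
    let width = cache.get(segment);
    if (width === undefined) {
      width = measure(segment);
      cache.set(segment, width);
    }
    return width;
  };
}

// Greedy line breaking over space-separated words; returns the line count.
function countLines(text: string, maxWidth: number, measure: Measure): number {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  if (words.length === 0) return 0;
  const spaceWidth = measure(" ");
  let lines = 1;
  let lineWidth = 0;
  for (const word of words) {
    const wordWidth = measure(word);
    if (lineWidth === 0) {
      // First word on a line is always placed, even if it overflows.
      lineWidth = wordWidth;
    } else if (lineWidth + spaceWidth + wordWidth <= maxWidth) {
      lineWidth += spaceWidth + wordWidth;
    } else {
      lines += 1;
      lineWidth = wordWidth;
    }
  }
  return lines;
}

// Height is just line count times a fixed line height (an assumption).
function textHeight(
  text: string,
  maxWidth: number,
  lineHeight: number,
  measure: Measure,
): number {
  return countLines(text, maxWidth, measure) * lineHeight;
}

// Stand-in measurer (assumption): every character is 8px wide.
let measureCalls = 0;
const fixedWidth: Measure = (s: string): number => {
  measureCalls += 1;
  return s.length * 8;
};

const measure = cachedMeasure(fixedWidth);
const height = textHeight(
  "the quick brown fox jumps over the lazy dog",
  80, // container width in px
  20, // line height in px
  measure,
);
console.log(height, measureCalls);
```

The point of the cache is that the repeated word ("the") is only measured once, and relayout at a different width needs no new measurements at all, only the cheap arithmetic in `countLines`.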
> This thing is very impressive.
Agreed! Text layout engines are stupidly hard. You start out thinking "It's a hard task, but I can do it" and then 3 months later you find yourself in a corner screaming "Why, Chinese? Why do you need to rotate your punctuation differently when you render in columns??"
This effort feeds back to the DOM, making it far more useful than my efforts, which are confined to rendering multiline text on a canvas - for example: https://scrawl-v8.rikweb.org.uk/demo/canvas-206.html
I struggled so much to measure text and count lines when creating dynamic subtitles for Remotion videos; I'm not sure whether that was my incompetence or a complexity of the DOM itself. I feel hopeful this will make it much easier :-)
> The problem it solves is efficiently calculating the height of some wrapped text on a web page, without actually rendering that text to the page first (very expensive).
But in the end, in a browser, the actual text rendering is still done by the browser?
It's a library that lets you "do stuff" before the browser renders the actual text, while the browser still does the final rendering of that text eventually?
Or is this thing actually doing the final rendering of the text too?
i wrote something similar for this purpose, but much simpler and in 2kb, without AI, about a year ago.
uWrap.js: https://news.ycombinator.com/item?id=43583478. it did not reach 11k stars overnight, tho :D
for ASCII text, mine finishes in 80ms, while pretext takes 2200ms. i haven't yet checked pretext for accuracy (how closely it matches the browser), but will test tonight - i expect it will do well.
let's see how close pretext can get to 80ms (or better) without adopting the same tricks.
https://github.com/chenglou/pretext/issues/18
there are already significant perf improvement PRs open right now, including one done using autoresearch.