Hacker News

rmunn, last Saturday at 9:07 PM

Emojis mixed with ASCII-era characters are hard to get right. Some terminal emulators get it right nearly all the time (e.g. Ghostty, which has had a lot of thought and effort put into this), and yet there are still open issues in the Ghostty repo about inconsistent character width. There are just too many corner cases.

That said, the edge misalignment is, I believe, caused by the fact that LLMs are involved in the process: the LLMs never "see" the final visual representation that humans see. Their "view" of the world is text-based, and in the text file those columns line up because each row has the same number of Unicode code points. So the LLMs do not realize that the right edges are misaligned visually. (And since the workflow described is for an LLM to take that text file as input and produce an output in React/Vue/Svelte/whatever, the alignment of the text file needs to stay LLM-oriented for it to work properly. That's my assumption, of course, since I haven't tried this myself.)
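To make the mismatch concrete, here is a minimal Python sketch, assuming the renderer draws East Asian Wide/Fullwidth code points in two columns and everything else in one. The helper name `display_width` and the sample rows are hypothetical, and the sketch ignores ZWJ sequences and variation selectors, so real terminals will differ in the corner cases.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate terminal columns for text (illustrative helper, not a full wcwidth)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Assumption: Wide (W) and Fullwidth (F) code points occupy two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

# Two hypothetical table rows with the same number of code points.
row_ascii = "| ok   |"
row_emoji = "| \U0001F680    |"  # rocket emoji followed by four spaces

print(len(row_ascii), len(row_emoji))                      # code point counts
print(display_width(row_ascii), display_width(row_emoji))  # approximate columns
```

Both rows have the same `len()`, which is roughly what a text-only view "sees", but the emoji row comes out one column wider when drawn.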


Replies

kevin_thibedeau, last Saturday at 10:28 PM

Emoji are treated like double-width characters. All it takes is a Unicode-aware layout algorithm that tracks double-width code points. The tricky part is older single-width symbols that were originally not emoji and now have ambiguous width depending on the terminal environment's default presentation mode.
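A rough sketch of such a layout step in Python, using only the standard library's `unicodedata`. The names `cell_width`, `pad_to_width`, and the `ambiguous_wide` switch are hypothetical. As a stand-in for the environment-dependent cases, the switch widens the East Asian Ambiguous class; it does not model text-versus-emoji presentation selectors or ZWJ sequences, which a real implementation would also need.

```python
import unicodedata

def cell_width(ch: str, ambiguous_wide: bool = False) -> int:
    """Columns for one code point (illustrative, not a complete wcwidth)."""
    if unicodedata.combining(ch):
        return 0  # combining marks stack on the previous cell
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2  # Wide and Fullwidth: two columns
    if eaw == "A" and ambiguous_wide:
        return 2  # Ambiguous: depends on the environment (assumed switch)
    return 1

def pad_to_width(text: str, columns: int, ambiguous_wide: bool = False) -> str:
    """Right-pad text with spaces so it fills the requested number of columns."""
    used = sum(cell_width(ch, ambiguous_wide) for ch in text)
    return text + " " * max(0, columns - used)

# Pad by columns, not by code point count, so the right edge lines up.
for label in ("ok", "\U0001F680", "\u00b1"):
    print("|" + pad_to_width(label, 4) + "|")
```

Padding by column count rather than by `len()` is the whole trick; the hard part remains deciding what width the ambiguous symbols get in a given terminal.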
