Hacker News

OkayPhysicist, yesterday at 5:33 PM

I suspect a lot of the em-dash usage also comes from transcriptions of verbal media. In speech, people frequently use the kinds of asides that call for an em-dash.


Replies

mrguyorama, yesterday at 6:59 PM

I would bet like a dollar that the supposed em-dash usage (which I'm not convinced is an accurate take in the first place) came from an enterprising dev somewhere being like "Well, we probably don't need multiple tokens for hyphens" and coercing every dash-type thing to just one hyphen-like token.

But I'm also showing off my ignorance of how these machines turn text into tokens in practice.
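
For illustration only, the kind of dash coercion guessed at above might look roughly like the sketch below. Everything in it is an assumption: the function name `normalize_dashes` and the chosen set of code points are hypothetical, and nothing here is known to reflect how any real tokenizer pipeline handles dashes.

```python
import re

# Hypothetical set of dash-like code points to collapse (an assumption,
# not taken from any real tokenizer): hyphen, non-breaking hyphen,
# figure dash, en dash, em dash, horizontal bar, minus sign.
DASH_LIKE = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
_DASH_RE = re.compile(f"[{DASH_LIKE}]")


def normalize_dashes(text: str) -> str:
    """Replace every dash-like character with an ASCII hyphen-minus."""
    return _DASH_RE.sub("-", text)


if __name__ == "__main__":
    sample = "an aside \u2014 like this \u2013 in text"
    print(normalize_dashes(sample))
```

A preprocessing step like this would run before tokenization, so the vocabulary would only ever see the plain hyphen. Whether anyone actually does that is exactly the open question.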
