Every time I see the em-dash call out on here I get defensive because I’ve been writing like that forever! Where do people think that came from anyway? It’s obviously massively represented in the training data!
Where's the emdash key on your keyboard?
There isn't one?
Oh, maybe that's why people who didn't already know or care about emdashes are very alert to their presence.
If you have to do something very exotic with keypresses or copypaste from a tool or build your own macro to get something like an emdash, it's going to stand out, even if the character is an integral part of standard operating systems.
The AIs aren't using emdashes because they're "massively represented in the training data". I don't understand why people think everything in a model output is strictly related to its frequency in pretraining.
They're emdashing because the style guide for post-training makes it emdash. Just like the post-training for GPT-3.5 made it speak African English and the post-training for 4o makes it say stuff like "it's giving wild energy when the vibes are on peak" plus a bunch of random emoji.