The AIs aren't using em-dashes because they're "massively represented in the training data". I don't understand why people assume everything in a model's output is directly determined by its frequency in the pretraining data.
They're em-dashing because the style guide used in post-training makes them em-dash. Just like the post-training for GPT-3.5 made it speak African English, and the post-training for 4o makes it say stuff like "it's giving wild energy when the vibes are on peak" plus a bunch of random emoji.
> Just like the post-training for GPT-3.5 made it speak African English
This is a misunderstanding. At best, some people thought GPT-3.5's output resembled African English; resemblance isn't evidence that post-training deliberately produced it.