Hacker News

jasonjmcghee yesterday at 8:14 PM

I think the presence of em dashes is a very poor metric for determining if something is AI generated. I'm not sure why it's so popular.



exmadscientist yesterday at 8:22 PM

For me it is that they are wrongly used in this piece. Em dashes as appositives have the feel of interruption—like this—and are to be used very sparingly. They're a big bump in the narrative's flow, and are to be used only when you want a big bump. Otherwise appositives should be set off with commas, when the appositive is critical to the narrative, or parentheses (for when it isn't). Clause changes are similar—the em dash is the biggest interruption. Colons have a sense of finality: you were building up to this: and now it is here. Semicolons are for when you really can't break two clauses into two sentences with a full stop; a full stop is better most of the time. Like this. And so full stops should be your default clause splice when you're revising.

Having em-dashes everywhere—but each one or pair is used correctly—smacks of AI writing—AI has figured out how to use them, what they're for, and when they fit—but has not figured out how to revise text so that the overall flow of the text and overall density of them is correct—that is, low, because they're heavy emphasis—real interruptions.

(Also the quirky three-point bullet list with a three-point recitation at the end with bolded leadoffs to each bullet point and a final punchy closer sentence is totally an AI thing too.)

But, hey, I guess I fit the stereotype!—I'm in Seattle and I hate AI, too.

NewsaHackO yesterday at 8:41 PM

I think it's because it is difficult to actually type an em dash on a keyboard (except, I've heard, on Macs). So either they 1) memorized the em dash alt code, 2) set up a keyboard shortcut for it, or 3) are using the character map to insert it every time, all of which are a stretch for a random online post.

jakubmazanec yesterday at 8:44 PM

Related article posted here https://news.ycombinator.com/item?id=46133941 explains it: "Within the A.I.’s training data, the em dash is more likely to appear in texts that have been marked as well-formed, high-quality prose. A.I. works by statistics. If this punctuation mark appears with increased frequency in high-quality writing, then one way to produce your own high-quality writing is to absolutely drench it with the punctuation mark in question. So now, no matter where it’s coming from or why, millions of people recognize the em dash as a sign of zero-effort, low-quality algorithmic slop."

mips_avatar yesterday at 8:17 PM

So the funny thing is em dashes have always been a great trick to help your writing flow better. I guess GPT-4o figured this out in RLHF and now it's everywhere.