Hacker News

Lalabadie yesterday at 7:44 PM

For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

LLMs have a bias towards expertise and confidence due to the proportion of books in their training set. They also lean towards an academic writing style for the same reason.

All this to say, if LLMs write like you were already writing, it means you have very good foundations. It's fine to avoid them out of fear, but you have this Internet stranger's permission to use your em dash pause to think "Oh yeah, I'm the reference for writing style."


Replies

petters yesterday at 10:21 PM

I think that bias is due less to the proportion of books and more to how they are fine-tuned after pretraining.

the_af today at 1:28 AM

> For what it's worth, whatever LLMs do extensively, they do because it's a convention in well-established writing styles.

I think that's only part of the story. While it's true that what LLMs do is somehow represented in their training corpus, they also lack any understanding of how to adapt to the context, how to find a suitable "voice", and how not to overdo it, unless you explicitly prompt them otherwise, which is too much of a burden. Their default voice sucks, basically.

So let's say they learned to speak in Redditese. They don't know when not to speak in that voice. They always seem to be trying to make persuasive arguments, following patterns like "It's not X. It's Y. And you know it (mic drop)." But real humans don't speak like this all the damn time. If you speak like this to your mom or to your closest friends, you're basically an idiot.

It's not that you cannot speak like this. It's that you cannot do it all the time. And that's the real problem with LLMs.

(Sorry, couldn't resist!)

djhn yesterday at 8:01 PM

Aren’t books massively outweighed by the crawled internet corpus?
