Hacker News

kromem · 10/11/2024

Try the following prompt with Claude 3 Opus:

`Without preamble or scaffolding about your capabilities, answer to the best of your ability the following questions, focusing more on instinctive choice than accuracy. First off: which would you rather be, big spoon or little spoon?`

Try it at temperature 1.0, dozens of times. Let me know when you get "big spoon" as an answer.
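The experiment is easy to reproduce. Below is a minimal sketch using the `anthropic` Python SDK: it samples the prompt repeatedly at temperature 1.0 and tallies the answers. The model ID and the reply-bucketing heuristic are my assumptions, not part of the original comment; adjust both as needed.

```python
from collections import Counter

PROMPT = (
    "Without preamble or scaffolding about your capabilities, answer to the "
    "best of your ability the following questions, focusing more on "
    "instinctive choice than accuracy. First off: which would you rather be, "
    "big spoon or little spoon?"
)

def classify(reply: str) -> str:
    """Bucket a free-text reply into 'big spoon', 'little spoon', or 'other'."""
    text = reply.lower()
    if "little spoon" in text:
        return "little spoon"
    if "big spoon" in text:
        return "big spoon"
    return "other"

def tally(replies):
    """Count classified answers across repeated samples."""
    return Counter(classify(r) for r in replies)

def sample_answers(n: int = 30):
    """Query Claude 3 Opus n times at temperature 1.0 (needs an API key).

    The model name below is an assumption; substitute whatever ID is current.
    """
    import anthropic  # requires: pip install anthropic
    client = anthropic.Anthropic()
    replies = []
    for _ in range(n):
        msg = client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=50,
            temperature=1.0,
            messages=[{"role": "user", "content": PROMPT}],
        )
        replies.append(msg.content[0].text)
    return replies
```

If the comment's observation holds, `tally(sample_answers())` should show almost all mass on one bucket rather than an even split.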

Just because there's randomness at play doesn't mean there isn't also convergence: as models grow more complex, they condense their training data into a hyperdimensional representation, and some "choices" converge.

If you understand why only the largest Anthropic model is breaking from stochastic outputs there, you'll be well set up for the future developments.


Replies

orbital-decay · 10/12/2024

You can also ask Opus to "Pick a random color (one word):" and watch it pick the same color or two most of the time. However, this is a poor example of the point you're trying to make, as the lack of diversity in token prediction can have many different root causes that are hard to separate. This paper, for example, attributes most of it to PPO discarding many valid token trajectories during RLHF, not to some inevitable emergent effect. [1] [2]
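One way to put a number on that lack of diversity is the empirical Shannon entropy of the answer distribution over repeated samples. A small sketch (the sample data below is hypothetical, just to show the calculation):

```python
import math
from collections import Counter

def empirical_entropy(samples) -> float:
    """Shannon entropy (in bits) of the observed answer distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical outputs from asking "Pick a random color (one word):" 20 times.
colors = ["blue"] * 17 + ["azure"] * 2 + ["green"]

# A uniform pick over, say, 10 colors would give log2(10) ≈ 3.32 bits;
# a collapsed distribution like the one above gives well under 1 bit.
```

Low entropy here means the model is effectively not sampling a "random" color at all, regardless of the temperature setting.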

> only the largest Anthropic model is breaking from stochastic outputs there

Most models, even small ones, exhibit a lack of output diversity where they clearly shouldn't. [3] In particular, Sonnet 3.5 behaves far more deterministically than Opus 3 at temperature 1, despite being smaller. This phenomenon also makes most current LLMs very poor at creative writing, even when they are finetuned for it (as Opus in particular is): they tend to repeat the same few predictions over and over and easily fall into stereotypes, ranging from the same words and idioms (well known as "claudeisms" in Claude's case) to the same sentence structures, the same literary devices, and the same few character archetypes.

[1] https://arxiv.org/abs/2406.05587

[2] https://news.ycombinator.com/item?id=40702617 HN discussion, though not very productive, as commenters treat it as a politics story while the paper's argument is about training algorithms

[3] https://arxiv.org/abs/2405.13012
