Re: that 1M context thing, I wonder if it's just an abstraction layer that compresses/summarizes parts of the context so it fits into a smaller effective window?
You don’t normally compress the system prompt, though I guess the model might treat its own summary with more authority. This article [0] discusses the problem well.
Though I feel it’s most likely just that models tend to degrade on large contexts (which can be shown experimentally). My guess is they aren’t RLed on long contexts as much, but that’s just a guess.
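A minimal sketch of what that compression might look like (all names, the budget, and the word-count "tokenizer" are my own assumptions; a real system would call the model itself to produce the summary, and would leave the system prompt untouched):

```python
def summarize(messages):
    # Placeholder: a real system would ask the model for a summary here.
    return "[summary of %d earlier turns]" % len(messages)

def compress_context(messages, budget, keep_recent=2):
    """Keep the most recent turns verbatim; fold older turns into one summary."""
    def cost(msgs):
        return sum(len(m.split()) for m in msgs)  # crude token estimate
    if cost(messages) <= budget or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(old)] + recent

history = ["turn %d: %s" % (i, "blah " * 50) for i in range(10)]
compact = compress_context(history, budget=200)
print(len(compact))  # → 3: one summary message plus the 2 recent turns
```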
[0]: https://openai.com/index/instruction-hierarchy-challenge/