Hacker News

r_lee · today at 2:37 PM · 1 reply

That 1M context thing: I wonder if it's just some abstraction where it compresses/summarizes parts of the context so they fit into a smaller context window?
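Something like this toy sketch is what I have in mind (summarize() here is just a stand-in for an LLM call, and whitespace token counting is a crude assumption):

    def count_tokens(text: str) -> int:
        return len(text.split())  # rough proxy; real systems use a tokenizer

    def summarize(turns: list[str]) -> str:
        # Placeholder for a real LLM summarization call.
        return f"[summary of {len(turns)} earlier turns]"

    def compact(history: list[str], budget: int) -> list[str]:
        """Fold the oldest turns into a summary until the history fits."""
        while sum(count_tokens(t) for t in history) > budget and len(history) > 1:
            cut = len(history) // 2  # summarize the older half, keep the rest verbatim
            history = [summarize(history[:cut])] + history[cut:]
        return history

    history = [f"turn {i}: " + "word " * 50 for i in range(10)]
    print(compact(history, budget=200))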


Replies

strongpigeon · today at 3:23 PM

You don’t normally compress the system prompt, though I guess the model might treat its own summary with more authority. This article [0] explains the problem well.
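To illustrate: a compaction pass would typically exempt system messages and inject the summary as an ordinary assistant turn, which is exactly where the authority question comes in (the message shape and summarize() helper here are assumptions, not any particular API):

    def compact_messages(messages: list[dict], summarize) -> list[dict]:
        system = [m for m in messages if m["role"] == "system"]  # kept verbatim
        rest = [m for m in messages if m["role"] != "system"]
        summary = {"role": "assistant", "content": summarize(rest)}
        return system + [summary] + rest[-2:]  # recent turns stay intact

    msgs = [
        {"role": "system", "content": "You are a careful assistant."},
        {"role": "user", "content": "a long question ..."},
        {"role": "assistant", "content": "a long answer ..."},
        {"role": "user", "content": "follow-up ..."},
    ]
    print(compact_messages(msgs, lambda ms: f"[summary of {len(ms)} turns]"))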

Though I think it's more likely that models tend to degrade on long contexts (which you can see experimentally). My guess is that they aren't RLed as much on long contexts, but that's just a guess.
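The degradation is easy to probe with a toy needle-in-a-haystack harness like this (query_model() is a placeholder for whatever API you're testing):

    def make_context(n_filler: int, needle: str, depth: float) -> str:
        filler = [f"Filler sentence number {i}." for i in range(n_filler)]
        filler.insert(int(depth * n_filler), needle)  # bury the fact at a given depth
        return " ".join(filler)

    def run_eval(query_model, lengths=(1_000, 10_000, 100_000)):
        needle = "The magic number is 7481."
        for n in lengths:
            hits = 0
            for depth in (0.1, 0.5, 0.9):
                ctx = make_context(n, needle, depth)
                answer = query_model(ctx + "\n\nWhat is the magic number?")
                hits += "7481" in answer
            print(f"{n} filler sentences: {hits}/3 retrieved")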

[0]: https://openai.com/index/instruction-hierarchy-challenge/