Yes, it is, but I can imagine that they want to start out a bit smaller to see how well things scale, and/or have not yet had the time to optimize for large context windows.
I struggle to get quality results from the frontier models at contexts > 256k anyway.