Right, and what happens at that limit is most exciting! A model that has a cross entropy at that limit for a data stream of text, produces a stream of text that is both theoretically and practically indistinguishable from the original stream.
And so if the datastream has been produced by something intelligent, the resulting model is indistinguishable from that intelligence. That is the whole compression idea behind artificial intelligence.
The limit is not a bug, it's a feature!
It’s a bit like saying copy and paste must be intelligent because its output is indistinguishable from intelligent output. Reproduction of intelligence has never been equivalent to intelligence- that’s why we look down on copying, plagiarism or derivative work.