This is impressive. How do people handle the limited 64k-token context window?
Same as back in the "old days" when GPT-4 was 8k and LLaMA was 2k: chunking, RAG, etc., then cross your fingers and hope it all works reasonably well.
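Roughly, the chunk-then-retrieve idea looks like the sketch below. This is just an illustration, not anyone's actual pipeline: TF-IDF stands in for a real embedding model so the example stays self-contained, and the input file name is made up.

```python
# Minimal sketch of chunk-then-retrieve: split a long document into
# fixed-size chunks, score each chunk against the query, and keep only
# the top few so the prompt fits in the model's context window.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def chunk(text: str, size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character-based chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def top_k_chunks(document: str, query: str, k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query."""
    chunks = chunk(document)
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(chunks + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    best = scores.argsort()[::-1][:k]
    return [chunks[i] for i in best]

# The selected chunks get pasted into the prompt, and you hope the
# relevant passage actually made it in.
document = open("long_report.txt").read()  # hypothetical input file
question = "What were Q3 revenues?"
context = "\n---\n".join(top_k_chunks(document, question))
prompt = f"Context:\n{context}\n\nQuestion: {question}"
```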
By using o1