We need benchmarks that can distinguish between continuous learning and long-context extrapolation.

jballanc • yesterday at 10:09 PM • 0 replies
