Hacker News

cubefox, last Sunday at 4:47 PM

DeepSeek-v3.2 should be better for long context because it uses (near-linear) sparse attention.