Hacker News

jgreid, today at 4:17 PM (2 replies)

Isn't this simply context pruning/optimization?


Replies

kylemaxwell, today at 4:21 PM

From the abstract, it looks like it's actually doing something deeper, updating weights in part of the model?

colechristensen, today at 4:34 PM

No, they're actually training weights on the context before it gets compacted. Context is still context; what's different is that the model is split into persistent weights and malleable ones that are periodically updated.
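
To make the distinction from plain context pruning concrete, here is a rough toy sketch of the idea as described in this comment. It is not the paper's method: the linear model, the `compact` function, the learning rate, and the data are all made up for illustration. The only point is that the context gets absorbed into a small set of malleable weights by gradient steps before it is dropped, while the persistent weights are never touched.

```python
# Toy illustration only; every name and number here is an assumption,
# not something taken from the paper under discussion.

def predict(frozen, fast, x):
    # Effective weight is persistent (frozen) plus malleable (fast).
    return sum((f + d) * xi for f, d, xi in zip(frozen, fast, x))

def mse(frozen, fast, context):
    # Mean squared error of the model on the (input, target) pairs in context.
    return sum((predict(frozen, fast, x) - y) ** 2 for x, y in context) / len(context)

def compact(frozen, fast, context, lr=0.05, steps=200):
    """Train only the malleable weights on the context, then drop the context."""
    fast = list(fast)  # copy so the caller's list is not mutated
    n = len(context)
    for _ in range(steps):
        grad = [0.0] * len(fast)
        for x, y in context:
            err = predict(frozen, fast, x) - y
            for i, xi in enumerate(x):
                grad[i] += 2 * err * xi / n
        # Plain gradient descent step; frozen weights are never updated.
        fast = [d - lr * g for d, g in zip(fast, grad)]
    return fast, []  # updated malleable weights and an emptied context

frozen = [1.0, -1.0]   # persistent weights
fast = [0.0, 0.0]      # malleable weights, start at zero
context = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.5), ([1.0, 1.0], 2.5)]

loss_before = mse(frozen, fast, context)
new_fast, remaining = compact(frozen, fast, context)
loss_after = mse(frozen, new_fast, context)
print(loss_before, loss_after, remaining)
```

With pure pruning, the information in `context` would simply be gone once it is cut. Here the model still fits those examples after `remaining` is empty, because the information moved into `new_fast`.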
