voxelghost • today at 2:11 AM • 0 replies

After a quick browse of the content, my understanding is that this is more like a very compressed diff vector: applied to a multi-billion-parameter model, it lets the model be 'retrained' to reason (score) better on a specific topic, e.g. math, which is what the paper used.
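
To make the "diff vector" idea concrete, here is a toy sketch. It is not the paper's method: the sparse index-to-change format and the name `apply_sparse_delta` are assumptions made up for illustration. It only shows the general shape of the idea, a small patch applied to a much larger set of weights.

```python
def apply_sparse_delta(base_weights, delta):
    """Return a patched copy of base_weights.

    base_weights: list of floats standing in for a model's parameters.
    delta: dict mapping parameter index -> change to add (hypothetical format).
    """
    patched = list(base_weights)  # copy so the base model stays untouched
    for index, change in delta.items():
        patched[index] += change
    return patched


# Toy "model" with 4 parameters; a real model would have billions.
base = [0.5, -1.0, 0.25, 2.0]

# The diff only touches 2 of the 4 parameters, so it is smaller than the model.
delta = {1: 0.5, 3: -1.0}

patched = apply_sparse_delta(base, delta)
print(patched)
```

The point is only that the patch can be stored and shipped separately from the base weights; how the paper actually compresses or learns its diff is not something this sketch claims to reproduce.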
