Hacker News

Lerc | 01/15/2026 | 0 replies

Thank you very much for this description.

If I understand it, in a nutshell: if the gradient is the slope, the Hessian is the curvature.

And Jacobians let you know how much each weight contributed to the blue component of something identified as a big blue cat.

I think.
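To check that reading against something concrete, here is a rough sketch in plain Python using finite differences. Everything in it is made up for illustration: `loss` is a toy scalar function of two weights, `outputs` is a hypothetical model returning (red, green, blue) values, and the helper names are mine rather than any library's API. A real setup would use automatic differentiation instead of numerical differences.

```python
def loss(w):
    # Toy scalar loss over two weights (hypothetical, for illustration only).
    return w[0] ** 2 + 3 * w[0] * w[1] + 2 * w[1] ** 2


def outputs(w):
    # Hypothetical model with three outputs, read as (red, green, blue).
    return [w[0] * w[1], w[0] + w[1], w[1] ** 2]


def gradient(f, w, h=1e-5):
    # Central-difference estimate of the slope of a scalar function f
    # along each weight.
    grad = []
    for i in range(len(w)):
        up = list(w)
        down = list(w)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def jacobian(f, w, h=1e-5):
    # Row j is the gradient of output j: how much each weight moves
    # that one output component.
    n_out = len(f(w))
    return [gradient(lambda v, j=j: f(v)[j], w, h) for j in range(n_out)]


def hessian(f, w, h=1e-4):
    # The Hessian is the Jacobian of the gradient: how the slope itself
    # changes as each weight changes, i.e. the curvature.
    return jacobian(lambda v: gradient(f, v, h), w, h)


w = [1.0, 2.0]
slope = gradient(loss, w)
curvature = hessian(loss, w)
blue_row = jacobian(outputs, w)[2]  # contribution of each weight to "blue"
```

For this toy case the analytic answers are a gradient of [8, 11], a Hessian of [[2, 3], [3, 4]], and a blue row of [0, 4], so only the second weight contributes to the blue component. The numerical estimates should agree with those up to small floating-point error.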

Jacobians look like they could be used to train concept splitters. For instance, if an LLM has a grab bag of possible conversation paths, the final embedding would carry information for each path. Once a selection is made, the embedding could be filtered down to that path, which would be beneficial for chain of thought that feeds back the filtered embedding instead of the predicted token. I have always wondered how much thinking in embedding space carries around remnants of conversation paths not taken.