Yeah this simply wouldn't work. Models don't have any concept of "themselves". T...

qeternity • 01/21/2025 • 0 replies • view on HN

Yeah this simply wouldn't work. Models don't have any concept of "themselves". These are just large matrices of floating points that we multiply together to predict a new token.

The context size would have to be in the training data which would not make sense to do.

alt Hacker News