Attention is calculated during the forward pass of the model, which happens in both inference (forward only) and training (forward & backward).
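A minimal PyTorch sketch of that distinction (the tensor shapes and the summed "loss" are placeholders for illustration, not from the original):

```python
import torch
import torch.nn.functional as F

# Toy scaled dot-product attention: computed entirely in the forward pass.
def attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

q = torch.randn(1, 4, 8, requires_grad=True)
k = torch.randn(1, 4, 8, requires_grad=True)
v = torch.randn(1, 4, 8, requires_grad=True)

# Inference: forward pass only, no gradients tracked.
with torch.no_grad():
    out = attention(q, k, v)

# Training: the same forward pass, then a backward pass through it.
out = attention(q, k, v)
loss = out.sum()   # placeholder loss for illustration
loss.backward()    # backward pass computes gradients w.r.t. q, k, v
```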
Dumb question: Can inference be done in a reverse pass? Outputs predicting inputs?