seanhunter • today at 3:49 PM

I read that too, but I wondered whether elementwise error is the right metric. Surely the real test is to compare the performance of a conventional transformer model against the same model with the attention mechanism replaced by this 4th-order Taylor approximation?

Replies

vlovich123 • today at 4:27 PM

A bounded error on the weights is by definition a stricter evaluation criterion than “performance” metrics obtained by running the model.
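
To make the two metrics concrete, here is a rough sketch rather than anything from the work under discussion. It assumes the approximation swaps `exp` in softmax for its 4th-order Taylor polynomial around 0, and that attention scores and value entries lie in [-1, 1]; both are assumptions for illustration, and every function name is hypothetical. It compares the maximum elementwise error on the attention weights with the error on a single layer's output. It does not measure end-to-end model performance.

```python
import math
import random


def taylor_exp4(x):
    """4th-order Taylor polynomial of exp(x) around 0 (assumed form of the approximation)."""
    return 1.0 + x + x**2 / 2.0 + x**3 / 6.0 + x**4 / 24.0


def attention_weights(scores, exp_fn):
    """Normalize exp_fn(score) so each row sums to 1 (plain softmax when exp_fn is math.exp)."""
    rows = []
    for row in scores:
        e = [exp_fn(s) for s in row]
        total = sum(e)
        rows.append([v / total for v in e])
    return rows


def apply_weights(weights, values):
    """Weighted sum of the value vectors for each query row."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two equally shaped matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


random.seed(0)
n, dim = 8, 4
# Hypothetical toy inputs: scores and values drawn uniformly from [-1, 1].
scores = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
values = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]

exact_w = attention_weights(scores, math.exp)
approx_w = attention_weights(scores, taylor_exp4)

# Metric 1: elementwise error on the attention weights.
weight_err = max_abs_diff(exact_w, approx_w)
# Metric 2: error on the layer output computed from those weights.
output_err = max_abs_diff(
    apply_weights(exact_w, values), apply_weights(approx_w, values)
)
print("max weight error:", weight_err)
print("max output error:", output_err)
```

With value entries bounded by 1, the single-layer output error cannot exceed `n` times the elementwise weight error, which is one sense in which a bound on the weights is the stricter criterion. Whether that carries through a full trained model is the question the parent comment raises.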

alt Hacker News
