If the model is trained to be an interpreter, does that mean the loss should reach 0 for it to be fully trained?
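For concreteness, here's a toy sketch of what I'm imagining (assuming a standard softmax + cross-entropy objective, where the target for each state is a single correct token):

```python
import torch
import torch.nn.functional as F

# Deterministic target: each input state has exactly one correct
# next token, so the "true" distribution is one-hot.
logits = torch.tensor([[10.0, 0.0, 0.0]])  # model strongly favors token 0
target = torch.tensor([0])                  # the single correct token

# Cross-entropy is -log p(correct token); with a softmax over finite
# logits, p < 1, so the loss gets arbitrarily small but never hits 0.
loss = F.cross_entropy(logits, target)
print(loss.item())  # ~9e-5, shrinking toward 0 as the logit gap grows
```

So maybe "fully trained" means the loss approaches 0 rather than exactly reaching it?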
Also, if its execution is purely deterministic, you probably don't need non-linearity in the layers, right?
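That is, could the stack just be linear maps? Something like this toy sketch (layer sizes are made up):

```python
import torch.nn as nn

# A purely linear stack with no activations between layers.
# (Hypothetical sizes; the question is whether a deterministic
# interpreter could get away with this.)
linear_only = nn.Sequential(
    nn.Linear(128, 256),
    nn.Linear(256, 256),  # no ReLU/GELU in between
    nn.Linear(256, 128),
)
```

(One thing I'm unsure about: without activations, the whole stack composes into a single linear map, so this would only work if each interpretation step is itself a linear function of the input.)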