logoalt Hacker News

kingstnapyesterday at 1:05 PM0 repliesview on HN

You do that to make things smoother when plotted. You could in theory add some crazy stairstep that adds a hundred to the middle part. It would make your loss curves spike and increase towards convergence but then those spikes are just visual artifacts from doing weird discontinuous nonsense with yoru loss.