Hacker News

sdpmas today at 12:37 AM

Diffusion is promising, but it's still an open question how much more data-efficient it is than AR. In practice, you can also train AR forever with high enough regularization, so let's see.


Replies

_0ffh today at 12:39 AM

Yes, it could go either way of course.

Still, just for reference, here's the paper I remembered: https://arxiv.org/pdf/2507.15857
