Oh, sounds interesting. I hadn't considered using a diffusion model for this. My current approach generates the document byte by byte with an autoregressive transformer, so I'm curious how a diffusion model would improve memorization or reconstruction quality.
Can you point me to something I can read? I really want to try this approach; a diffusion model does sound interesting for compression.