schmorptron • today at 5:44 PM

What would a diffusing reasoning model look like? Would it have a [thinking] block of pre-defined length that gets diffused over a long time, with the final output block then using what is in that thinking block as part of its input? And how do diffusion models decide the output length in the first place? Is it a preset parameter, or does the model diffuse an [end] token into the middle somewhere?
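
To make the question concrete, here is a toy sketch of the setup described above: a fixed-length [thinking] block is unmasked over several steps, then a fixed-length output block is diffused with the thinking block as context and cut at the first [end] token. Everything here is an assumption for illustration. `toy_denoiser`, `diffuse_block`, and `truncate_at_end` are hypothetical names, the "model" just picks random tokens, and none of this describes how any real diffusion LLM is implemented.

```python
import random

MASK, END = "[mask]", "[end]"

def toy_denoiser(context, position, rng):
    """Hypothetical stand-in for a trained model: proposes a token for one slot.

    A real model would condition on `context` and `position`; this toy
    ignores both and samples at random.
    """
    return rng.choice(["a", "b", "c", END])

def diffuse_block(prefix, length, steps, rng):
    """Fill a preset-length block of [mask] slots over `steps` unmasking rounds."""
    block = [MASK] * length
    masked = list(range(length))
    rng.shuffle(masked)              # unmask in arbitrary order, not left to right
    per_step = -(-length // steps)   # ceiling division: slots revealed per round
    for _ in range(steps):
        for pos in masked[:per_step]:
            block[pos] = toy_denoiser(prefix + block, pos, rng)
        masked = masked[per_step:]
    return block

def truncate_at_end(block):
    """Keep only the tokens before the first [end], if one was diffused in."""
    return block[:block.index(END)] if END in block else block

rng = random.Random(0)
prompt = ["q"]
# Option 1 from the question: a fixed-length thinking block, diffused first.
thinking = diffuse_block(prompt, length=8, steps=4, rng=rng)
# The output block sees prompt + thinking; its max length is preset, and an
# [end] token landing mid-block (option 2) shortens the visible answer.
answer = truncate_at_end(diffuse_block(prompt + thinking, length=6, steps=3, rng=rng))
print(thinking, answer)
```

In this sketch both mechanisms coexist: the block length is a preset upper bound, and the [end] token decides how much of that budget is actually used.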

Replies

schmorptron • today at 5:47 PM

Got one answer by reading the rest of the comments. It makes sense that the diffusion process is inherently reasoning-like: https://www.inceptionlabs.ai/blog/introducing-mercury-2
