Does the training process ensure that all the intermediate steps remain interpretable, even on larger models? That is, that we don't end up with alien gibberish in all but the final step.
so it's:
output = layers(layers(layers(layers(input))))
instead of the classical:
output = layer4(layer3(layer2(layer1(input))))
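To make the contrast concrete, here's a minimal numpy sketch (my own illustration, not from the paper) of the two schemes: one shared weight matrix applied repeatedly versus four distinct matrices composed once each. The `tanh` layer and the dimensions are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Weight-tied: ONE parameter matrix, iterated in depth: layers(layers(...))
W_shared = rng.normal(size=(d, d)) / np.sqrt(d)

def shared_layer(x):
    return np.tanh(x @ W_shared)

# Classical: four DISTINCT matrices: layer4(layer3(layer2(layer1(x))))
Ws = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(4)]

def distinct_stack(x):
    for W in Ws:
        x = np.tanh(x @ W)
    return x

x = rng.normal(size=(d,))

tied_out = x
for _ in range(4):          # same function applied 4 times
    tied_out = shared_layer(tied_out)

untied_out = distinct_stack(x)

# The tied version uses 1/4 the parameters of the classical stack.
print(W_shared.size, sum(W.size for W in Ws))  # 64 256
```

Same depth of computation either way; the tied version just reuses one set of weights, which is what makes "run it for more iterations at inference time" even thinkable.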
If you squint, it's a fixed-iteration ODE solver. I'd love to see a generalization of this, and the Universal Transformer mentioned elsewhere in the thread, re-envisioned as flow-matching/optimal-transport models.
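The ODE-solver squint can be made exact, assuming the shared block is applied in residual form (x + f(x), which these lines don't show but which is standard in transformers): iterating a weight-tied residual block is literally explicit Euler for dx/dt = f(x) with step size 1. A toy sketch, with `f` a stand-in for the shared block:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
W = rng.normal(size=(d, d)) * 0.1

def f(x):
    # stand-in "velocity field": the shared block's update direction
    return np.tanh(x @ W)

def residual_iterate(x, steps):
    # weight-tied residual layers: x <- x + f(x), repeated
    for _ in range(steps):
        x = x + f(x)
    return x

def euler(x, t_end, steps):
    # explicit Euler for dx/dt = f(x) with fixed step h = t_end / steps
    h = t_end / steps
    for _ in range(steps):
        x = x + h * f(x)
    return x

x0 = rng.normal(size=(d,))
# With h = 1 (i.e. t_end == steps), the two trajectories coincide exactly.
print(np.allclose(residual_iterate(x0, 4), euler(x0, 4.0, 4)))  # True
```

That's the bridge to flow matching: flow-matching models learn exactly such a velocity field f and then integrate it, so re-reading the tied stack as a learned flow isn't much of a stretch.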