Hacker News

Show HN: Zero-power photonic language model - code

6 points by damir00 today at 6:45 PM | 5 comments

The model uses a 1024-dimensional complex Hilbert space with 32 layers of programmable Mach–Zehnder meshes (Reck architecture) and derives token probabilities directly via the Born rule.

Despite using only unitary operations and no attention mechanism, a 1024×32 model achieves coherent TinyStories generation after < 1.8 hours of training on a single consumer GPU.
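The forward pass described above can be sketched in a few lines of NumPy. This is a hedged illustration, not the author's code: the random unitaries stand in for the trained Mach–Zehnder mesh settings, and the input state stands in for an embedded token.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n, rng):
    # QR decomposition of a complex Gaussian matrix gives a Haar-random unitary
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # normalize column phases

dim, layers = 1024, 32
# stand-ins for the 32 programmable mesh layers (Reck architecture)
unitaries = [random_unitary(dim, rng) for _ in range(layers)]

# normalized complex input state in the 1024-dimensional Hilbert space
state = rng.normal(size=dim) + 1j * rng.normal(size=dim)
state /= np.linalg.norm(state)

for U in unitaries:
    state = U @ state  # each layer is a lossless unitary transform

# Born rule: token probabilities are the squared amplitudes
probs = np.abs(state) ** 2
```

Because every layer is unitary, `probs` sums to 1 by construction, with no softmax needed.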

This is Part 1 - the next step is physical implementation with $50 of optics from AliExpress.


Comments

cpldcpu today at 11:00 PM

"Zero power" does not include the power needed to translate information between electronic and optical domains and the light source itself.

tliltocatl today at 8:26 PM

Stupid question - how is this even possible given that you lose information at each layer? And how does one implement a non-linear activation function without an amplifier of some sort?
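The non-linearity concern can be made concrete. This small NumPy sketch (not from the post) shows that a stack of purely unitary layers with nothing between them collapses to a single linear map applied once:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(n, rng):
    # Haar-random unitary via QR of a complex Gaussian matrix
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

# five unitary "layers" with no activation between them...
layers = [random_unitary(8, rng) for _ in range(5)]
x = rng.normal(size=8) + 1j * rng.normal(size=8)

out_layered = x
for U in layers:
    out_layered = U @ out_layered

# ...are equivalent to one combined unitary applied once
combined = np.linalg.multi_dot(layers[::-1])
out_single = combined @ x
```

So without some non-linear element, depth adds expressivity only through how the single product matrix is parameterized.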

bastawhiz today at 8:48 PM

This is a neat idea, but it's extremely light (no pun intended) on real details. Translating a simulation into real hardware that can do real computation in a reliable manner is properly hard. As much as I'd love to be an optimist about this project, I have to say I'll believe it when I see it actually running on a workbench.

If it does work, I think one of the biggest challenges will be adding enough complexity to it for it to do real, useful computation. Running the equivalent of GPT-2 is a cool tech demo, but if there's not an obvious path to scaling it up, it's a bit of a dead end.

ifuknowuknow today at 8:37 PM

meds