This seems to have some potential, but it's pretty much useless as it stands.
Shame there are no weights released, let alone the "compiler" tool they used to actually synthesize computational primitives into model weights. It seems like a "small model" system amenable to low-budget experiments, and I would love to see how far this approach can be pushed.
I disagree with the core premise; it's basically the old neurosymbolic garbage restated. But embedding predefined computational primitives into LLMs could have some uses nonetheless.
What's "the old neurosymbolic garbage"?
If you want to experiment with hardcoding small programs into transformer weights, maybe try ALTA: https://arxiv.org/abs/2410.18077v2
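If the phrase "hardcoding programs into weights" sounds abstract, here's a toy illustration of the idea in the small. This has nothing to do with ALTA's actual compiler; it just hand-sets the weights of a two-layer ReLU MLP so it computes XOR exactly, with no training involved:

```python
def relu(x):
    return max(x, 0.0)

# Hand-set weights for a 2-2-1 ReLU MLP that computes XOR on {0,1} inputs.
# Hidden pre-activations are (a + b) and (a + b - 1); after ReLU,
# output = h0 - 2*h1 equals a XOR b for all four binary input pairs.
W1 = [[1.0, 1.0],   # input a -> both hidden units
      [1.0, 1.0]]   # input b -> both hidden units
B1 = [0.0, -1.0]    # hidden biases
W2 = [1.0, -2.0]    # hidden -> output

def xor_mlp(a, b):
    h = [relu(a * W1[0][j] + b * W1[1][j] + B1[j]) for j in range(2)]
    return h[0] * W2[0] + h[1] * W2[1]
```

The "compiler" framing is the same move scaled up: instead of hand-deriving weights for one boolean function, a tool emits weight matrices that implement a whole program over attention and MLP layers.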