What is a bad idea? Allowing reasoning to happen in continuous space instead of discrete token space? This paper can be seen as a variant of the Coconut models (continuous chain of thought). Continuous reasoning is certainly more efficient when it works. Lack of interpret ability makes certain safety systems harder to enforce. Is that your point?
Yes. Coconut has the same issue. See also: a joint statement by researchers from several labs about CoT monitorability: https://arxiv.org/abs/2507.11473