This is a novel re-interpretation of the Transformer, based on my previous research with a library called `arrowspace`.
It resembles what is sometimes called a "Grassmann-like flow", but without the Plücker embedding; it is also similar in spirit to DavisTensor, except that it relies on the spectral Laplacian rather than purely geometric distances.
The problem with much prior work is its focus on dense representations. This architecture focuses on sparse representations and provides a new approximation scheme based on energy-informed graphs.
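To make the "energy-informed graph" idea concrete, here is a minimal sketch, not `arrowspace`'s actual API: it builds a sparse k-NN graph over token vectors, derives a per-token Dirichlet energy from the normalized graph Laplacian, and uses the energy gap between tokens to modulate plain cosine similarities. The function name `energy_informed_scores` and the choice of k-NN graph and energy-gap weighting are illustrative assumptions, not the architecture's definition.

```python
# Hypothetical sketch: spectral-Laplacian "energy" used to weight similarities.
# Assumptions (not from the source): k-NN connectivity graph, normalized
# Laplacian, energy = per-token contribution to the Dirichlet energy
# Tr(X^T L X), and an exp(-|energy gap|) modulation of cosine similarity.
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph


def energy_informed_scores(X: np.ndarray, k: int = 5) -> np.ndarray:
    """Score token pairs by cosine similarity scaled by a spectral energy gap."""
    # Sparse k-NN adjacency over the token vectors, then symmetrized.
    A = kneighbors_graph(X, n_neighbors=k, mode="connectivity", include_self=False)
    A = 0.5 * (A + A.T)
    L = laplacian(A, normed=True)  # normalized graph Laplacian (sparse)

    # Per-token contribution to the total Dirichlet energy Tr(X^T L X),
    # normalized by the token's squared norm.
    LX = L @ X
    energy = np.einsum("ij,ij->i", X, LX) / (np.einsum("ij,ij->i", X, X) + 1e-12)

    # Cosine similarities, down-weighted when two tokens sit at very
    # different energy levels on the graph.
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    sim = Xn @ Xn.T
    gap = np.abs(energy[:, None] - energy[None, :])
    return sim * np.exp(-gap)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(32, 16))  # 32 tokens, 16-dim features
    S = energy_informed_scores(X, k=5)
    print(S.shape, float(S.min()), float(S.max()))
```

The point of the sketch is only the shape of the computation: the graph and its Laplacian are sparse, and the "energy" is a spectral quantity of the graph rather than a raw geometric distance.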