Yes, this work is focused on accelerating very small models, typically for real-time systems that require extremely low power or low latency.
One primary application of this work is high-energy physics (https://home.cern/smarter-decisions-at-the-speed-of-collisio...). Ultrafast, real-time learning also applies to problems in quantum computing, plasma control, etc. (https://arxiv.org/pdf/2602.02005).
I'm not in HFT, but I assume it would be another interesting application domain?