The specific NPU doesn't seem to be mentioned in TFA, but my guess is that the blessed way to deal with it is the Neon SDK: https://www.arm.com/technologies/neon
I've not found Neon to be fun or easy to use, and I frequently see devices ignoring the NPU and inferring on CPU because it's easier. Maybe you get lucky and someone has made a backend for something specific you want, but it's not common.
TFA does directly mention the NPU "Arm-China Zhouyi: 30 TOPS (Dedicated)"
"you cannot simply use standard versions of PyTorch or TensorFlow out of the box. You must use the NeuralONE AI SDK."
Neon is a SIMD instruction set for the CPU, not a separate accelerator. It doesn't need an SDK to use, it's supported by compiler intrinsics and assembly language in any modern ARM compiler.