They also advertised a 36,000x speedup over equivalent Python if I remember correctly, without at any point clarifying that this could only be true in extreme edge cases. Feels more like a pump-dump cryptography scheme than an honest attempt to improve the Python ecosystem.
Well... the article made self deprecating fun of the click bait title, showed the code every step of the way, and actually did achieve the claim (albeit with wall clock time, not CPU/GPU time).
And it wasn't "equivalent python", whatever that means, they did loop unrolling and SIMD and stuff. That can't be done in pure python at all, so there literally is no equivalent python.