lmeyerov today at 5:34 AM

We have this issue in GFQL right now. We wrote the first OSS GPU Cypher query language impl, where we make a query plan of GPU-friendly collective operations... But today those steps are coordinated from Python, which has high constant overheads.

We are looking to shed some of the Python<->C++<->GPU overhead by pushing macro steps out of Python and into C++. However, it'd probably be way better to skip all the CPU<->GPU back-and-forth by coordinating the task queue on the GPU to begin with. It's 2026, so ideally we can use modern tools and type safety for this.

Note: I looked at the company's GitHub and didn't see any relevant OSS, which changes the calculus for a team like ours. Sustainable infra is hard!