Hacker News

Sirius DB

137 points by manoji last Wednesday at 5:54 AM | 17 comments

Comments

adrianco last Saturday at 6:22 PM

I’ve talked to the authors of this; it’s a very interesting project. GPU memory space used to be the limitation, but the latest generations of GPUs have enormous shared memory capacity and need something like SiriusDB to manipulate and prepare data in place before the AI algorithms get to work.

manoji last Saturday at 7:49 PM

It's sitting at the top of ClickBench. Pretty cool: https://benchmark.clickhouse.com/#system=-&type=-&machine=-c...

tobefranklin last Saturday at 6:47 PM

There is also a recent blog post about this: https://developer.nvidia.com/blog/nvidia-gpu-accelerated-sir...

ledbit yesterday at 4:35 AM

Some of the quoted price-performance improvement comes from comparing prices across different cloud providers: e.g., a GH200 on Lambda Labs costs $1.5/hr, but the closest equivalent on AWS (p5.4xlarge) costs $6.88/hr. That means ~4.5x of the price-performance benefit is not real ...
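The arithmetic behind that estimate can be sketched as follows. This is a rough illustration using only the two hourly prices quoted in the comment; the function names are made up for the example, the "reported gain" input is a hypothetical placeholder rather than a figure from the benchmark, and it assumes the baseline system was priced on AWS.

```python
# Hourly prices as quoted in the comment (USD per hour).
LAMBDA_GH200_PER_HR = 1.50
AWS_P5_4XLARGE_PER_HR = 6.88


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive instance costs per hour."""
    return expensive / cheap


def provider_adjusted_gain(reported_gain: float, ratio: float) -> float:
    """Divide out the share of a price-performance gain that comes
    only from the hourly price gap between the two providers."""
    return reported_gain / ratio


ratio = price_ratio(AWS_P5_4XLARGE_PER_HR, LAMBDA_GH200_PER_HR)
print(f"price ratio between providers: {ratio:.2f}x")

# HYPOTHETICAL input for illustration only; not a number from the thread.
example_reported_gain = 10.0
adjusted = provider_adjusted_gain(example_reported_gain, ratio)
print(f"gain after removing the provider price gap: {adjusted:.2f}x")
```

With the quoted prices the ratio works out to roughly 4.6x, in line with the "~4.5x" in the comment.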

esafak last Saturday at 6:38 PM

Reminds me of Uber's AresDB: https://www.uber.com/blog/aresdb/

sys13 last Saturday at 7:33 PM

I wonder if the benefit is primarily for transactional vs analytical queries

jauntywundrkind last Saturday at 8:05 PM

From their paper, "Rethinking Analytical Processing in the GPU Era":

> Sirius builds on GPU libraries such as libcudf [6], RMM [14], and NCCL [11], reusing optimized implementations of core relational operators like joins, filters, aggregations, and data shuffle. Thanks to its modular design, Sirius also allows developers to easily switch the operator implementation between these GPU libraries and custom CUDA kernels.

https://arxiv.org/abs/2508.04701

I wonder if the various other CUDA translation layers (ZLUDA, SCALE, HIP) can host this?

It'd be so nice to see a little more foothold for Vulkan in this space. There's some good work in AI for Vulkan; it's becoming quite capable. But for databases & GPGPU, it doesn't seem like there are good rallying points.

I expect whatever eventually emerges will likely be based on Substrait too! What an awesome common ground that's emerged for data processing work.

stogot last Saturday at 5:59 PM

Sounds amazing; what are the downsides that a company needs to consider? Memory bottlenecks or storage bus access?
