Hacker News

anonymousDan · yesterday at 4:40 AM · 1 reply

What would be a typical/recommended server setup for using this for RAG? Would you typically have a separate server for the GPUs and the DB itself?


Replies

xavcochran · yesterday at 8:24 AM

Assuming you are using GPUs for model inference, the best setup would be to run the DB on one server and use a separate server for inference requests. Note that we plan on supporting custom model endpoints on the database side, so you probably won't need the inference server in the future!
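
A minimal sketch of what that two-server split looks like from the application's side: the client embeds the query via the GPU inference server, then runs a vector search against the DB server. The URLs, endpoint paths, and JSON payload shapes here are assumptions for illustration, not a real API.

```python
import json
from urllib import request

# Hypothetical endpoints: one GPU box for inference, one box for the DB.
EMBED_URL = "http://gpu-server:8000/embed"
SEARCH_URL = "http://db-server:8001/vector-search"


def embed_payload(text):
    """Build the (assumed) JSON body for the inference server."""
    return {"input": text}


def search_payload(vector, k):
    """Build the (assumed) JSON body for the DB's vector search."""
    return {"vector": vector, "k": k}


def _post_json(url, payload):
    """POST a JSON payload and decode the JSON response."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def retrieve(query, k=5):
    """RAG retrieval step: embed on the GPU server, search on the DB server."""
    vector = _post_json(EMBED_URL, embed_payload(query))["embedding"]
    return _post_json(SEARCH_URL, search_payload(vector, k))["documents"]
```

If the DB later gains custom model endpoints as described above, the `embed` round-trip collapses into the search call and the inference server drops out of the picture.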