At least for Louie.ai (basically genAI-native computational notebooks, where operational analysts ask for intensive analytics tasks like pulling Splunk/Databricks/Neo4j data, wrangling it in some runtime, clustering/graphing it, and generating interactive viz), Python has ups and downs:
On the plus side, it means our backend handles small/mid datasets well. Apache Arrow adoption in analytics packages is strong, so zero-copy and columnar flows over many rows are normal. Pushing that to the GPU or another process is also great.
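For a sense of what that looks like in practice, here's a minimal sketch (my own toy example, not Louie.ai code) of the standard pyarrow IPC + memory-map pattern that lets another process read the same columnar data without copying:

    import pyarrow as pa
    import pyarrow.ipc

    # Write a table once in Arrow IPC format (columnar on disk).
    table = pa.table({"user": ["a", "b", "c"], "events": [10, 20, 30]})
    with pa.OSFile("batch.arrow", "wb") as sink:
        with pa.ipc.new_file(sink, table.schema) as writer:
            writer.write_table(table)

    # Another process can memory-map the same file and get the
    # columns as views over the mapped buffers, not heap copies.
    with pa.memory_map("batch.arrow", "r") as source:
        shared = pa.ipc.open_file(source).read_all()
        print(shared.num_rows)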
OTOH, one of our biggest issues is the GIL. Yes, it shows up a bit in single-user code (and isn't discussed in the post), especially in divide-and-conquer flows for a user. The bigger issue, however, is packing many concurrent users onto the same box to avoid blowing your budget. We'd like the memory-sharing benefits of threads, but because of the GIL we end up needing the isolation benefits of multiprocess. A bit same-but-different: we stream results to the browser as agents progress through your investigation, and that has not been as smooth as it was for us in other languages.
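For anyone who hasn't hit it directly, the GIL part is easy to reproduce (toy CPU-bound workload, nothing Louie.ai-specific):

    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def cpu_work(n: int) -> int:
        # Pure-Python CPU-bound loop; holds the GIL the whole time.
        return sum(i * i for i in range(n))

    def timed(pool_cls, jobs=4, n=3_000_000):
        start = time.perf_counter()
        with pool_cls(max_workers=jobs) as pool:
            list(pool.map(cpu_work, [n] * jobs))
        return time.perf_counter() - start

    if __name__ == "__main__":
        # Threads serialize on the GIL; processes run in parallel,
        # but each process pays its own memory footprint.
        print("threads:  ", timed(ThreadPoolExecutor))
        print("processes:", timed(ProcessPoolExecutor))

That memory-footprint line is exactly the multi-tenant squeeze: threads share memory but not the CPU; processes get the CPU but not the memory.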
And moving to multiprocess is no panacea. E.g., a local embedding engine is expensive to run in-process per worker because modern models have high RAM needs. So that biases you toward a local inference server for what was meant to be an otherwise local call, which is doable, but representative of the extra work needed for production-grade software.
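The shape of that workaround, roughly: load the model once behind a local endpoint and have every worker call it over HTTP (the URL and request format below are hypothetical stand-ins for whatever inference server you run):

    import requests

    # Hypothetical local inference server that loaded the model once.
    EMBED_URL = "http://127.0.0.1:8080/embed"

    def embed(texts: list[str]) -> list[list[float]]:
        # Each worker process makes a cheap HTTP call instead of
        # holding its own multi-GB copy of the model in RAM.
        resp = requests.post(EMBED_URL, json={"inputs": texts}, timeout=30)
        resp.raise_for_status()
        return resp.json()

    vectors = embed(["what changed in the last deploy?"])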
Interesting times!
Am I the only one who thinks a Swift IDE project should be called Taylor?
Langchain and other frameworks are too bloated. They're good for demos, but I highly recommend building your own pipeline in production: it's not really that complicated, and you get much better control over the implementation. Plus you don't need the 99% of packages that come with Langchain, which reduces security vulnerabilities.
I've written a series of notebooks on how to implement RAG in Python directly, with minimal packages. I know it's not in Rust or C++, but it can give you some ideas on how to do things directly.
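To give a flavor of the direct approach, here's a minimal retrieval + prompt sketch (mine, not from the notebooks; the hashed bag-of-words embed() is a toy stand-in for a real embedding model):

    import numpy as np

    def embed(texts: list[str]) -> np.ndarray:
        # Toy stand-in: hashed bag-of-words, L2-normalized. In practice
        # you'd call an embedding model here; only the interface matters.
        out = np.zeros((len(texts), 256))
        for i, text in enumerate(texts):
            for tok in text.lower().split():
                out[i, hash(tok) % 256] += 1.0
        norms = np.linalg.norm(out, axis=1, keepdims=True)
        return out / np.maximum(norms, 1e-9)

    docs = ["Arrow enables zero-copy columnar data.",
            "The GIL serializes CPU-bound Python threads."]
    doc_vecs = embed(docs)

    def retrieve(query: str, k: int = 1) -> list[str]:
        # Cosine similarity is a dot product on normalized vectors.
        scores = doc_vecs @ embed([query])[0]
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    def build_prompt(query: str) -> str:
        context = "\n".join(retrieve(query))
        return f"Answer using this context:\n{context}\n\nQuestion: {query}"

    # build_prompt("what does the GIL do?") -> send to any LLM client

That's essentially the core of it; most of what a framework adds on top of this loop is integrations.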
It would be helpful to move to a compiled language with a decent toolchain. Rust and Go are good candidates.
I was asking the same question; it turns out mistral.rs [0] has pretty good abstractions, so you don't have to depend on and package llama.cpp for every platform.
This is a comparison of apples to oranges. Langchain has an order of magnitude more examples, integrations, and features, and it also rewrote its whole architecture to try to make the chaining more understandable. I don't see enough documentation in this pipeline to understand how to migrate my app to it. I also realize it would take me at least a week to even migrate my own app to Langchain's rewrite.
Langchain is used because it was a first mover, and that's also its Achilles' heel; it's not about speed at all.
this is very cool!
We built something for our internal consumption (and it's now used in quite a few places in India).
Edgechains is declarative (jsonnet-based), so chains + prompts are declarative. And we built a wasm compiler (in Rust, based on WasmEdge).
https://github.com/arakoodev/EdgeChains/actions/runs/1039197...
I've covered this before in articles such as this: https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...
You can make anything performant if you know the right buttons to push. While Rust makes it easy in some ways, Rust is also a difficult language for many developers to work with. There is a tradeoff.
I'd also say LangChain's primary goal isn't performance; it's convenience and functionality coverage.
Why not use C++?
For the most part, these aren't security-critical components.
You already have a massive amount of code you can use, like, say, llama.cpp.
You get the same performance that you do with Rust.
Compared to Python, in addition to performance, you also get a much easier deployment story.
DSPy is in Python, so it must be Python. Sorry bro :P
I mean, LLM-based or not has nothing to do with it. This is the standard optimization story: scripting language vs. systems language.
I'm surprised they don't talk about the business side of this - did they have users complaining about the speed? At the end of the day, they only increased performance by 50%.
This kind of optimization seems awesome once you have a somewhat mature product, but you really have to wonder if it's the best use of a startup's very limited bandwidth.
Most Python libraries are bindings to native libraries anyway.
Any other ecosystem can plug into the same underlying native libraries, or even call them directly when it's the same language.
In a way, it's interesting to see the performance pressure on the Python world; otherwise the CPython folks would never have reconsidered their stance on performance.