Yes: cargo run -p mnemo-bench. It ships with 12 benchmarks. The full retrieval pipeline takes ~4ms on a debug build. Numbers are in the README performance table.
I don't care if it's fast, if it makes the model dumber by cluttering up context.