https://github.com/ggozad/haiku.rag/ - the embedded lancedb is convenient and has benchmarks; uses docling. qwen3-embedding:4b, 2560 w/ gpt-oss:20b.