Interesting project, very dense post. I like the idea of a genuine personal search engine. You’d think that Windows and macOS would do this well, but they really don’t.
Project GitHub is here https://github.com/eagledot/hachi
I've been hoping to see something like this, as finding or rediscovering images that I've archived has been a painful process for some years now.
Still, I've come to the conclusion that search alone - especially LLM-based search - isn't enough for these applications, because of its volatility. Human spatial localization relies on object permanence, so there needs to be some amount of durability baked into at least some of the functions of any application that involves us storing and retrieving desired objects and data.
I don't know precisely what that looks like, but I do know, for example, that whenever YouTube refreshes a recommended-video list I lose things I meant to come back to. I miss the days when those lists stayed largely fixed for days or weeks.
>My try has been to expose multiple (if not all) attributes for a resource directly to user and then letting user recursively refine query to get to desired result.
I do really like this part, though. I'd rather photos get tagged with as many (possibly erroneous) attributes as possible, and let me carve out what I'm really looking for, rather than missing the one I wanted because the system mistook a seesaw for a teeter-totter or something.
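That refine-by-attributes flow can be sketched as a simple tag-intersection filter. This is a hypothetical illustration of the idea, not Hachi's actual implementation; the photo names and tags are made up:

```python
# Hypothetical sketch of recursive query refinement over noisy tags.
# Each photo keeps every attribute the tagger produced, even erroneous or
# redundant ones; the user narrows the result set one attribute at a time,
# so a photo tagged both "seesaw" and "teeter-totter" is found either way.

photos = {
    "park1.jpg": {"seesaw", "teeter-totter", "playground", "outdoor"},
    "park2.jpg": {"swing", "playground", "outdoor"},
    "beach.jpg": {"sand", "outdoor"},
}

def refine(candidates, attribute):
    """Keep only the candidate photos whose tag set contains the attribute."""
    return {name for name in candidates if attribute in photos[name]}

# Start from everything, then refine step by step.
result = set(photos)
result = refine(result, "outdoor")      # 3 photos remain
result = refine(result, "playground")   # 2 photos remain
result = refine(result, "seesaw")       # 1 photo remains
```

Because every (possibly wrong) tag is kept, recall stays high and the refinement loop does the precision work.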
You can hack together an image search with a 500k VLM and a tiny embedding model that works surprisingly well. I built a tool like this two years ago that I can throw a hard drive at; any and all image files get processed and made searchable locally, including video frames.
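The retrieval half of such a tool really can be this small: once a model has produced one embedding per image, search is just normalized dot products. A rough sketch under assumed inputs (random vectors stand in for real VLM/embedding-model outputs), not the commenter's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for precomputed embeddings: one row per indexed image, plus a
# query embedding. In a real tool both come from the same embedding model
# (image encoder for files/video frames, text encoder for the query).
image_embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
query = rng.standard_normal(64).astype(np.float32)

def top_k(embeddings, query, k=5):
    """Return indices of the k rows most cosine-similar to the query."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = emb @ q                      # cosine similarity per image
    return np.argsort(scores)[::-1][:k]   # best-scoring indices first

hits = top_k(image_embeddings, query)
```

At personal-archive scale (hundreds of thousands of images), this brute-force scan is usually fast enough that no approximate index is needed.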
Hi, Author here!
I have been working on this project for quite some time now. For such search engines the basic ideas remain the same, i.e. extracting metadata or semantic info and providing an interface to query it, but a lot of effort has gone into making those modules performant while keeping dependencies minimal. The current version is down to only three dependencies, i.e. numpy, markupsafe, and ftfy, plus a Python installation with no hard dependence on any particular version. A lot of the code is written from scratch, including a meta-indexing engine and a minimal vector database. Being able to index any personal data from multiple devices or services without duplicating it has been the main theme of the project so far!
We (my friend and I) have already tested it on around 180 GB of the Pexels dataset and up to 500k images from the Flickr 10M dataset. The machine learning models are powered by a framework written completely in Nim (currently not open-source) that has oneDNN as its only dependency (which will have to be dropped to make it run on ARM machines!)
I have mainly been looking for feedback to improve upon some rough edges, but it has been worthwhile to work on this project, which includes code all the way from assembly to HTML!