I have journaled digitally for the last 5 years with this expectation.
Recently I built a graphRAG app that uses Qwen 3.5 4b for the small tasks, such as classifying what type of question I am asking and the entity extraction itself, since graphRAG depends on extracted triplets (entity1, relationship_to, entity2). I used Qwen 3.5 27b for actually answering my questions.
It works pretty well. I have to be a bit patient but that’s it. So in that particular use case, I would agree.
I used MLX on my 64GB M1 device. I found that MLX definitely works faster when it comes to extracting entities and triplets in batches.
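For anyone unfamiliar with the triplet idea, here is a rough sketch of how extracted (entity1, relationship_to, entity2) triplets can be indexed and walked to collect context for a question. This is not my actual app: the sample triplets, `build_graph`, and `neighbors` are all made up for illustration, and the model calls are left out entirely.

```python
from collections import defaultdict

# Hypothetical triplets, standing in for what a small model might
# extract from journal entries (not real data).
triplets = [
    ("I", "started", "running"),
    ("running", "improves", "sleep"),
    ("I", "moved_to", "Berlin"),
]

def build_graph(triplets):
    """Index triplets by subject so outgoing edges are easy to look up."""
    graph = defaultdict(list)
    for subject, relation, obj in triplets:
        graph[subject].append((relation, obj))
    return graph

def neighbors(graph, entity, depth=1):
    """Collect the facts reachable from `entity` within `depth` hops."""
    facts, frontier, seen = [], [entity], {entity}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, obj in graph.get(node, []):
                facts.append((node, relation, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return facts

graph = build_graph(triplets)
# Facts within two hops of "I"; these would be passed to the larger
# model as context for answering a question.
context = neighbors(graph, "I", depth=2)
```

The point is only that the answering model sees a handful of connected facts rather than whole journal entries, which is why the quality of the extraction step matters so much.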
Did you get any insights about yourself from this process? I am thinking of doing the same.