I was intrigued by the idea that in the Manta object store you could schedule computations on the storage nodes. However, I am not sure how much improvement that brings in practice. Does anyone have practical experience with this?
I did use it on a project, and it was meh, alright? In the end the main cost of our processing wasn’t storage latency but code, and the quite arcane scheduler was too much of a barrier for most of our team.
I believe it was removed shortly after I left the project.
bcantrill gave a great talk many years ago about compute-data locality. It would be nice to know if those ideas panned out for some customers, but it seems the world has by and large continued to schlep data back and forth.
It's too bad, too. The concepts behind Manta were such a great idea. I still want tools that combine traditional Unix pipes with services that can map-reduce over a big farm of hyperconverged compute/storage. I'm somewhat surprised that the Kubernetes/CNCF-adjacent world didn't reinvent it.