You can do that with Morphik already :)
We use an embedding model that processes videos and allows you to perform RAG on them.
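
For anyone curious what "RAG on videos" means mechanically, here is a toy sketch of the retrieval step. It is not Morphik's API or implementation: the `Segment` class, the `retrieve` function, and the hand-written 3-dimensional vectors are all made up for illustration. It assumes some embedding model has already turned each video segment and the query into vectors in the same space.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical record for one embedded slice of a video."""
    video: str
    start_s: float
    end_s: float
    embedding: list


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_embedding, segments, k=2):
    """Return the k segments whose embeddings are closest to the query."""
    ranked = sorted(
        segments,
        key=lambda s: cosine(query_embedding, s.embedding),
        reverse=True,
    )
    return ranked[:k]


# Toy index: in a real system these vectors would come from the
# video embedding model, not be written by hand.
INDEX = [
    Segment("demo.mp4", 0.0, 10.0, [1.0, 0.0, 0.0]),
    Segment("demo.mp4", 10.0, 20.0, [0.0, 1.0, 0.0]),
    Segment("talk.mp4", 0.0, 15.0, [0.7, 0.7, 0.0]),
]

# Stand-in for an embedded text query.
top = retrieve([0.9, 0.1, 0.0], INDEX, k=2)
for seg in top:
    print(f"{seg.video} [{seg.start_s:.0f}s-{seg.end_s:.0f}s]")
```

The retrieved segments (with their timestamps) would then be handed to a generation model as context, which is the "augmented generation" half of RAG.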