So is it possible to load the Ollama deepseek-r1 70b (43 GB) model on my 24 GB VRAM + 32 GB RAM machine? Does this depend on how I load the model, i.e., using Ollama rather than some other backend? AFAIK, Ollama is basically a llama.cpp wrapper.
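For example, if I understand the llama.cpp-style partial offloading correctly, would something like this be the way to do it? This is just my guess at how it works, using the Ollama HTTP API on the default localhost:11434 endpoint; the `num_gpu` value of 30 is an arbitrary number I picked, not something I've verified fits in 24 GB.

```python
# My assumption of how partial offload might work via the Ollama HTTP API:
# cap how many layers go to the 24 GB GPU and let the rest stay in system RAM.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",  # default Ollama endpoint
    json={
        "model": "deepseek-r1:70b",
        "prompt": "Hello",
        "stream": False,
        "options": {
            "num_gpu": 30,    # layers to offload to VRAM (a guess; lower it if it OOMs?)
            "num_ctx": 2048,  # smaller context to keep the KV cache small
        },
    },
    timeout=600,
)
print(resp.json()["response"])
```

Is that roughly how people run models bigger than their VRAM, or does Ollama handle the split automatically?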
I have tried to deploy one myself with OpenWebUI + Ollama, but only with small LLMs. I'm not sure about the bigger ones, and I'm worried it might crash my machine somehow. Are there any docs on this? I'm curious how it works, if it's possible at all.