> I’ll often kick off a process at the end of my day, or over lunch. I don’t need it to run immediately. I’d be fine if it just ran on their next otherwise-idle GPU at a much lower cost than the standard offering.
If it's not time-sensitive, why not just run it on CPU/RAM rather than GPU?
Does that even work out to be cheaper, once you factor in how much extra power you'd need?
Yeah, just run an LLM with over 100 billion parameters on a CPU.