I'm pretty sure all of these LLM providers are in the black on inference costs.
If I were to set up a DGX200 in my garage, say the 5-year TCO is a million dollars. That works out to about $16.7k/mo; split it among 500 people and we can get it done for roughly $33/mo per user in total operating cost. I would bet that these LLM services are far more oversubscribed than 500 subs per server.
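
For anyone who wants to play with the numbers, here's a quick back-of-envelope sketch of that math. The $1M TCO, 5-year window, and 500 users are the figures from my guess above; the larger user counts are purely hypothetical to show how the per-user cost moves with oversubscription, and the function name is just something I made up for illustration.

```python
def monthly_cost_per_user(tco_usd: float, years: float, users: int) -> float:
    """Spread a total cost of ownership evenly across months and users."""
    months = years * 12
    return tco_usd / months / users


# Figures from the estimate above: $1M TCO over 5 years, 500 users per server.
base = monthly_cost_per_user(1_000_000, 5, 500)
print(f"500 users: ${base:.2f}/mo")  # 500 users: $33.33/mo

# Hypothetical oversubscription levels (assumed, not measured).
for users in (1_000, 2_000, 5_000):
    cost = monthly_cost_per_user(1_000_000, 5, users)
    print(f"{users} users: ${cost:.2f}/mo")
```

This assumes the cost splits evenly and ignores how much each user actually hits the box, so treat it as a rough ceiling on per-user cost rather than a real capacity model.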