Google makes claims here about high demand for Gemini. Does anyone have insight into how much of the load on Google is paid use vs the load from putting AI summaries into every web search?
What I'm curious about is not the free AI summaries (which they can opaquely tune as necessary), but the renting of TPUs to Anthropic and OpenAI. Many of these contracts were announced at the last minute and seemed to involve a very desperate Anthropic. Based on the Anthropic/xAI data center contract, they’re willing to pay a crazy markup to get immediate access to compute.
I want to know how impacted Gemini has been by that, because that will reveal a lot about their margins and revenue-generating first-party demand. In each MSFT earnings report, they discuss the balance they’re dealing with between supplying GPUs to Azure customers and first-party demand.
My pet theory is that Gemini is “losing” the LLM race because they’re preferentially selling the TPUs to competitors, while keeping just enough for themselves to stay competitive and build their own products.
We use Gemini for some specific tasks. It is often unavailable due to capacity limits or other downtime.
It's probably the best multimodal model I've worked with (if somebody knows a better one for audio analysis, please let me know!)
I don't know the numbers, but in my experience their APIs have bad uptime for some models. Requests fail too often because of "traffic too high".
It's worse than OpenAI or Anthropic. However, their lower-tier consumer offerings can sometimes be had for <$10/mo on offer and come bundled with other Google services like cloud storage.
Don't know, but Gemini 3.1 flash lite is available for free under relatively generous limits, and it had lots of random interruptions when I was testing it (intermittently responding with errors due to high load).