alasano • today at 12:54 PM

My favorite Google LLM benchmark is asking Gemini models to create a script that fetches API usage (just request counts) for a project from GCP.

100% failure rate.