FTA: In our "Mobile Actions" evaluation, fine-tuning transformed the model’s reliability, boosting accuracy from a 58% baseline to 85%. This confirms that for edge agents, a dedicated, trained specialist is an efficient path to production-grade performance.
I would be wary of having an LLM with 85% accuracy call tools on my system. Isn’t that fairly far from production-grade performance?
I also don’t see how the fact that accuracy can be boosted from 58% to 85% is any indication that it can be boosted further.
Unbelievable shipping velocity from Google in December, and it sounds like they're not done for the week: https://x.com/osanseviero/status/2001723652635541566
Do you think this would be appropriate for a command-line tool that hits various APIs as the function calls? E.g. "what's the weather in SF tomorrow?" or "daily price change of Apple and Tesla stock for the past week"? (Let's assume I have documented the APIs thoroughly somewhere the model has access to, or have fine-tuned it on this data.)
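Concretely, I'm imagining something like the rough sketch below, assuming the model is served locally through Ollama's /api/chat endpoint (which accepts a tools list); get_weather and get_stock_change are placeholder stubs standing in for the real, documented APIs:

    import requests

    OLLAMA_URL = "http://localhost:11434/api/chat"
    MODEL = "functiongemma"  # tag from the Ollama library link in this thread

    # Tool schemas the model sees; names and parameters are illustrative.
    TOOLS = [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Weather forecast for a city on a given day.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "day": {"type": "string", "description": "e.g. 'tomorrow'"},
                    },
                    "required": ["city", "day"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "get_stock_change",
                "description": "Daily price change of a ticker over the past N days.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "ticker": {"type": "string"},
                        "days": {"type": "integer"},
                    },
                    "required": ["ticker", "days"],
                },
            },
        },
    ]

    # Placeholder implementations: swap in the real API calls.
    def get_weather(city: str, day: str) -> str:
        return f"(stub) forecast for {city} {day}"

    def get_stock_change(ticker: str, days: int) -> str:
        return f"(stub) {ticker} change over {days} days"

    DISPATCH = {"get_weather": get_weather, "get_stock_change": get_stock_change}

    def ask(prompt: str) -> None:
        resp = requests.post(OLLAMA_URL, json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "tools": TOOLS,
            "stream": False,
        })
        resp.raise_for_status()
        message = resp.json()["message"]
        # The model either answers in plain text or emits structured tool calls.
        for call in message.get("tool_calls", []):
            fn = call["function"]
            print(DISPATCH[fn["name"]](**fn["arguments"]))
        if message.get("content"):
            print(message["content"])

    ask("What's the weather in SF tomorrow?")

For read-only lookups like these, an occasional wrong or failed call seems tolerable in a way it wouldn't be for destructive actions.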
My brain didn’t register that the parameters are measured in megabytes, not gigabytes, and my reaction went from “meh” to “holy bananas!”
Great work from the Google ML teams; I’ll be trying this model out.
I’ve been wanting to fine-tune models for Home Assistant, but I’m unsure how to get synthetic data. Any recommendations?
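The furthest I've gotten on my own is template expansion over the entity/service list, with the idea of having a larger model paraphrase the outputs for variety afterwards. A rough sketch; the call_service tool name and the JSONL chat schema here are my own guesses, not any framework's required format:

    import itertools
    import json
    import random

    # Hypothetical device/action inventory; swap in your real Home Assistant entities.
    DEVICES = ["living room lights", "bedroom lights", "thermostat fan", "garage door"]
    ACTIONS = {"turn on": "turn_on", "turn off": "turn_off", "toggle": "toggle"}
    PHRASINGS = [
        "{action} the {device}",
        "please {action} the {device}",
        "can you {action} the {device}?",
        "hey, {action} the {device} for me",
    ]

    def make_example(action_text: str, service: str, device: str) -> dict:
        user = random.choice(PHRASINGS).format(action=action_text, device=device)
        call = {"name": "call_service", "arguments": {"service": service, "entity": device}}
        # One chat turn mapping a natural-language command to a tool call.
        return {
            "messages": [
                {"role": "user", "content": user},
                {"role": "assistant", "tool_calls": [{"function": call}]},
            ]
        }

    with open("ha_synthetic.jsonl", "w") as f:
        for (action_text, service), device in itertools.product(ACTIONS.items(), DEVICES):
            for _ in range(5):  # a few phrasing variants per pair
                f.write(json.dumps(make_example(action_text, service, device)) + "\n")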
Ollama link too: https://ollama.com/library/functiongemma
Hot take: a dodgy small/fast/cheap LLM in a while-True loop approximately equals AGI for most real-world tasks.
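i.e. something like the sketch below, where llm() and run_tool() are stand-ins for a flaky cheap model and a tool executor; the point is the retry-until-done control flow, with a step cap so it can't spin forever:

    import random

    # Stand-ins: a flaky cheap model and a tool runner.
    def llm(history: list) -> dict:
        if random.random() < 0.3:  # "dodgy": only sometimes decides it's done
            return {"done": True, "answer": "task complete"}
        return {"done": False, "tool": "search", "args": {"q": history[0]["content"]}}

    def run_tool(name: str, args: dict) -> str:
        return f"(stub) ran {name} with {args}"

    def agent(goal: str, max_steps: int = 20) -> str:
        history = [{"role": "user", "content": goal}]
        for _ in range(max_steps):  # the "while True", with a safety cap
            step = llm(history)
            if step["done"]:
                return step["answer"]
            # Feed the tool result back and let the model try again.
            history.append({"role": "tool", "content": run_tool(step["tool"], step["args"])})
        return "gave up"

    print(agent("book me a dentist appointment"))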
Hi all, I'm a research lead on this model. As with every model release post: I enjoy working at Google for a multitude of reasons, and opinions here are my own.
Happy to answer whatever technical questions I can!