Hacker News

canyon289 · last Thursday at 6:41 PM · 14 replies

Hi all, I'm a research lead on this model. Same as every model release post, I enjoy working at Google for a multitude of reasons, and opinions here are my own.

Happy to answer whatever technical questions I can!


Replies

A4ET8a8uTh0_v2 · yesterday at 4:06 AM

> You are ready to fine-tune: You need the consistent, deterministic behavior that comes from fine-tuning on specific data, rather than the variability of zero-shot prompting.

> You prioritize local-first deployment: Your application requires near-instant latency and total data privacy, running efficiently within the compute and battery limits of edge devices.

Thank you. I felt that was a very underappreciated direction (most of the spotlight seemed to be on the 'biggest' models).

mrinterweb · last Thursday at 9:22 PM

I have often wondered how much a specialized local LLM could benefit an agentic tool like Gemini CLI. I would think there could be a good win for speed and minimizing token use if coding agents used a local model. A local model could handle a lot of the low-level system-interaction tasks and then send the prompts that require deeper reasoning to frontier models. It seems wasteful and slow to use frontier models to figure out how to grep a codebase, run tests, git diff, etc.

Might Gemini CLI offload some of its prompts to FunctionGemma?
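The split described above could be sketched roughly as follows. This is a hypothetical illustration, not how Gemini CLI actually works; all names (`LOCAL_TASKS`, `route`, `dispatch`) are made up for the example.

```python
# Hypothetical sketch of a local/frontier routing layer: mechanical,
# low-level tool calls stay on a small local model; anything else
# escalates to a frontier model.

# Tool calls a small local model could handle cheaply and deterministically.
LOCAL_TASKS = {"grep", "list_files", "git_diff", "run_tests", "read_file"}

def route(tool_name: str) -> str:
    """Decide which backend should handle a given tool invocation."""
    return "local" if tool_name in LOCAL_TASKS else "frontier"

def dispatch(tool_name: str, args: dict) -> str:
    """Stub dispatcher; a real one would invoke the chosen model."""
    backend = route(tool_name)
    if backend == "local":
        return f"[local model] {tool_name}({args})"   # fast, private, cheap
    return f"[frontier model] {tool_name}({args})"    # slower, deeper reasoning
```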

xnx · last Thursday at 7:16 PM

Cool game! Amazing it can run in the browser. My mind was blown when I saw you could give goal based commands vs prescriptive ones. https://huggingface.co/spaces/webml-community/FunctionGemma-...

exacube · last Thursday at 8:54 PM

Some fine-tuning data questions:

I see the dataset Google published in this notebook https://github.com/google-gemini/gemma-cookbook/blob/main/Fu... -- from looking at the dataset on Hugging Face, it looks synthetically generated.

1. Do you recommend any particular mix or focus in the dataset for fine-tuning this model, without losing too much generality?

2. Do you have any recommendations for how many examples per tool?

Thank you for your (and your team's) work!
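For readers unfamiliar with what a function-calling fine-tuning example looks like: the authoritative schema is in the linked cookbook notebook, but the general shape is a conversation paired with tool declarations. The field names below follow a common chat-style convention and are illustrative only, not the official FunctionGemma format.

```python
# Illustrative shape of one function-calling fine-tuning example:
# a user turn, the assistant's structured tool call, and the tool
# declarations the model could choose from.
example = {
    "messages": [
        {"role": "user", "content": "What's the weather in Lisbon?"},
        {
            "role": "assistant",
            "tool_call": {
                "name": "get_weather",
                "arguments": {"city": "Lisbon", "unit": "celsius"},
            },
        },
    ],
    "tools": [
        {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        }
    ],
}
```

A per-tool example count then amounts to how many such records mention each declared tool in the training mix.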

NitpickLawyer · last Thursday at 7:32 PM

Wen gemma4? :)

But on a serious note, I'm happy to see more research going into vSLMs (very small...). My "dream" scenario is to have the "agentic" stuff run locally and call into the "big guns" as needed. Being able to fine-tune these small models on consumer cards is awesome, and it can open up a lot of niche stuff for local / private use.

vessenes · last Thursday at 8:41 PM

Hey! Love the Gemma series. Question that came to mind reading the announcement post - the proposal there is that you can use this as a local backbone and have it treat a larger model as a 'tool call' when more reasoning is needed.

In my mind we want a very smart frontier model orchestrating, but not slowing everything down by doing every little thing; this seems like the opposite: a very fast layer that can be like "wait a minute, I'm too dumb for this, I need some help".

My question is - does the Gemma team use any evaluation around this particular 'call a (wiser) friend' strategy? How are you thinking about this? Is this architecture flow more an accommodation to the product goal - fast local inference - or do you guys think it could be optimal?
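The 'call a (wiser) friend' pattern asked about here can be modeled by exposing the larger model to the small one as just another tool. A minimal sketch, assuming hypothetical names (`ask_expert`, `local_agent_step`) and with a boolean standing in for the model's own escalation decision:

```python
# Sketch of the escalation pattern: the frontier model appears to the
# local model as an ordinary tool it may call. Everything here is a
# placeholder, not a real API.

def ask_expert(question: str) -> str:
    """Tool the local model calls when it decides it needs more reasoning."""
    return f"[frontier model answer to: {question}]"  # stub for a remote call

def local_agent_step(user_msg: str, confident: bool) -> str:
    """One agent step; in practice the model itself would emit the tool call."""
    if confident:
        return f"[local answer to: {user_msg}]"
    return ask_expert(user_msg)
```

The evaluation question then becomes how reliably the small model chooses between answering directly and invoking the escalation tool.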

mudkipdev · yesterday at 12:08 AM

If I'm building a simple, mainly question-answering AI that uses only a couple of tools (e.g. web search), am I better off starting with Gemma or FunctionGemma?

xnx · last Thursday at 7:41 PM

Not FunctionGemma related, but would love to see an open weights model from Google for speech to text transcription (diarization, timestamps, etc.).

Whisper is old and resource-intensive for the accuracy it provides.

mentalgear · last Thursday at 9:45 PM

Much appreciate the focus on local-first (on-device)! I'm wondering how your approach differs from (or integrates with) something like "Differentiable Programming for LLM Tool Selection" https://viksit.substack.com/p/optimizing-tool-selection-for-...

zikani_03 · last Thursday at 8:29 PM

Thanks for all the great work. How good is the model at composing actions, and is there a way to, say, give the model the ability to scope actions, for example when actions are tied to permissions or some other context? Would one need to pass the role or permission as context, or fine-tune separately?

I hope those questions make sense.
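One common way to scope actions without a separate fine-tune per role is to filter which tool declarations the model even sees on each request. A minimal sketch, with made-up role and tool names:

```python
# Hypothetical permission-scoped tool selection: the model is only shown
# tools the caller's role is allowed to invoke, so it cannot call the rest.
ALL_TOOLS = {
    "read_record":   {"roles": {"viewer", "editor", "admin"}},
    "update_record": {"roles": {"editor", "admin"}},
    "delete_record": {"roles": {"admin"}},
}

def tools_for_role(role: str) -> list[str]:
    """Return only the tool names this role is permitted to invoke."""
    return sorted(t for t, meta in ALL_TOOLS.items() if role in meta["roles"])
```

Passing the role as plain context instead would rely on the model to police itself; filtering the tool list enforces the boundary outside the model.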

lukeinator42 · last Thursday at 7:52 PM

Very cool! I was wondering: is a separate model performing speech recognition for the voice demos, such as the game? The FunctionGemma model card only seems to show text input/output.

cbabraham · yesterday at 12:41 AM

Hi! Does this bring us closer to a Gemini CLI-like experience using a local model that can run on a MacBook Pro? It felt like Gemma 3n was already 'smart' enough; it just wasn't tuned for tool use.

carlcortright · last Thursday at 7:25 PM

Very cool model! Congrats on the work!

ekianjo · yesterday at 3:21 AM

Does this require WebGPU to run in the browser?