Hacker News

I have written gemma3 inference in pure C

46 points by robitec97 last Monday at 2:05 PM | 17 comments

Comments

austinvhuang today at 8:00 PM

My first implementation of gemma.cpp was kind of like this.

There's such a massive performance differential vs. SIMD though that I learned to appreciate SIMD (via highway) as one sweet spot of low-dependency portability that sits between C loops and the messy world of GPUs + their fat tree of dependencies.

If anyone wants to learn the basics, whip out your favorite LLM pair programmer and ask it to help you study the kernels in the ops/ library of gemma.cpp:

https://github.com/google/gemma.cpp/tree/main/ops
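
To make the gap between plain C loops and SIMD concrete, here is a minimal sketch of a dot product, the core of most inference kernels. It is not code from gemma.cpp or highway; the function names `dot_scalar` and `dot_unrolled4` are made up for illustration. The second version only mimics the lane layout of a 4-wide SIMD register in portable C, on the assumption that independent accumulators give the compiler more room to vectorize than a single serial sum.

```c
#include <stddef.h>

/* Plain scalar loop: a single accumulator, so every add depends on the
 * previous one. Under strict floating-point rules a compiler may not
 * reorder this reduction, which limits auto-vectorization. */
float dot_scalar(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Four independent accumulators, laid out like the lanes of a 4-wide
 * SIMD register. The result can differ from dot_scalar by float rounding
 * because the summation order changes. */
float dot_unrolled4(const float *a, const float *b, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i + 0] * b[i + 0];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    /* Horizontal reduction of the four "lanes". */
    float sum = (s0 + s1) + (s2 + s3);

    /* Tail: handle the remaining 0 to 3 elements. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

How much this helps depends on the compiler, flags, and target. A real SIMD library handles the lane width, tail, and reduction per target, which is the part worth studying in the ops/ kernels.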

w4yai today at 7:35 PM

> It proves that modern LLMs can run without Python, PyTorch, or GPUs.

Did we need any proof of that?

behnamoh today at 8:14 PM

but why tho? next gemma is coming and no one uses gemma 3 in prod anyway.
