They also shipped Gemma models built on their new MatFormer architecture, which nests smaller sub-models inside the full model so you can choose how much compute to spend at inference time.
https://arxiv.org/pdf/2310.07707v2
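
For anyone wondering what the nesting looks like in practice, here is a rough toy sketch of the idea as I understand it from the paper: the feed-forward block's hidden layer is laid out so that a prefix of its units forms a smaller, usable block. This is not Gemma's actual code. The `NestedFFN` class, its `width` parameter, the ReLU activation, and the random weights are all assumptions made up for illustration, and the joint training of all widths that the paper describes is left out.

```python
import random


class NestedFFN:
    """Toy feed-forward block whose hidden layer can be cut to a prefix.

    Hypothetical illustration only: names, shapes, and the ReLU
    activation are assumptions, not taken from any real model.
    """

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.d_hidden = d_hidden
        # One row per hidden unit, so slicing rows selects hidden units.
        self.w_in = [[rng.uniform(-1, 1) for _ in range(d_model)]
                     for _ in range(d_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(d_model)]
                      for _ in range(d_hidden)]

    def forward(self, x, width=None):
        """Run the block using only the first `width` hidden units."""
        m = self.d_hidden if width is None else width
        if not 1 <= m <= self.d_hidden:
            raise ValueError("width must be between 1 and d_hidden")
        # Hidden activations for the selected prefix of units.
        hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w_in[:m]]
        # Project back to d_model using the matching rows of w_out.
        out = [0.0] * self.d_model
        for h, row in zip(hidden, self.w_out[:m]):
            for j, w in enumerate(row):
                out[j] += h * w
        return out


ffn = NestedFFN(d_model=4, d_hidden=8)
x = [1.0, -0.5, 0.25, 2.0]
full = ffn.forward(x)            # all 8 hidden units
small = ffn.forward(x, width=2)  # same weights, only the first 2 units
```

Both calls share one set of weights; the smaller one just does less work. In a trained model the sub-blocks only give good outputs because every width is optimized together, which this sketch does not attempt.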