#gemma
- Google DeepMind Releases Gemma 4 12B: Encoder-Free Multimodal Model That Runs on a 16 GB Laptop Google DeepMind models-llm
- Google Releases DiffusionGemma: 26B Open Model with 4× Faster Text Generation Google DeepMind models-llm
- Google DeepMind Releases Gemma 4 QAT Checkpoints: Sub-1 GB On-Device E2B Model Google DeepMind models-llm
- vLLM v0.23.0: Model Runner V2 Default for Llama and Mistral, Transformers v5, Multi-Tier KV Cache tools
- Ollama v0.23.1: Gemma 4 MTP Speculative Decoding Delivers 2× Speed on Apple Silicon tools