Google DeepMind Releases Gemma 4 12B: Encoder-Free Multimodal Model That Runs on a 16 GB Laptop
Google DeepMind
Google DeepMind released Gemma 4 12B on June 3, 2026. It is an open-weights, encoder-free multimodal model that natively ingests audio, video, and images, runs locally on a laptop with 16 GB of VRAM, and is licensed under Apache 2.0. It is the first medium-sized model with native audio understanding and is designed to power fully local agentic workflows via the Google AI Edge stack.
Why it matters
Brings frontier-grade multimodal and audio capabilities to consumer hardware without cloud dependency; first encoder-free design at this scale.
Importance: 4/5
Frontier open-weights multimodal model from Google DeepMind with native audio support; reported by the official Google blog and two independent media outlets.