Google DeepMind Releases Gemma 4 12B: Encoder-Free Multimodal Model That Runs on a 16 GB Laptop

Google DeepMind


Google DeepMind released Gemma 4 12B on June 3, 2026. It is an open-weights, encoder-free multimodal model that natively ingests audio, video, and images, runs locally on a laptop with 16 GB of VRAM, and is licensed under Apache 2.0. It is the first medium-sized model with native audio understanding and is designed to power fully local agentic workflows via the Google AI Edge stack.
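To put the 16 GB figure in context, here is a rough back-of-the-envelope sketch of weight memory at a few common precisions. It assumes the "12B" in the name means about 12 billion parameters, and it counts the weights only, ignoring KV cache, activations, and runtime overhead. The summary does not say which precision or quantization the 16 GB claim refers to, so the formats listed are illustrative assumptions rather than the published configuration.

```python
# Rough weight-memory estimate. All figures are illustrative assumptions,
# not numbers taken from the release announcement.
PARAMS = 12e9   # assumed from the "12B" in the model name
VRAM_GB = 16    # laptop memory budget quoted in the summary

# Bytes per parameter for a few common storage formats (assumed, not confirmed).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


estimates = {name: weight_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    verdict = "fits" if gb < VRAM_GB else "does not fit"
    print(f"{name}: {gb:.0f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Under these assumptions, 16-bit weights alone (24 GB) would exceed the budget, while 8-bit (12 GB) and 4-bit (6 GB) weights would leave headroom for everything else the runtime needs.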

Why it matters

Brings frontier-grade multimodal and audio capabilities to consumer hardware without cloud dependency; first encoder-free design at this scale.

Importance: 4/5

Frontier open-weights multimodal model from Google DeepMind with native audio support; covered by the official Google blog and two independent media outlets.

gemma open-weights multimodal on-device release

Sources

official: Introducing Gemma 4 12B (Google Blog)

media: Google's new open source Gemma 4 12B (VentureBeat)

media: Google DeepMind's Gemma 4 12B (The Decoder)