Thinking Machines Lab Unveils TML-Interaction-Small: 276B MoE Real-Time Multimodal Model

Thinking Machines Lab

Thinking Machines Lab (founded by former OpenAI CTO Mira Murati) released a research preview of TML-Interaction-Small on May 11, 2026. The 276B-parameter MoE model (12B active) uses a 200ms micro-turn architecture to process audio, video, and text simultaneously, without wait turns. On FD-bench v1.5 it achieves sub-400ms turn-taking latency, beating Gemini-3.1-flash-live and GPT-realtime-2.0. Access is limited to research partners.
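The announcement does not describe the model's internals, so the sketch below is only a conceptual illustration of what a 200ms micro-turn loop could look like. Every name in it (Frame, MicroTurnModel, run_loop) is hypothetical; the only figures taken from the announcement are the 200ms window and the sub-400ms turn-taking budget.

```python
# Conceptual sketch only: TML-Interaction-Small is not publicly available and
# its real interface is unknown. All names here are hypothetical. The point is
# the control flow: instead of waiting for a full user turn, the model consumes
# fixed 200ms windows ("micro-turns") of every modality and may respond on any
# of them.
import time
from dataclasses import dataclass

MICRO_TURN_MS = 200  # window size described in the announcement


@dataclass
class Frame:
    """One 200ms slice of multimodal input (hypothetical structure)."""
    audio: bytes = b""
    video: bytes = b""
    text: str = ""


class MicroTurnModel:
    """Toy stand-in for the model: returns text or None each micro-turn."""

    def step(self, frame: Frame) -> str | None:
        # A real model would fuse all three streams here; the toy version
        # just acknowledges text and stays silent otherwise.
        return f"ack: {frame.text}" if frame.text else None


def run_loop(model: MicroTurnModel, frames: list[Frame]) -> None:
    for frame in frames:
        started = time.monotonic()
        out = model.step(frame)  # the model sees every window, speaking or not
        if out is not None:
            print(out)           # respond immediately; no end-of-turn signal needed
        # Stay on the 200ms grid so turn-taking latency is bounded by a small
        # number of micro-turns (sub-400ms is at most two windows).
        elapsed_ms = (time.monotonic() - started) * 1000
        time.sleep(max(0.0, (MICRO_TURN_MS - elapsed_ms) / 1000))


if __name__ == "__main__":
    run_loop(MicroTurnModel(), [Frame(text="hello"), Frame(), Frame(text="stop")])
```

Under this reading, the benchmark result follows from simple arithmetic: because the model decides whether to speak on every window, worst-case response lag is a couple of 200ms micro-turns rather than a full user utterance plus external endpointing.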

Why it matters

The micro-turn architecture demonstrates that real-time interruption and multimodal co-presence can be achieved natively within the model rather than via external streaming scaffolding. It is also the first public model from Mira Murati's post-OpenAI lab.

Importance: 4/5

First model from Mira Murati's Thinking Machines Lab; novel micro-turn real-time multimodal architecture
