Daily digest
21 items · ~21 min · Week 2026-W20
Must-read (4)
OpenAI Launches $4B Deployment Company, Acquires Tomoro
OpenAI launched the OpenAI Deployment Company on May 11, 2026, a majority-OpenAI-owned venture backed by $4 billion from 19 investment firms including TPG, Bain Capital, and McKinsey. Simultaneously, OpenAI agreed to acquire Edinburgh-based Tomoro, an applied AI consulting firm, to staff the new company with approximately 150 Forward Deployed Engineers from day one. The Deployment Company's mandate is to embed FDEs inside enterprises to redesign workflows around frontier AI.
Thinking Machines Lab Unveils TML-Interaction-Small: 276B MoE Real-Time Multimodal Model
Thinking Machines Lab (founded by former OpenAI CTO Mira Murati) released a research preview of TML-Interaction-Small on May 11, 2026 — a 276B-parameter MoE model (12B active) using a 200ms micro-turn architecture to process audio, video, and text simultaneously without wait turns. On FD-bench v1.5, it achieves sub-400ms turn-taking latency, beating Gemini-3.1-flash-live and GPT-realtime-2.0. Access is limited to research partners.
SenseNova-U1: Open-Source Unified Multimodal Understanding and Generation via NEO-unify
SenseTime · SenseNova-U1 proposes NEO-unify, an architecture that eliminates both visual encoders and VAEs to natively unify image understanding and generation from first principles. Two model variants (8B dense and 30B MoE) achieve performance rivaling top understanding-only VLMs while simultaneously generating images at a 32× compression ratio. Weights and code are fully open-sourced.
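The quoted 32× compression ratio determines how many latent positions the generator must produce per image. A minimal sketch of that arithmetic, assuming the ratio refers to per-axis spatial downsampling (as in f32-style image tokenizers; the summary does not specify, so this is an assumption):

```python
# Token-count arithmetic for an image generator at a given compression
# ratio. Assumption: "32x" means each spatial axis is downsampled by 32,
# as in f32-style tokenizers; the summary does not pin this down.
def latent_positions(height: int, width: int, factor: int = 32) -> int:
    """Latent grid size for an image downsampled by `factor` per axis."""
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by the factor")
    return (height // factor) * (width // factor)

print(latent_positions(1024, 1024))  # 32 * 32 = 1024 latent positions
```

Under this reading, a 1024×1024 image collapses to a 32×32 latent grid, which is what makes native generation tractable without a separate VAE.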
Codex-Spark (GPT-5.3-Codex-Spark) Research Preview: 1000+ Tokens/Second Coding Model
OpenAI released GPT-5.3-Codex-Spark as a research preview for ChatGPT Pro users in the Codex app, CLI, and VS Code extension. The model is optimized to exceed 1000 tokens per second with a 128k context window, enabling real-time interruption and redirection while the model is generating. API access is rolling out to a small set of design partners.
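The headline throughput translates directly into wall-clock latency for a given output length. A back-of-the-envelope sketch, treating the advertised 1000 tok/s as a steady rate (real throughput will vary):

```python
# Rough streaming-latency arithmetic at the advertised throughput floor.
def stream_seconds(tokens: int, tokens_per_second: float = 1000.0) -> float:
    """Seconds to emit `tokens` at a steady `tokens_per_second` rate."""
    return tokens / tokens_per_second

print(stream_seconds(4_000))    # a 4k-token file in ~4.0 s
print(stream_seconds(128_000))  # filling the full 128k context: ~128.0 s
```

Seconds-scale generations are what make mid-stream interruption and redirection a usable workflow rather than a novelty.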
Worth knowing (8)
Alibaba Integrates Qwen AI with Taobao to Launch Agentic Conversational Shopping
Alibaba announced on May 11, 2026, plans to deeply integrate its Qwen AI platform with Taobao and Tmall, giving the Qwen app direct access to over four billion product listings so users can browse, compare, and purchase via natural-language conversation rather than keyword search. Taobao will also launch a Qwen-powered shopping assistant featuring virtual try-ons, a 30-day price-tracking tool, and an agent skills library covering logistics and after-sales services.
Baidu Releases ERNIE 5.1 at 6% of Industry Pre-Training Cost, Enters Global Top-10 Search
Baidu officially released ERNIE 5.1 on May 8–9, 2026, compressing total parameters to one-third and active parameters to one-half compared to ERNIE 5.0 while reducing pre-training costs to approximately 6% of comparable industry models. The model ranked 4th globally on the LMArena Search Leaderboard with a score of 1,223, making it the only Chinese model in the global top 10 for search. Baidu showcased ERNIE 5.1 further at its Create 2026 developer conference in Beijing on May 13–14.
RubricEM: Meta-RL with Rubric-Guided Policy Decomposition Beyond Verifiable Rewards
Google · RubricEM proposes using rubrics as a shared interface that structures policy execution, judge feedback, and agent memory across the full research-agent lifecycle. The framework combines stagewise policy decomposition with a novel Stage-Structured GRPO objective for denser semantic rewards during long-horizon tasks. RubricEM-8B matches proprietary deep-research systems on four long-form research benchmarks.
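The rubric-as-reward idea can be sketched as a weighted sum of judged criteria per stage, which is where the denser signal comes from relative to a single end-of-trajectory reward. All names below are illustrative; the paper's actual Stage-Structured GRPO objective is not reproduced here:

```python
# Toy rubric-scored reward: each stage has weighted criteria, each judged
# on a 0..1 scale. Stage and criterion names are invented for illustration.
def rubric_reward(stage_scores: dict[str, dict[str, float]],
                  weights: dict[str, dict[str, float]]) -> float:
    """Weighted sum of per-stage, per-criterion judge scores."""
    total = 0.0
    for stage, criteria in weights.items():
        for criterion, w in criteria.items():
            total += w * stage_scores[stage].get(criterion, 0.0)
    return total

weights = {"plan": {"coverage": 0.5}, "report": {"cited": 0.3, "coherent": 0.2}}
scores  = {"plan": {"coverage": 0.8}, "report": {"cited": 1.0, "coherent": 0.5}}
print(rubric_reward(scores, weights))  # 0.5*0.8 + 0.3*1.0 + 0.2*0.5 = 0.8
```

Because each stage contributes its own graded terms, partial progress on a long-horizon task still produces a nonzero learning signal.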
Claude Platform on AWS Reaches General Availability
Anthropic · AWS announced general availability of Claude Platform on AWS on May 11, 2026, making it the first cloud provider to offer Anthropic's native Claude Platform experience through existing AWS accounts. Customers authenticate via IAM, receive unified billing on a single AWS invoice, and get access to Claude Managed Agents, web search, code execution, Files API, Skills, MCP connectors, prompt caching, and citations — all operated by Anthropic outside the AWS security boundary. The service is available across 18 global regions.
OpenAI Launches Daybreak: AI-Powered Vulnerability Detection Platform
OpenAI announced Daybreak on May 12, 2026, a cybersecurity platform combining GPT-5.5 model variants and Codex Security to help organizations identify, validate, and remediate software vulnerabilities before attackers exploit them. The platform offers three GPT-5.5 tiers — standard, Trusted Access for Cyber for vetted defenders, and GPT-5.5-Cyber for red teaming — with capabilities spanning secure code review, threat modeling, patch validation, and dependency analysis. Major security vendors including Akamai, Cisco, Cloudflare, CrowdStrike, and Palo Alto Networks are already integrating Daybreak.
Google DeepMind Unveils Magic Pointer: AI-Aware Mouse Cursor for Chrome and Googlebook
Google DeepMind published research on May 12, 2026, reimagining the mouse pointer as an AI-aware interface that captures visual and semantic context around the cursor, enabling users to point at on-screen content and issue short natural-language commands without switching apps or typing full prompts. Two interactive demos are live in Google AI Studio; the feature is coming to Chrome's Gemini assistant and to Googlebook, Google's new line of Gemini-powered laptops.
Anthropic Launches Claude for Legal With 12 Plugins and 20+ MCP Connectors
Anthropic formally launched Claude for Legal on May 12, 2026, releasing 12 practice-area plugins covering commercial, corporate, employment, privacy, IP, and litigation workflows, along with over 20 MCP connectors linking Claude Cowork to legal software including DocuSign, iManage, NetDocuments, Westlaw, and Box. Major law firms Freshfields, Quinn Emanuel, and Holland & Knight are already using Claude on live matters, and Anthropic reported legal as the top function in Claude Cowork with three times the usage of any other job category.
Gemini Omni Video Model Surfaces Ahead of Google I/O 2026
Google DeepMind · On May 11, 2026, a new model card labeled 'Omni' surfaced within the Gemini app UI, described as a video model that supports in-chat editing, video remixing, and template generation. Early demo outputs showed strong text rendering in video and complex scene composition; metadata suggests Omni is an extension of Google's Veo line. The model had not been officially announced, with Google I/O 2026 (May 19–20) expected as the formal unveil.
For reference (9)
OpenAI DALL-E 2 and DALL-E 3 APIs Shut Down on May 12
OpenAI's DALL-E 2 and DALL-E 3 API endpoints were permanently shut down on May 12, 2026, as scheduled in the deprecation notice issued in November 2025. After the cutoff, requests using the dall-e-2 or dall-e-3 model strings return errors with no automatic fallback. OpenAI recommends migration to gpt-image-1.5 or gpt-image-1-mini as replacements.
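Because there is no automatic fallback, callers still pinning the retired model strings need an explicit swap before the request is built. A minimal guard, assuming gpt-image-1.5 as the preferred target (the notice names both replacements without mapping them to specific predecessors):

```python
# Route retired DALL-E model strings to a replacement before the API call.
# The retired set and replacement names come from the deprecation notice;
# the helper itself is illustrative, not an official migration shim.
RETIRED = {"dall-e-2", "dall-e-3"}

def resolve_image_model(model: str, replacement: str = "gpt-image-1.5") -> str:
    """Return `replacement` for retired DALL-E strings, else pass through."""
    return replacement if model in RETIRED else model

print(resolve_image_model("dall-e-3"))          # gpt-image-1.5
print(resolve_image_model("gpt-image-1-mini"))  # passes through unchanged
```

Centralizing the swap in one resolver keeps the model string out of call sites, so the next deprecation only requires touching one table.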
World Action Models: First Systematic Survey of Embodied Foundation Models Unifying World Modeling and Action
OpenMOSS · This survey defines World Action Models (WAMs) as embodied foundation models that unify predictive state modeling with action generation, addressing the limitation of Vision-Language-Action models that learn reactive mappings without explicitly modeling environmental dynamics. The paper provides the first formal taxonomy distinguishing Cascaded and Joint WAM variants, and analyzes data sources, training protocols, and evaluation challenges.
Learning, Fast and Slow: Dual-Weight Architecture for Continual LLM Adaptation
Inspired by dual-process cognitive theory, this paper proposes Fast-Slow Training (FST) where model parameters serve as slow weights and optimized context serves as fast weights. FST achieves up to 3x greater sample efficiency over parameter-only fine-tuning on reasoning tasks while maintaining significantly lower divergence from the base model, reducing catastrophic forgetting in sequential task settings.
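The fast/slow split can be illustrated with a toy scalar predictor in which one parameter takes small, lasting updates and another takes large, task-local ones. This is our illustration of the dual-process idea only, not the paper's FST algorithm, and all names are invented:

```python
# Toy dual-weight learner for y = slow_w * x + fast_b: the slow weight
# gets a small learning rate (lasting knowledge), the fast weight a large
# one (rapid task adaptation). Illustrative only, not the paper's method.
def adapt(data, slow_w=1.0, fast_b=0.0, slow_lr=0.01, fast_lr=0.5, epochs=500):
    for _ in range(epochs):
        for x, y in data:
            err = slow_w * x + fast_b - y
            slow_w -= slow_lr * err * x  # slow weight drifts gradually
            fast_b -= fast_lr * err      # fast weight adapts within the task
    return slow_w, fast_b

# Fit y = 2x + 1 from two points; the fast weight absorbs most of the
# early error while the slow weight converges gradually toward 2.
slow_w, fast_b = adapt([(1.0, 3.0), (2.0, 5.0)])
```

In FST the roles are played by optimized context (fast) and model parameters (slow); the toy just shows how the two learning rates split responsibility for new versus retained behavior.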
Claude Code v2.1.139–v2.1.140: Agent View, /goal Command, and PostToolUse Hook Output
Anthropic shipped two Claude Code releases on May 11–12. v2.1.139 added a Research Preview 'agent view' (claude agents lists all sessions), a /goal command that keeps the agent working until a defined condition is met, and PostToolUse hook output replacement. v2.1.140 followed with case-insensitive Agent subagent_type matching, fixes for /goal hanging when hooks are disabled, and symlinked settings hot-reload.
GitHub Copilot CLI v1.0.45: /autopilot and /fork Slash Commands
GitHub Copilot CLI v1.0.45 (May 11, 2026) adds a /autopilot slash command to toggle between interactive and fully autonomous modes mid-session, a /fork command to branch the current session into an independent new session, and aligns OpenTelemetry output with GenAI semantic conventions. Startup time improved by approximately 1.5 seconds on terminals with limited OSC color support.
OpenClaw v2026.5.12-beta: Subagent Session Nesting and 20-Turn Agent-to-Agent Ping-Pong
OpenClaw shipped three beta releases on May 12–13 (beta.2 through beta.4). Key additions include nesting subagent sessions under their parent in the session picker, expanding agent-to-agent communication to allow up to 20 ping-pong turns, per-sender tool policies, and enhanced Slack integration with reply broadcasting and link-preview suppression.
vLLM v0.21.0rc1: PyTorch 2.11, HuggingFace Transformers v5, and Python 3.14 Support
vLLM published v0.21.0rc1 on May 12, 2026, advancing the baseline to PyTorch 2.11 and HuggingFace Transformers v5, and adding Python 3.14 to the supported versions. The RC follows the v0.20.2 patch (May 10) which stabilized DeepSeek V4 support and fixed KV block allocation errors in the V1 engine.
OpenCode v1.14.47–v1.14.48: Full-Resolution Image Attachments and Keybinding Fixes
SST released OpenCode v1.14.47 (May 11) restoring prompt-editing keybindings in the TUI textarea, making model selections persist across sessions, and adding configurable large-image auto-resize. v1.14.48 changed the agent to preserve original image attachments at full resolution instead of resizing before sending to the model.
Ollama v0.23.3: MLX Runner Fixes and macOS 26 Metal Compatibility
Ollama v0.23.3 (May 12, 2026) fixes a status timeout during MLX inference, addresses macOS 26 target leakage in the Metal library compilation, and refines ImageGen runner behavior with MLX thread affinity optimization. This follows v0.23.2 (May 7) which added 6.7x faster /api/show response times via API caching.