Daily digest
12 items · ~12 min · Week 2026-W23
Dropped as prior-digest duplicates: Microsoft MAI model family (2026-06-02), Anthropic IPO S-1 (2026-06-02), MiniMax M3 (2026-06-02), Qwen3.7-Plus (2026-06-02), MiniMax Hailuo 2.3 (2026-06-03), Ideogram 4.0 (2026-06-04), Suno $400M Series D (2026-06-04). Dropped: Sber GigaChat for Global South — all 3 media sources syndicated from Reuters (counts as 1 independent source, no official; fails verification). Dropped: OpenClaw — only official source is github.com/openclaw/openclaw 2026.6.2-beta meta-release with no media confirmation. Tag warnings (new tags, lenient mode): memory (new). Research note: HuggingFace trending shows AIM paper at 101 upvotes; MLEvolve at 301 upvotes — both above 100-upvote threshold.
Must-read (1)
MLEvolve: Self-Evolving Multi-Agent LLM Framework for Automated ML Algorithm Discovery
MLEvolve is a self-evolving multi-agent LLM framework for automated machine learning algorithm discovery. It introduces Progressive Monte Carlo Graph Search (MCGS) with cross-branch information flow, Retrospective Memory (cold-start knowledge base plus dynamic task-specific memory), and hierarchical planning that decouples strategy from code generation. On MLE-Bench, it achieves state-of-the-art medal rate within a 12-hour budget — half the standard runtime — and outperforms AlphaEvolve on mathematical algorithm optimization tasks. Open-source code is available on GitHub.
Worth knowing (8)
US Congress Releases 269-Page 'Great American AI Act' Draft with 3-Year State Law Preemption
On June 4, 2026, Reps. Jay Obernolte (R-CA) and Lori Trahan (D-MA) released a 269-page bipartisan discussion draft of the Great American AI Act — the first comprehensive US federal AI governance framework. Key provisions: three-year preemption of state AI development laws (with sunset; deployment laws not preempted), formal CAISI establishment, $100M/year for a Center for AI Standards and Innovation, frontier model governance requirements, and workforce impact reporting. The draft has drawn criticism from labor unions and civil society groups over the state preemption scope.
The Deterministic Horizon: Information-Theoretic Proof That Extended CoT Fails and Tool Use Is Necessary
The paper proves an Attention Bottleneck Theorem establishing information-theoretic limits on how far decoder-only transformers can track state in purely neural chain-of-thought. A Deterministic Horizon exists at approximately 19-31 steps beyond which accuracy collapses super-exponentially. Across 12 models and 8 task domains (SWE-Bench, WebArena, SQL-Multi), tool-integrated reasoning achieves 86-94% accuracy versus 24-42% for neural CoT. Fine-tuning improves performance by less than 5%, confirming the limits are architectural, not training-related. Accepted at ICML 2026.
The Self-Correction Illusion: LLMs Fix Others' Errors but Not Their Own — Role Labels Are the Cause
LLMs readily fix errors when presented as external input but fail to correct identical errors framed as their own prior output. The paper isolates the cause: chat-template role labels (user message vs. internal thought vs. tool output vs. system memory), not the content itself. Relabeling an internal erroneous claim as an external source increases explicit correction rates by 23-93 percentage points across 7 model families and 3 domains (p < 0.001 in 10/13 test cells). A prompt-structure intervention requiring no retraining achieves significant improvements.
Audio Interaction Model: Unified Streaming Framework Combining Offline and Real-Time Audio Instruction Following
Researchers from the National University of Singapore published the Audio Interaction Model (AIM), a unified streaming audio framework that combines offline task execution (transcription, translation, music generation) with real-time audio instruction following through an end-to-end architecture. AIM achieves simultaneous low-latency streaming and high-quality offline audio processing without separate models for each task mode, receiving 101 upvotes on HuggingFace Daily Papers.
OpenAI Launches Dreaming V3: Background Memory Synthesis for ChatGPT with 5x Compute Reduction
OpenAIOpenAI began rolling out Dreaming V3 on June 4-5, 2026 — a background process that automatically synthesizes ChatGPT memory from many conversations simultaneously, replacing the manual saved-memories list as ChatGPT's memory foundation. The system prioritizes freshness (auto-updating stale memories), continuity (linking sessions over days or weeks), and relevance filtering. Internal factual-recall evals improved from 41.5% (2024) to 82.8% (2026). A roughly 5x compute reduction makes free-tier rollout viable; Plus and Pro users in the US receive it first.
OpenAI Rolls Out Lockdown Mode to Block Prompt-Injection Exfiltration in ChatGPT
OpenAIOpenAI launched Lockdown Mode on June 5, 2026 — an optional advanced security setting that restricts ChatGPT's outbound network capabilities (web browsing, Deep Research, Agent Mode, file downloads) to block data exfiltration via prompt injection attacks. Available to all logged-in personal accounts (Free, Plus, Pro) and self-serve ChatGPT Business. A companion Elevated Risk label surfaces across ChatGPT, ChatGPT Atlas, and Codex to flag high-risk operations.
xAI Grok Imagine Video 1.5: Image-to-Video with Native Audio Tops Arena Leaderboard, API Now Live
xAIxAI shipped Grok Imagine Video 1.5 as a preview on May 30-31, 2026; the API became available on June 3 at api.x.ai under alias `grok-imagine-video-1.5-2026-05-30`. The model animates a still image (or text prompt) into a clip with native synchronized audio — music, sound effects, and lip-synced dialogue — supporting video extension and reference-guided generation at 720p. At launch it claimed the top position on the Image-to-Video Arena leaderboard with a 52 Elo-point jump over v1.0. Pricing: $0.08/s at 480p, $0.14/s at 720p.
Google Veo 3.1 Brings Audio to All Flow Editing Modes and New Insert/Remove Tools
Google DeepMindGoogle published an official update on June 5, 2026 announcing new Veo 3.1 capabilities inside its Flow video editing platform. The update brings audio generation to previously audio-free features — Ingredients to Video, Frames to Video, and Extend — and introduces precision editing tools including an Insert function that adds new scene elements with realistic lighting, plus an upcoming Remove tool to erase unwanted objects with background reconstruction. Veo 3.1 is also available via the Gemini API and Vertex AI. Over 275 million videos have been created on Flow since launch.
For reference (3)
Sber Launches GigaChat-Powered Multi-Agent Business Assistant for Corporate Banking at SPIEF 2026
SberAt the St. Petersburg International Economic Forum (SPIEF, June 3-6, 2026), Sber announced a new Business Assistant for its SberBusiness mobile app — a conversational AI interface built on GigaChat that replaces traditional internet banking. The system uses a multi-agent architecture with over 160 specialized AI agents covering payments, accounts, analytics, and documentation. A limited advisory version is already handling over 7.5 million queries from more than one million entrepreneurs. Full rollout is planned for autumn 2026.
Claude Code v2.1.166: Fallback Model Config, Expanded Deny-Rule Globs, Cross-Session Security
AnthropicClaude Code v2.1.166 (first seen June 6) adds a `fallbackModel` setting to configure up to three fallback models tried in order when the primary model is overloaded, expanded deny-rule glob support, and hardened cross-session message security. Also disables thinking on models that think by default via `MAX_THINKING_TOKENS=0` and per-model toggles. Fixes a wide range of terminal, auth, session, and UI bugs including recurring JetBrains terminal rendering issues, PowerShell command validation hangs, and voice-mode auth clearing. Two earlier releases on June 5 (v2.1.163, v2.1.165) added `/plugin list` with filtering, `requiredMinimumVersion`/`requiredMaximumVersion` managed settings, and hooks returning `additionalContext`.
OpenCode v1.16: Workspace Cloning, 38% Faster Startup, Snowflake Cortex Provider, Session Replay
OpenCode (SST) released v1.16.0 and v1.16.2 on June 5, 2026. v1.16.0 adds managed workspace cloning that preserves dirty and untracked files, cross-workspace session movement, proper OpenAI model support via AWS Bedrock, skill discovery with file-based agent loading, new color themes and thinking-level selector for desktop, and a `run --replay` mode for interactive session replay. Startup time improved by 38%. v1.16.2 fixes reasoning summaries to only run on providers that support them (avoiding GPT-5 failures), refuses loose edit matches to prevent overwriting wrong code, resolves Bedrock session hangs, adds diff viewer hunk navigation, and adds Snowflake Cortex as a new LLM provider.