Daily digest
18 items · ~18 min · Week 2026-W26
Must-read (3)
ByteDance Launches Doubao-Seed-2.1-Pro at Volcano Engine FORCE Conference
ByteDanceByteDance unveiled Doubao-Seed-2.1-Pro on June 23 at the Volcano Engine FORCE conference in Beijing — a production-level frontier LLM for coding, long-horizon agentic tasks, and multimodal understanding. Also released: Doubao-Seed-2.1-Turbo at half the price (6 yuan per million input / 30 yuan per million output tokens for Pro). ByteDance claims parity with GPT-5.5 on coding and agent benchmarks, topping OSWorld, MobileWorld, and MMMU-Pro. The Doubao family now exceeds 180 trillion daily token calls — up 10x year-over-year.
Anthropic Launches Claude Tag: A Persistent AI Teammate for Slack
AnthropicAnthropic launched Claude Tag in beta on June 23, 2026, for Claude Enterprise and Team customers. It adds Claude as a persistent, multiplayer Slack team member that users can @-mention to delegate tasks. Claude learns from channel history over time, can work asynchronously, and — when ambient mode is enabled — proactively flags relevant information without being prompted. The feature runs on Claude Opus 4.8 and replaces the existing Claude for Slack app. Anthropic reports that an internal version already generates 65% of its product team's code.
OpenAI Expands Daybreak with Full GPT-5.5-Cyber Release, Codex Security Plugin, and Patch the Planet
OpenAIOn June 22, 2026, OpenAI expanded its Daybreak cybersecurity platform with the full release of GPT-5.5-Cyber (scoring 85.6% on CyberGym — the highest single-model result to date), a Codex Security plugin for finding and patching vulnerabilities within developer workflows, and 'Patch the Planet' — an open-source initiative co-founded with Trail of Bits. Access to GPT-5.5-Cyber remains restricted to verified defenders. The Cyber Partner Program now includes over 20 vendors including Cisco, CrowdStrike, Palo Alto Networks, and Cloudflare; over 30 open-source projects including cURL, Go, and Python have committed to Patch the Planet.
Worth knowing (9)
Krea Releases Krea 2 Raw and Turbo Open Weights: 12B DiT Image Model Generating in 2 Seconds
KreaKrea released open weights for Krea 2 on June 22, 2026 via Hugging Face under a custom community license (commercial use requires enterprise agreement for organizations with 50+ seats). Two variants: Krea 2 Raw (pre-RLHF base checkpoint from mid-training) and Krea 2 Turbo (distilled, post-trained). The 12B Diffusion Transformer generates images in approximately 2 seconds with Turbo. Krea reports 30 million users across 191 countries.
Google DeepMind and A24 Announce $75M AI Research Partnership for Filmmaking
Google DeepMindGoogle DeepMind invested $75 million into film studio A24 and announced a multi-year, non-exclusive research and development partnership on June 22, 2026. DeepMind researchers will work alongside A24 filmmakers on active productions to develop AI-powered workflows, with Veo as the central technology. This is Google's first-ever equity stake in a film studio.
Yandex Self-Driving Truck Completes First Fully Autonomous 700km Moscow–Saint Petersburg Run
YandexOn June 23, 2026, Yandex's Robotrak autonomous truck completed a 700km fully driverless journey from Moscow to Saint Petersburg along the M-11 highway — the first such feat in Russia. The AI-powered system handled overtaking, road construction zones, and toll plazas at approximately 90 km/h. A safety driver was present but did not touch the controls. Yandex published an uncut 8-hour video log of the trip.
Prime Intellect Releases prime-rl v0.6.0 for Agentic RL on Trillion-Parameter MoE Models
Prime IntellectPrime Intellect released prime-rl v0.6.0 (June 22–23, 2026), an open-source framework for asynchronous reinforcement learning on trillion-parameter MoE models targeting long-horizon agentic tasks like software engineering. The framework decouples trainer and inference into independent async processes. A GLM-5 demonstration ran SWE tasks at 131K sequence length with sub-5-minute step times and 256 rollout batch size on only 28 H200 nodes. Router replay cuts KL mismatch between trainer and inference by roughly 10x.
Qwen-AgentWorld: Language World Models for General Agents across Seven Environments
Alibaba/QwenAlibaba's Qwen team published Qwen-AgentWorld (arXiv 2606.24597, June 23), introducing language world models — 35B-A3B and 397B-A17B MoE variants — that simulate seven agentic environments: MCP, Search, Terminal, Software Engineering, Android, Web, and OS. Trained on over 10 million real environment interaction trajectories. Also introduces AgentWorldBench covering all seven domains. The models can serve as scalable RL training simulators or as warm-up training for downstream agent tasks.
Sakana AI Releases Fugu: Multi-LLM Orchestrator Achieving SoTA on SWE-Bench Pro
Sakana AISakana AI published the Fugu Technical Report (arXiv 2606.21228, revised June 23, 2026). Fugu is a family of orchestrator models trained to coordinate an adaptive team of specialized LLMs, dynamically devising agent scaffolds tailored to each query via fine-tuning, evolutionary algorithms, and RL. Two variants: Fugu (performance/latency balance) and Fugu-Ultra (maximum quality). Achieves state-of-the-art results on SWE-Bench Pro, Terminal Bench, LiveCodeBench, and GPQA-Diamond among publicly accessible models.
Mistral Releases OCR 4 with Bounding Boxes, Block Classification, and 170-Language Support
MistralMistral published OCR 4 on June 23, 2026. New capabilities include per-word bounding boxes, typed block classification (titles, tables, equations, signatures), and per-word confidence scores — enabling source-grounded citations and spatial indexing. The model supports 170 languages across 10 language groups, handles PDF, DOC, PPT, and OpenDocument formats, and runs self-hosted in a single container. On OlmOCRBench it scores 85.20 (top overall) and 93.07 on OmniDocBench. Pricing: $4/1,000 pages via API, $2 with Batch API.
xAI Launches /goal in Grok Build for Long-Running Autonomous Coding Tasks
xAIxAI shipped a new /goal command in Grok Build on June 22, 2026, enabling long-running autonomous task execution in its terminal-based coding agent. When invoked, the agent creates a progress checklist, then works through it step by step — including code review, webpage inspection, and script execution — until the task is completed and verified. The feature uses a multi-model architecture combining Composer 2.5 and Grok Build 0.1. Access is currently limited to SuperGrok Heavy subscribers ($300/month).
ByteDance Previews Seedance 2.5: Native 4K, 30-Second Video with 50 Reference Inputs
ByteDanceAlso at the June 23 Volcano Engine FORCE conference, ByteDance previewed Seedance 2.5, its next-generation video model. The model generates native 30-second single-clip video at 4K resolution with 10-bit color depth, and accepts up to 50 multimodal reference inputs simultaneously — images, audio, 3D models, style references — compared to 12 in the previous version. Post-generation local editing preserves visual style. The model is in global enterprise beta; public launch is targeted for early July 2026.
For reference (6)
SHERLOC: Structured Diagnostic Localization Cuts Code Repair Token Usage by 36.7%
SHERLOC (arXiv 2606.24820, June 23) is a training-free framework addressing fault localization in repository-level code repair. It pairs a reasoning LLM with compact repository tools and a self-recovery mechanism to produce structured diagnostic outputs. Achieves 84.33% accuracy@1 on SWE-Bench Lite while reducing total token usage by 36.7%, and improves downstream repair agent resolve rate by 5.95 percentage points.
GitHub Copilot CLI Redesigned Terminal Interface Reaches General Availability
GitHubThe redesigned GitHub Copilot CLI terminal interface, previewed at Microsoft Build 2026, is now generally available. It introduces a tabbed layout (Session, Gists, Issues, Pull Requests) for navigating GitHub directly from the terminal, guided in-session tool configuration via `/mcp add`, `/skills`, and `/plugin` commands instead of manual file editing, and theme-aware accessible colors with screen reader support.
Yandex Alice AI Gains Agentic Booking for Restaurants and Beauty Salons Across Russia
YandexYandex launched an AI agent booking capability inside Alice AI chat on June 23, 2026. Users can now book restaurant tables and beauty salon appointments via natural-language conversation, covering over 30,000 restaurants and 40,000 service businesses nationwide. For venues connected to Yandex Eats, bookings confirm automatically; for others, Alice fills out reservation forms on the venue's website. Available in alice.yandex.ru, the Alice AI app, Yandex Browser, and the main Yandex app.
Claude Code v2.1.187: Sandbox Credential Isolation and Remote MCP Hang Fix
AnthropicClaude Code v2.1.187 (June 23) adds a `sandbox.credentials` setting that blocks sandboxed commands from reading credential files and secret env vars, adds org-configured model restrictions to the model picker, and fixes remote MCP tool calls that previously hung for up to 5 minutes before aborting.
Cursor 3.9 Launches Unified Customize Page for Plugins, Skills, MCPs, and Subagents
CursorCursor 3.9 (June 22) consolidates plugins, skills, MCPs, subagents, rules, commands, and hooks into a single Customize page manageable at user, team, or workspace scope. A marketplace leaderboard surfaces the most popular extensions across a team with one-click installation. Plugins now support prebuilt canvases (e.g., Hex Canvas for data visualizations, Atlassian Canvas for live issue tracking). Team marketplaces expanded to import plugin repos from GitLab, BitBucket, and Azure DevOps.
Modal Launches Auto Endpoints for Production-Grade Open-Model LLM Inference
ModalModal published Auto Endpoints on June 23, 2026. The product deploys optimized, OpenAI API-compatible LLM inference endpoints with a single command, selecting GPU type, region, and inference engine flags automatically, while keeping the full serving code visible and editable. It includes speculative decoding with custom drafter models. The backing Modal App is fully inspectable and forkable.