Daily digest
22 items · ~22 min · Week 2026-W27
Must-read (4)
US Lifts Export Controls on Anthropic's Claude Fable 5 and Mythos 5
Anthropic · The US Department of Commerce rescinded export licensing requirements on Claude Fable 5 and Mythos 5. Both models had been suspended since June 12, with national security concerns cited as the reason. Anthropic announced the decision June 30; Fable 5 returned to global users July 1, with Mythos 5 restored to select US organizations. Anthropic agreed to proactively detect security risks and co-author a jailbreak severity scoring framework with Amazon, Microsoft, and Google.
Anthropic Launches Claude Sonnet 5 as New Default Model
Anthropic · Anthropic released Claude Sonnet 5 on June 30, making it the default model for Free and Pro users. It delivers near-Opus 4.8 performance across coding, agentic tasks, and professional work, with introductory API pricing of $2/$10 per million tokens (input/output) through August 31, 2026.
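For a rough sense of what the introductory rates mean per request, the arithmetic can be sketched as below. Only the $2/$10 per-million-token rates come from the item above; the function name and the example token counts are hypothetical, and the sketch ignores any caching, batching, or tiered discounts that may apply.

```python
from decimal import Decimal

# Introductory Claude Sonnet 5 rates quoted in the item above (USD per million tokens).
INPUT_USD_PER_MTOK = Decimal("2")
OUTPUT_USD_PER_MTOK = Decimal("10")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the introductory rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MTOK * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MTOK * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: 50,000 input tokens and 4,000 output tokens.
example_cost = estimate_cost_usd(50_000, 4_000)
print(f"${example_cost:.2f}")  # prints $0.14
```

Output tokens cost five times as much as input tokens at these rates, so long generations tend to dominate the bill even when prompts are large.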
Orca: BAAI's General World Foundation Model Trained on 125K Hours of Video
BAAI · Orca is a general world foundation model from BAAI trained on 125K hours of video and 160M event annotations. It introduces Next-State-Prediction as a unified objective, combining unconscious learning from dense video transitions and conscious learning from language-described events. Evaluated on text generation, image prediction, and embodied action, it outperforms same-scale specialized baselines across all three modalities.
ByteDance Seedance 2.5 Launches Publicly: 30-Second Native 4K Video
ByteDance · ByteDance opened public access to Seedance 2.5 in early July via Dreamina and the Volcano Engine API. The model generates a single continuous 30-second clip in native 4K (10-bit color) from one inference call, up from the 10-second ceiling of Seedance 2.0. It accepts up to 50 multimodal reference materials simultaneously (images, video clips, audio), enabling strong character and style consistency, plus local editing for repainting specific regions without regenerating the entire clip.
Worth knowing (11)
Dockerless: Environment-Free Program Verifier for Coding Agents
ByteDance · Dockerless is a code patch verifier that assesses correctness through agentic repository exploration instead of executing tests in Docker containers. It outperforms the strongest open-source execution-based verifier by 14.3 AUC points and achieves 62.0% resolve rate on SWE-bench Verified when used for both trajectory filtering and RL reward generation, enabling a fully environment-free post-training pipeline for coding agents.
ByteDance Releases Seed2.0 Model Card for Frontier Model
ByteDance · ByteDance's Seed team released the model card for Seed2.0, a frontier model targeting long-tail knowledge and complex instruction following. The model delivers top performance on reasoning, visual understanding, and search tasks, with the evaluation framework grounded in realistic complex scenarios rather than synthetic benchmarks.
DOPD: Dual On-Policy Distillation with Advantage-Aware Token Routing
DOPD addresses the 'privilege illusion' problem in on-policy knowledge distillation by introducing an advantage-aware dual distillation paradigm that routes supervision token-by-token between teacher and student based on their advantage gap. The method consistently improves over standard on-policy distillation across both LLMs and VLMs, with demonstrated gains in continual learning and out-of-distribution robustness.
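To make the idea of token-by-token routing concrete, here is a toy sketch of one possible reading. The summary above does not say how advantages are computed or what the routing rule is, so the function name, the `margin` threshold, and the "teacher"/"student" labels are all assumptions for illustration, not the paper's algorithm.

```python
from typing import List, Sequence


def route_supervision(
    teacher_adv: Sequence[float],
    student_adv: Sequence[float],
    margin: float = 0.0,
) -> List[str]:
    """Pick a supervision source for each token from the advantage gap.

    Hypothetical rule: use the teacher's signal only where its per-token
    advantage exceeds the student's by more than `margin`; otherwise keep
    the student's own on-policy signal.
    """
    if len(teacher_adv) != len(student_adv):
        raise ValueError("advantage sequences must have the same length")
    routes = []
    for t_adv, s_adv in zip(teacher_adv, student_adv):
        gap = t_adv - s_adv
        routes.append("teacher" if gap > margin else "student")
    return routes


# Made-up per-token advantages for a three-token sequence.
print(route_supervision([0.9, 0.1, 0.5], [0.2, 0.4, 0.5]))
# prints ['teacher', 'student', 'student']
```

In a real training loop the routing decision would presumably select or weight a per-token loss term rather than return labels; the labels here only make the decision visible.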
Anthropic Launches Claude Science, an AI Workbench for Researchers
Anthropic · Anthropic launched Claude Science in beta on June 30, a desktop AI workbench integrating Claude with local code execution, 60+ scientific databases (genomics, proteomics, structural biology, cheminformatics), specialist sub-agents, and an automated reviewer agent that audits citations and calculations. Available on macOS and Linux for Pro, Max, Team, and Enterprise users.
Yandex Launches AI Agent Platform for Alice AI
Yandex · Yandex unveiled a platform for building and integrating AI agents into Alice AI, announced June 29. Initial agents include Yandex Taxi (voice-activated ride requests) and Yandex Lavka (grocery ordering). The platform allows agents to understand natural language, plan multi-step action sequences, and factor in context like time and weather. Opening to third-party companies is planned for later in 2026.
Claude Code v2.1.198: Claude in Chrome GA and Background Agent Auto-PRs
Anthropic · Claude Code v2.1.198 (July 1) ships Claude in Chrome as generally available, enabling agents to drive a browser as part of coding workflows. Background agents launched from `claude agents` now automatically commit their work, push the branch, and open a draft PR when they finish. Also added: background agent notifications, Explore agent inheriting the main session's model, and extended thinking configuration propagation to subagents.
GitHub Copilot Browser Tools Generally Available in VS Code
GitHub · Browser tools for GitHub Copilot in VS Code are generally available as of July 1. Copilot agents can now open pages and navigate, click, type, hover, drag, and capture screenshots of live web apps, feeding findings back into chat context. Privacy safeguards keep browser tabs private until the user explicitly shares them, agent-opened pages run in isolated sessions, and high-risk permissions require explicit user approval.
GitHub Copilot Vision Generally Available for All Plan Tiers
GitHub · Copilot vision is now generally available across all subscriber tiers (Free, Pro, Pro+, Business, and Enterprise) as of July 1. Developers can attach images (JPEG, PNG, GIF, WebP) and PDF documents directly to Copilot Chat prompts via paste, drag-and-drop, or right-click in VS Code, github.com, and the Copilot CLI.
vLLM v0.24.0: Model Runner V2 Default, Rust Frontend, SM90 FP8 Speedups
vLLM · vLLM v0.24.0 (released ~June 30) incorporates 571 commits from 256 contributors. Model Runner V2 is now the default engine for quantized models as well as Llama and Mistral dense models. The Rust frontend is production-ready with API-key authentication, CORS, and new tokenization endpoints. SM90 CUTLASS FP8 kernels deliver 180–290% kernel speedup on H100-class hardware. DeepSeek-V4 gets FlashInfer sparse-index caching, and new model support includes MiniMax-M3 and DiffusionGemma.
Cursor 3.9: iOS Public Beta and Team MCP Marketplace Expansion
Cursor · Cursor shipped two updates within the window. Version 3.9 (June 29) launched a public iOS beta for all paid plans, enabling developers to launch and steer cloud coding agents from mobile via voice input and monitor live activity on the lock screen. On June 30, Team Marketplace support was expanded to include Team MCP servers: admins configure MCP servers once, and they distribute automatically to cloud agents, the Agents window, IDE, and CLI.
Google Releases Gemini Omni Flash for Video Generation via API
Google DeepMind · Google released Gemini Omni Flash on June 30, a multimodal model for video generation and conversational video editing, available via Google AI Studio and the Gemini API at $0.10 per second of video output. Also released to GA: Gemini 3.1 Flash-Lite Image. Both are available on the Gemini Enterprise Agent Platform and ship simultaneously in YouTube Shorts Remix and YouTube Create.
For reference (7)
BlockPilot: Instance-Adaptive Block Size for Diffusion-Based Speculative Decoding
BlockPilot shows that the optimal block size in diffusion-based speculative decoding varies per input and formulates block size selection as a lightweight policy learned from the prefilling representation. Applied to Qwen3-4B, it achieves an acceptance length of 5.92 tokens and a 4.20× inference speedup at temperature T=1, with negligible overhead, and is plug-and-play on top of existing speculative decoding systems.
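As an illustration of what a lightweight per-input block-size policy could look like, the sketch below scores a few candidate block sizes with a linear layer over a prefill feature vector and picks the highest-scoring one. The candidate sizes, the weights, the feature vector, and the function name are all hypothetical; the summary above does not describe the actual policy architecture or how it is trained.

```python
from typing import Sequence

# Hypothetical candidate block sizes; the real candidate set is not given above.
CANDIDATE_BLOCK_SIZES = (4, 8, 16)


def select_block_size(
    prefill_repr: Sequence[float],
    weights: Sequence[Sequence[float]],
    candidates: Sequence[int] = CANDIDATE_BLOCK_SIZES,
) -> int:
    """Choose a block size by scoring each candidate against the prefill features.

    `weights` holds one row per candidate; each score is the dot product of
    that row with `prefill_repr`, and the top-scoring candidate is returned.
    """
    if len(weights) != len(candidates):
        raise ValueError("need exactly one weight row per candidate block size")
    scores = []
    for row in weights:
        if len(row) != len(prefill_repr):
            raise ValueError("weight row length must match the representation size")
        scores.append(sum(w * x for w, x in zip(row, prefill_repr)))
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return candidates[best_index]


# Made-up weights and two made-up prefill feature vectors.
toy_weights = [[0.1, 0.3], [0.5, 0.1], [0.2, 0.9]]
print(select_block_size([1.0, 0.0], toy_weights))  # prints 8
print(select_block_size([0.0, 1.0], toy_weights))  # prints 16
```

Because the decision is made once from features already computed during prefill, a policy of this shape adds only a small dot product per request, which is consistent with the "negligible overhead" claim.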
VK Rolls Out Discovery AI Neural Search Across VK Video, Mail Media, and Dzen
VK AI · VK began deploying Discovery AI, a generative AI-powered neural search built on its proprietary LLM, into VK Video, Mail.ru media projects, and Dzen. The system generates personalized search responses in under 0.5 seconds, supports a Deep Research mode for detailed exploration, and adapts to each product's use case. Discovery AI unifies VK's search, recommendation, and personalization infrastructure.
xAI Launches Grok Voice Agent Builder for No-Code Voice AI Deployment
xAI · xAI launched Voice Agent Builder on July 1, a no-code platform that consolidates speech-to-text, language model inference, and text-to-speech into a single interface for building production voice agents on Grok Voice. The platform includes built-in telephony, knowledge retrieval, tool/MCP support, guardrails, and observability at $0.05 per minute, supporting 25+ languages.
Ollama v0.31.1: Gemma 4 Nearly 90% Faster on Apple Silicon via MTP
Ollama · Ollama v0.31.1 (June 30) delivers approximately 90% faster Gemma 4 token generation on Apple Silicon via multi-token prediction (MTP), with automatic tuning enabled by default and no configuration required. The release also updates the MLX engine with a new small-batch matrix multiplication kernel and upgrades the llama.cpp backend to build 9840.
OpenCode v1.17.13: Reasoning Mode Fixes and Searchable Model Picker
SST · OpenCode v1.17.13 (July 1) improves core reasoning mode for OpenAI-compatible models and fixes stale response handling in the GitHub Copilot provider. The desktop client gains a searchable model picker, session tab hover previews, and a streamlined WSL server setup flow. Session isolation is improved so a failure on one session page no longer brings down others.
OpenAI Codex v0.142.5: Security Fix for Trace Log WebSocket Exposure
OpenAI · Codex v0.142.5 (July 1) patches a security issue where full Responses WebSocket request payloads could be written to trace logs, potentially exposing sensitive request data including code, file paths, and credentials in locally stored trace files. No user-facing feature changes in this release.
VKontakte Deploys LLM and VLM Models for In-Feed Product Recommendations
VK AI · VK's AI engineers deployed LLM- and VLM-based models into VKontakte's main feed and VK Clips to improve recommendations for creator shop content. The models analyze all user interactions alongside broader interest signals. Results after launch: 5× higher click-through on product cards, 15× more marketplace navigation, and 20× growth in orders from author shops.