AI Digest

A daily roundup of significant releases and events in AI, with an emphasis on source verifiability.

Baidu releases ERNIE-5.1-Preview — #1 Chinese model on LMArena

Baidu
Models / LLM official + media 2 src. ~1 min

On April 30, 2026, Baidu unveiled ERNIE-5.1-Preview. The model debuted at #13 on the global LMArena Text Arena leaderboard with a score of 1476, becoming the top-ranked Chinese model and overtaking DeepSeek-V4-Pro. According to Baidu, the model uses roughly one-third the total parameters and half the active parameters of ERNIE-5.0, at approximately 6% of the pre-training cost of comparable models. The full ERNIE 5.1 release is expected at the Baidu Create conference.

Why it matters
Confirms the sharp acceleration of the Chinese race following DeepSeek V4: Baidu claims leadership among Chinese labs on LMArena at substantially lower training cost.

OpenAI Codex CLI 0.128.0 — persisted /goal workflows and expanded permission profiles

OpenAI
Tools official 2 src. ~1 min

OpenAI shipped a stable release of Codex CLI v0.128.0 following a series of 0.126.x alphas. The headline feature is persisted /goal workflows: long-running goals are stored via the app-server API, exposed as model tools, support runtime continuation, and have dedicated TUI controls. Permission profiles have been expanded with built-in defaults and sandbox-profile selection directly from the CLI; the --full-auto flag is deprecated in favor of explicit permission profiles. Plugin workflows are improved (marketplace install, remote-bundle cache), and external-agent session import with background import has been added. MultiAgentV2 gained configurable thread caps and wait times.

Why it matters
Persisted /goal turns Codex CLI from a stateless helper into a platform for long-lived autonomous tasks, competing with Claude Code and Cursor for background agents.

AutoResearchBench — a benchmark for autonomous scientific literature search by AI agents

BAAI
Research official + media 2 src. ~1 min

BAAI published a new benchmark for evaluating agents on autonomous scientific literature search and review. It includes two complementary setups: Deep Research (a multi-step investigation leading to a specific target paper) and Wide Research (exhaustive collection of publications matching given criteria, scored by intersection-over-union, IoU). Even the strongest LLM agents reach only 9.39% accuracy on Deep Research and 9.31% IoU on Wide Research.
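The Wide Research score is standard set IoU over collected papers. A minimal sketch of how such a score is computed; the function name and paper-ID scheme are illustrative, not the benchmark's own scorer:

```python
def wide_research_iou(retrieved: set[str], gold: set[str]) -> float:
    """IoU between the papers an agent collected and the gold set."""
    if not retrieved and not gold:
        return 1.0  # both empty: treat as perfect agreement by convention
    return len(retrieved & gold) / len(retrieved | gold)

# An agent that finds 3 of 5 gold papers plus 1 off-target paper:
score = wide_research_iou(
    {"paper-001", "paper-002", "paper-003", "paper-099"},
    {"paper-001", "paper-002", "paper-003", "paper-004", "paper-005"},
)
# 3 shared papers / 6 papers in the union = 0.5
```

Under this metric, both missed gold papers and irrelevant extras in the retrieved set pull the score down, which is why exhaustive-collection tasks are so punishing for current agents.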

Why it matters
Closes a methodological gap between general-purpose web agents and the actual work of a researcher; the ~9% figures set a baseline against which progress on research agents can be measured throughout 2026.
Full issue →

DeepSeek V4: official open-source release with Day-0 adaptation for Huawei Ascend

DeepSeek
Models / LLM official + media 5 src. ~1 min

DeepSeek officially released the V4 lineup as open source under the MIT license on April 29. It includes DeepSeek-V4-Pro at 1.6T parameters (49B active) and DeepSeek-V4 at 284B (13B active) — both MoE models with native 1M token context. The release claims roughly a 9.5x reduction in memory requirements versus V3.2 and a near-closed gap with frontier closed models on reasoning benchmarks. A defining feature of the release is optimization for Chinese accelerators: Huawei Ascend, Cambricon, Hygon, and Moore Threads completed Day-0 adaptation on release day, with multi-deploy on Ascend 950 expected in the second half of the year.

Why it matters
The first major frontier open-weights release purpose-built for Ascend rather than Nvidia — an infrastructure shift for the Chinese AI stack and a signal that US export restrictions have accelerated the formation of a self-sufficient inference ecosystem.

GLM-5V-Turbo: a natively multimodal foundation model for agents

Z.ai
Research official + media 2 src. ~1 min

Z.ai unveiled GLM-5V-Turbo, a multimodal foundation model in which visual perception is embedded as a first-class component of reasoning, planning, and tool use rather than bolted on after the fact. The model handles images, video, web pages, and documents; the authors report gains on multimodal coding, visual tool use, and agent tasks while preserving text-only quality. The role of end-to-end verification of agent trajectories during training is emphasized.

Why it matters
One of the most-hyped releases of the week on HF Daily — 2.28k upvotes. A bid for a natively multimodal agent (rather than a VLM with tacked-on tool use) — a direction in which Z.ai is systematically competing with GPT-5 and Gemini.

Yandex Commerce Protocol: first retailers launch sales via Alice AI

Yandex
Industry official + media 5 src. ~1 min

Yandex disclosed the first partners of the Yandex Commerce Protocol (YCP) — a standard for integrating online stores with AI scenarios in Alice AI, Search, and Yandex Ritm. Stockmann, restore:, the pharmacy chains Gorzdrav and 36.6, telecom operator Beeline, the brand The Act, and a number of other retailers are going live with sales directly from chat with Alice AI; over 200 large online retailers and brands have begun YCP integration, and more than 1,600 additional stores have applied. The technology lets shoppers proceed to checkout directly from the assistant dialog without visiting the merchant's website — Alice AI acts as a transactional AI agent on top of partner catalogs.

Why it matters
YCP is Yandex's bid to be the AI-commerce standard in the Russian-language internet and one of the first large-scale launches of an LLM assistant as a direct sales channel in Russia. If the protocol catches on, it shifts the role of voice and chat assistants from informational to transactional.
Full issue →

Anthropic launches Claude for Creative Work with connectors to Adobe, Blender, Ableton

Anthropic
Tools official + media 4 src. ~1 min

Anthropic announced the Claude for Creative Work bundle — nine official connectors that let Claude work directly with Adobe Creative Cloud, Blender, Autodesk Fusion, Ableton Live/Push, Affinity by Canva, Resolume, SketchUp, and Splice. In parallel, Anthropic Labs launched a new product, Claude Design, for rapid visual prototyping, and announced education programs with RISD, Ringling, and Goldsmiths.

Why it matters
Anthropic is moving beyond the "code and text assistant" niche into professional creative pipelines — for the first time, a major frontier lab gets an official place inside Adobe and Blender.

OpenAI brings GPT-5.5, Codex, and Managed Agents to Amazon Bedrock

OpenAI
Industry official + media 6 src. ~1 min

AWS and OpenAI expanded their partnership and launched three offerings on Amazon Bedrock in limited preview: OpenAI's frontier models (GPT-5.5 and GPT-5.4), the Codex agent with CLI/desktop/VS Code support, and Bedrock Managed Agents based on OpenAI. GA is promised within weeks; the models are integrated with IAM, PrivateLink, guardrails, and CloudTrail.

Why it matters
The release came a day after the end of OpenAI's exclusivity with Microsoft and effectively makes Bedrock a second full-fledged distribution channel for OpenAI's frontier models in the enterprise.

Mistral releases Medium 3.5 — 128B dense, 256k context, open weights

Mistral
Models / LLM official + media 5 src. ~1 min

Mistral AI introduced Mistral Medium 3.5 — a flagship dense model with 128B parameters, 256k context, and switchable reasoning effort. Weights are open under a modified MIT license and available on Hugging Face. In parallel, the company launched remote agents in Vibe (cloud coding sessions with CLI and "teleportation" of a local session into the cloud) and a Work mode in Le Chat for multi-step tasks. Claimed scores: 77.6% on SWE-Bench Verified and 91.4% on τ³-Telecom; API pricing is $1.5/$7.5 per million tokens.
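At the quoted API pricing ($1.5 per million input tokens, $7.5 per million output tokens), per-request cost is simple arithmetic. A back-of-the-envelope sketch; the token counts in the example are illustrative:

```python
INPUT_USD_PER_MTOK = 1.5   # quoted price per million input tokens
OUTPUT_USD_PER_MTOK = 7.5  # quoted price per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single API call at the quoted rates."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

# A long-context coding request: 200k tokens in, 8k tokens out.
cost = request_cost(200_000, 8_000)
# 0.30 USD for input + 0.06 USD for output = 0.36 USD per request
```

The asymmetric pricing means output-heavy workloads (long reasoning traces, code generation) dominate the bill even when the 256k context is mostly filled with input.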

Why it matters
Mistral returns to the frontier with a cheap open-weight model on par with Claude Sonnet 4.5 in coding, while also offering its own analog of Codex/Claude Code — the strongest European release of spring 2026.
Full issue →

Firefly AI Assistant — Public Beta

Adobe
Image official + media 3 src. ~1 min

On April 27, Adobe launched the global public beta of an AI assistant that orchestrates multi-step creative workflows across 60+ Creative Cloud tools via chat prompts; the beta includes Creative Skills and integration with partner models (GPT Image 2, Veo 3.1, Runway Gen-4.5, ElevenLabs Multilingual v2).

Sora — final shutdown

OpenAI
Video official 1 src. ~1 min

On April 26, the Sora web and mobile apps were permanently shut down; the API will be discontinued on September 24, 2026.

Full issue →

Microsoft–OpenAI restructuring

Microsoft / OpenAI
Industry official + media 3 src. ~1 min

End of cloud exclusivity: OpenAI can sell products via AWS/Google Cloud; Microsoft license becomes non-exclusive. Microsoft remains the primary cloud partner and stops paying OpenAI a revenue share. OpenAI continues sharing revenue with Microsoft through 2030; IP license runs through 2032.

OpenClaw 2026.4.25

OpenClaw
Tools official 2 src. ~1 min

TTS expansion (`/tts latest`, new providers including Azure Speech). Plugin registry moved to cold storage for faster startup. Expanded OpenTelemetry monitoring. Calendar versioning `YYYY.M.D`.

Full issue →