Daily digest
5 items · ~5 min · Week 2026-W18
Worth knowing (2)
Baidu releases ERNIE-5.1-Preview — #1 Chinese model on LMArena
Baidu · On April 30, 2026, Baidu unveiled ERNIE-5.1-Preview, a preview of its upcoming ERNIE 5.1 model. It debuted at #13 on the global LMArena Text Arena leaderboard with a score of 1476, making it the top-ranked Chinese model, ahead of DeepSeek-V4-Pro. According to Baidu, the model uses roughly one-third of the total parameters and half the active parameters of ERNIE-5.0, and its pre-training cost was approximately 6% of that of comparable models. The full ERNIE 5.1 release is expected at the Baidu Create conference.
OpenAI Codex CLI 0.128.0 — persisted /goal workflows and expanded permission profiles
OpenAI · OpenAI shipped a stable release of Codex CLI v0.128.0 following a series of 0.126.x alphas. The headline feature is persisted /goal workflows: long-running goals are stored via the app-server API, exposed as model tools, support runtime continuation, and have dedicated TUI controls. Permission profiles have been expanded with built-in defaults and sandbox-profile selection directly from the CLI; the --full-auto flag is deprecated in favor of explicit permission profiles. Plugin workflows are improved (marketplace install, remote-bundle cache), and external-agent session import, including background import, has been added. MultiAgentV2 gained configurable thread caps and wait time.
For reference (3)
AutoResearchBench — a benchmark for autonomous scientific literature search by AI agents
BAAI · AutoResearchBench is a new benchmark for evaluating agents on autonomous scientific literature search and review. It includes two complementary setups: Deep Research (a multi-step investigation that leads to one specific target paper) and Wide Research (exhaustive collection of publications matching given criteria, scored by intersection over union, IoU). Even the strongest LLM agents reach only 9.39% accuracy on Deep Research and 9.31% IoU on Wide Research.
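To make the Wide Research metric concrete, here is a minimal sketch of set-based IoU: the number of papers shared by the agent's result and the reference set, divided by the number of distinct papers across both. The function name, the paper identifiers, and the choice to return 0.0 when both sets are empty are illustrative assumptions, not details taken from the benchmark.

```python
def iou(predicted: set[str], relevant: set[str]) -> float:
    """Intersection over union of two sets of paper identifiers."""
    union = predicted | relevant
    if not union:
        # Assumption: score an empty prediction against an empty reference as 0.0.
        return 0.0
    return len(predicted & relevant) / len(union)


# Hypothetical identifiers, for illustration only.
found = {"paper-a", "paper-b", "paper-c"}
gold = {"paper-b", "paper-c", "paper-d", "paper-e"}

# 2 shared papers out of 5 distinct papers overall.
score = iou(found, gold)
```

Because the union sits in the denominator, the score drops both when relevant papers are missed and when irrelevant ones are returned.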
Claude Code 2.1.126 — project purge, model picker via gateway, security fixes
Anthropic · Anthropic released Claude Code 2.1.126. A new `claude project purge [path]` command deletes all state for a project (transcripts, tasks, file history, config). The model picker now pulls the model list from a compatible gateway's /v1/models endpoint when ANTHROPIC_BASE_URL is set. The --dangerously-skip-permissions flag now actually bypasses confirmation prompts for writes to protected paths (.claude/, .git/, .vscode/). Regressions in allowManagedDomainsOnly/allowManagedReadPathsOnly have been fixed, and images larger than 2000px are now automatically downscaled on paste.
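As a rough illustration of the gateway lookup, the sketch below builds a /v1/models URL from ANTHROPIC_BASE_URL and extracts model IDs from a JSON response. It is not Claude Code's implementation. The helper names, the placeholder fallback URL, and the response shape (a top-level "data" list of objects with an "id" field) are assumptions about what a compatible gateway might return.

```python
import json
import os


def models_url(base_url: str) -> str:
    """Join the gateway base URL with the /v1/models path."""
    return base_url.rstrip("/") + "/v1/models"


def parse_model_ids(body: str) -> list[str]:
    """Extract model IDs, assuming a {"data": [{"id": ...}, ...]} payload."""
    payload = json.loads(body)
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


# The fallback URL is a placeholder, not a real gateway.
base = os.environ.get("ANTHROPIC_BASE_URL", "https://gateway.example.invalid")
url = models_url(base)

# Hypothetical response body, for illustration only.
sample = '{"data": [{"id": "model-a"}, {"id": "model-b"}]}'
ids = parse_model_ids(sample)
```

A real client would fetch `url` with whatever authentication the gateway requires before passing the response body to `parse_model_ids`.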
OpenCode v1.14.31 — interactive Azure setup and permission inheritance for task sessions
SST · SST released opencode v1.14.31 (May 1, 2026). It adds an interactive Azure setup that prompts for the resource name and saves the API key. Task child sessions now inherit permissions from the parent session. Invalid remote MCP URLs now produce clearer errors. A Desktop app crash when restoring sessions with missing models has been fixed.