Daily digest
12 items · ~12 min · Week 2026-W25
Must-read (3)
Zhipu AI Open-Sources GLM-5.2 Under MIT License with 1M Token Context
Zhipu AI · Zhipu AI released the open weights of GLM-5.2 on Hugging Face (zai-org/GLM-5.2) under an MIT license around June 16, 2026. The model uses a 753B MoE architecture with a 1-million-token context window, is positioned as coding-first, and offers a dual thinking-effort system. The release carries no regional restrictions.
VibeThinker-3B Reaches Frontier-Level Reasoning Benchmarks via Curriculum RL
WeiboAI · VibeThinker-3B (arXiv 2606.16140, June 15) achieves 94.3 on AIME26 (97.1 with test-time scaling), 80.2 Pass@1 on LiveCodeBench v6, and 96.1% acceptance on unseen LeetCode contests using curriculum SFT, multi-domain RL, and offline self-distillation on a 3B dense model. The authors propose the Parametric Compression-Coverage Hypothesis: reasoning compresses into compact models while broad factual knowledge requires larger parameter counts.
JoyAI-VL-Interaction: Open-Source 8B Real-Time VLM with Autonomous Turn-Taking
JD.com · JoyAI-VL-Interaction (arXiv 2606.14777) is an 8B VLM for continuous real-time video interaction: it watches a live video stream and autonomously decides when to speak or stay silent. Released with training recipe, time-aligned interaction data, and a fully deployable open-source system (pluggable ASR/TTS, memory, background agent API). Human raters preferred it over Doubao and Gemini in-app assistants across six real-world scenarios.
Worth knowing (5)
Alibaba Releases Qwen-RobotSuite: Three Embodied AI Foundation Models
Alibaba / Qwen · Alibaba's Qwen team released Qwen-RobotSuite on June 16–17, 2026: Qwen-RobotManip (VLA for robotic manipulation, trained on 38,100+ hours of data), Qwen-RobotNav (navigation and instruction-following), and Qwen-RobotWorld (world model for physically consistent future states). RobotManip and RobotNav ship with public GitHub repositories.
Anthropic Study: Domain Expertise Drives Agentic Coding Success, Not Programming Background
Anthropic · Anthropic published an analysis of ~400,000 Claude Code sessions from ~235,000 users (Oct 2025–Apr 2026). Domain expertise, not coding background, is the primary predictor of success: expert-rated sessions succeed at 30%+ vs 15% for novices, and non-software professionals (legal, finance, management) succeed at nearly the same rate as engineers. Average task value rose ~27% over 7 months as task scope shifted from debugging toward deployment, data analysis, and document writing.
xAI Launches Grok for PowerPoint as Free Microsoft 365 Add-in
xAI · xAI released a free Microsoft 365 add-in integrating Grok into PowerPoint on June 16. Users can generate full slide decks from text prompts, restructure slides, and apply styling in natural language. The add-in connects to live X and web search and can pull from SharePoint, email, and Google Drive via Grok connectors. PowerPoint is the first Office app; Word and Excel integrations are planned.
vLLM v0.23.0: Model Runner V2 Default for Llama and Mistral, Transformers v5, Multi-Tier KV Cache
vLLM v0.23.0 (June 15, 408 commits, 200 contributors) makes Model Runner V2 the default for Llama and Mistral dense models, adds Transformers v5 compatibility, multi-tier KV cache offloading with object-store secondary tier, a unified reasoning + tool-call parser, Gemma 4 encoder-free support, and Rust frontend gains including streaming generate and dynamic LoRA. Also includes DeepSeek-V4 production hardening and ROCm 7.2.3 / FlashInfer v0.6.12 updates.
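For readers unfamiliar with the idea, multi-tier cache offloading can be illustrated with a toy two-tier store. This is a minimal sketch and not vLLM's implementation or API: the class name `TwoTierCache`, its methods, and the plain dict standing in for an object store are all hypothetical, and the LRU eviction policy is an assumption.

```python
from collections import OrderedDict
from typing import Dict, Optional


class TwoTierCache:
    """Toy two-tier block cache (hypothetical; not vLLM code).

    The primary tier is a small LRU cache. Blocks evicted from it are
    written to a secondary tier, here a dict standing in for an object store.
    """

    def __init__(self, primary_capacity: int) -> None:
        if primary_capacity < 1:
            raise ValueError("primary_capacity must be at least 1")
        self.capacity = primary_capacity
        self.primary: "OrderedDict[str, bytes]" = OrderedDict()
        self.secondary: Dict[str, bytes] = {}

    def put(self, key: str, block: bytes) -> None:
        # A fresh write supersedes any stale copy in the secondary tier.
        self.secondary.pop(key, None)
        self.primary[key] = block
        self.primary.move_to_end(key)
        # Offload least-recently-used blocks once the primary tier is full.
        while len(self.primary) > self.capacity:
            old_key, old_block = self.primary.popitem(last=False)
            self.secondary[old_key] = old_block

    def get(self, key: str) -> Optional[bytes]:
        if key in self.primary:
            self.primary.move_to_end(key)  # mark as recently used
            return self.primary[key]
        if key in self.secondary:
            # Promote the block back into the primary tier on a hit.
            block = self.secondary.pop(key)
            self.put(key, block)
            return block
        return None
```

With a primary capacity of 2, writing three blocks pushes the oldest one to the secondary tier, and reading it back promotes it again at the expense of the next-oldest block.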
xAI Launches Grok Imagine Video 1.5 to General Availability
xAI · xAI moved Grok Imagine Video 1.5 from preview to general availability on June 16, rolling it out on the Imagine API, grok.com, and the mobile apps. The model animates still images into 720p/24fps video with native audio. Video 1.5 Fast generates 6-second clips in ~25 seconds (down from 40+ seconds in v1.0). It previously topped the Image-to-Video Arena leaderboard with a 52-point Elo lead.
For reference (4)
ZPPO: Teacher-in-Prompts Knowledge Distillation Outperforms Gradient Methods for Small Reasoners
NVIDIA · Zone of Proximal Policy Optimization (ZPPO, arXiv 2606.18216) embeds teacher guidance in prompts rather than gradients: it constructs prompts pairing correct teacher responses with incorrect student responses for contrastive learning, and prompts aggregating student errors to surface failure patterns. Tested on 0.8B–9B student models with a 27B teacher, ZPPO outperforms distillation and RL baselines, with strongest gains for smaller models.
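The two prompt types described above might look roughly like the sketch below. It is an illustration only, not the paper's code: the `Attempt` type, both function names, and the template wording are assumptions, and the RL training step that would consume these prompts is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attempt:
    """One response to a question, with a correctness flag from some verifier."""
    text: str
    correct: bool


def contrastive_prompt(question: str, teacher: Attempt, student: Attempt) -> str:
    """Pair a correct teacher response with an incorrect student response."""
    if not teacher.correct or student.correct:
        raise ValueError("need a correct teacher and an incorrect student response")
    return (
        f"Question:\n{question}\n\n"
        f"Reference solution (correct):\n{teacher.text}\n\n"
        f"Earlier attempt (incorrect):\n{student.text}\n\n"
        "Compare the two, then solve the question again."
    )


def error_summary_prompt(question: str, student_attempts: List[Attempt]) -> str:
    """Aggregate the student's incorrect attempts so failure patterns are visible."""
    errors = [a.text for a in student_attempts if not a.correct]
    if not errors:
        raise ValueError("no incorrect attempts to aggregate")
    listed = "\n".join(f"{i}. {e}" for i, e in enumerate(errors, start=1))
    return (
        f"Question:\n{question}\n\n"
        f"Earlier incorrect attempts:\n{listed}\n\n"
        "Identify what these attempts have in common, then solve the question again."
    )
```

In this sketch the teacher never contributes gradients; its output only appears as text inside the student's prompt, which is the distinction the summary draws.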
Google DeepMind and UK Government Partner to Speed Housing Planning with Gemini
Google DeepMind · Google DeepMind announced a partnership with the UK government on June 16 to build an AI prototype for planning officers, targeting a 50% reduction in housing application processing time. Built on Gemini, the tool automates data consolidation, policy identification, feedback summarization, and draft report generation. Trials will run in Barnet, Camden, and Dorset councils before a planned national rollout in 2027.
Ollama v0.30.9: Cohere2Moe Support, Coding Agent Single-Token Output Bug Fixed
Ollama v0.30.9 (June 15) adds Cohere2Moe architecture support, fixes the LFM2 parser for cases where thinking was not emitted, and resolves a bug where coding agents invoked via Ollama output only a single token. Also adds an explicit error when a single message exceeds the context window.
llama.cpp June 16 Builds: Eagle3 Speculative Decoding, Vulkan UMA Memory, NVFP4 Fixes
llama.cpp shipped incremental builds b9660–b9672 on June 16. Notable: Eagle3 speculative decoding backend sampling support (b9669), Vulkan preference for host-visible memory on UMA devices (b9668), NVFP4 edge-case fixes in llama-graph (b9670), SYCL support for Q4_K/Q5_K/Q6_K MoE MUL_MAT_ID (b9664), and BoringSSL vendor update to 0.20260616.0 (b9672).