Daily digest
4 items · ~4 min · Week 2026-W25
Worth knowing (1)
GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents
GateMem is a benchmark that evaluates LLM agents deployed in multi-user institutional settings (hospitals, offices, schools) against three competing goals: utility for legitimate requests, role-based access control, and reliable data deletion. Across all current methods tested, none achieves all three properties simultaneously, exposing a critical gap to address before real institutional deployment.
For reference (3)
S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence in VLMs
Nanyang Technological University
S-Agent reframes spatial reasoning in vision-language models as an agentic process: a VLM planner dispatches spatial tools to accumulate evidence across 2D-to-3D projections and time, maintaining scene and agent memory across frames. The approach is training-free for existing models, and a fine-tuned S-Agent-8B matches closed-source models on spatial benchmarks.
llama.cpp b9754: Real-Time Model Load Progress via SSE and PEG Grammar Parser
llama.cpp shipped ~12 tagged builds on June 21, 2026 (b9743–b9754). Key additions: b9747 adds real-time model load progress tracking via /models/sse (Server-Sent Events); b9750 implements the Jinja call statement for template generation; b9754 adds an automaton-based PEG parser for stricter grammar-constrained generation. All builds ship cross-platform binaries for macOS, Linux, Windows, and Android.
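For readers who want to watch the load-progress stream from b9747, here is a minimal sketch of parsing Server-Sent Events in Python. It covers only the `event` and `data` fields of the generic SSE wire format. The digest does not describe what /models/sse actually sends, so the event name and JSON payload in the sample are made up for illustration and are not llama.cpp's real output.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; emit it only if it has data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream: the event name and payload below are assumptions.
sample = [
    "event: progress\n",
    'data: {"progress": 0.5}\n',
    "\n",
    ": keep-alive\n",
    "data: done\n",
    "\n",
]
events = list(parse_sse(sample))
```

To read a live server, the same function could be fed decoded lines from a streaming HTTP response. The host and port depend on how the server was started and are not given here.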
Yandex Adds 30 AI Characters with Distinct Personalities to Alice AI Chat
Yandex
Yandex launched over 30 AI-powered characters with distinct personalities inside the Alice AI chat interface, ranging from bloggers to anime characters, each designed for specific use cases such as emotional support, self-development, or entertainment. Users can also create custom characters by specifying a name and behavior description; characters retain conversation history across sessions and are available on alice.yandex.ru, iOS/Android apps, and Yandex Browser.