training — AI Digest

11 May Mean Mode Screaming: Training Pathology Fix Enables 1000-Layer Diffusion Transformers research
24 Jun Prime Intellect Releases prime-rl v0.6.0 for Agentic RL on Trillion-Parameter MoE Models Prime Intellect research
8 May Model Spec Midtraining: How Normative Self-Knowledge Improves Alignment Generalization Anthropic research
25 Jun Quantized Reasoning Models Think They Need to Think Longer, but They Do Not Meta research
3 Jun TrOPD: Trust-Region On-Policy Distillation Stabilizes LLM Training When Teacher-Student Gap Is Large Samsung Research research
12 Jun FORT-Searcher: Shortcut-Resistant Training Data Framework for Deep Search Agents research
25 Jun DomainShuttle: Subject-Driven Text-to-Video Across In-Domain and Cross-Domain Scenarios research
3 Jun QUBRIC: Co-Designing Queries and Rubrics Extends RLVR to Open-Ended Reasoning Domains research
17 Jun ZPPO: Teacher-in-Prompts Knowledge Distillation Outperforms Gradient Methods for Small Reasoners NVIDIA research