#scalable-oversight
2 items

May 8 | Automated Weak-to-Strong Researcher: AI Agents Outperform Humans on Alignment Research | Anthropic | research
June 9 | Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight | Rutgers University | research