moe — AI Digest

30 Apr DeepSeek V4: official open-source release with Day-0 adaptation for Huawei Ascend DeepSeek models-llm
2 Jun MiniMax Releases M3: Open-Weight Frontier Model with 1M-Token Context and MSA Architecture MiniMax models-llm
8 Jun NVIDIA Nemotron 3 Ultra: Open 550B MoE Model Now Available for Agentic Workloads NVIDIA models-llm
10 Jun MiniMax M3 Open Weights Released: 1M Context, MoE, Frontier Coding MiniMax models-llm
17 Jun Zhipu AI Open-Sources GLM-5.2 Under MIT License with 1M Token Context Zhipu AI models-llm
19 Jun Zhipu AI Releases GLM-5.2 Open Weights: 753B MoE with 1M-Token Context under MIT License Zhipu AI / Z.ai models-llm
9 May Zyphra Releases ZAYA1-8B: Open Reasoning MoE Model Trained on AMD Hardware Zyphra models-llm
20 May Lance: 3B Unified Multimodal Model for Understanding, Generation, and Editing (314 HF upvotes) ByteDance Research research
4 Jun JetBrains Open-Sources Mellum2: 12B MoE Coding Model for Multi-Model Pipelines JetBrains models-llm
10 Jun Cohere North Mini Code: 30B Apache-2.0 MoE Coding Model for Agentic Workflows Cohere models-llm
16 Jun Kimi K2.7-Code HighSpeed: 6× Throughput for Production Coding Agent Pipelines Moonshot AI models-llm
11 Jun Kwai Keye-VL-2.0: Open-Source 30B MoE Multimodal Model with 256K Context for Long Video Kwai research
14 Jun Moonshot AI Releases Kimi K2.7-Code: 1T-Parameter Open-Weight Coding Model with Vision Moonshot AI models-llm
14 Jun vLLM Adds Day-0 Support for MiniMax M3 Open Weights with 1M-Context Sparse Attention MiniMax tools
14 Jun Zhipu AI Releases GLM-5.2: 744B MoE with 1M-Token Context and Coding-First Design Zhipu AI models-llm
29 Apr Sber unveils Kandinsky 6.0 Image: flagship image generation model Sber image