Kimi K2.7-Code HighSpeed: 6× Throughput for Production Coding Agent Pipelines

Moonshot AI


On June 15, 2026, Moonshot AI announced a HighSpeed variant of Kimi K2.7-Code, rolling out to Kimi Code Beta and Kimi Business users. HighSpeed mode delivers approximately 180 tokens/second on median-length coding inputs and up to 260 tokens/second on shorter tasks, roughly six times the speed of the standard release. The base K2.7-Code model (a 1-trillion-parameter MoE with 32B active parameters and a 256K context window) shipped on June 12, reporting a 21.8% gain on Kimi Code Bench v2 and approximately 30% fewer reasoning tokens than K2.6.

Why it matters

At ~$0.95/M input tokens with open weights available for self-hosting, Kimi K2.7-Code HighSpeed directly targets the throughput bottleneck in production coding-agent pipelines, where token-generation speed limits how many iterations an agent can complete per unit time.
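To make the iteration argument concrete, here is a rough back-of-the-envelope sketch. It uses the announced 180 tokens/second figure; the standard-mode speed is derived from the "roughly six times" claim rather than taken from a published number, and the 2,000 output tokens per agent iteration is a purely hypothetical workload. It also assumes generation is the only cost, ignoring prompt processing, tool calls, and test runs.

```python
def iterations_per_hour(tokens_per_second: float, output_tokens_per_iteration: int) -> float:
    """Upper bound on agent iterations per hour if token generation is the only cost."""
    seconds_per_iteration = output_tokens_per_iteration / tokens_per_second
    return 3600 / seconds_per_iteration


# Announced HighSpeed throughput on median-length coding inputs.
HIGHSPEED_TPS = 180.0
# Assumption: derived from "roughly six times faster"; not a published figure.
STANDARD_TPS = HIGHSPEED_TPS / 6
# Assumption: hypothetical output size of one agent iteration.
TOKENS_PER_ITERATION = 2_000

standard = iterations_per_hour(STANDARD_TPS, TOKENS_PER_ITERATION)
highspeed = iterations_per_hour(HIGHSPEED_TPS, TOKENS_PER_ITERATION)

print(f"standard:  ~{standard:.0f} iterations/hour")
print(f"highspeed: ~{highspeed:.0f} iterations/hour")
```

Under these assumptions the agent goes from about 54 to about 324 iterations per hour. In a real pipeline the gain would be smaller, since time spent on tool execution and prompt processing is not reduced by faster generation.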

Importance: 3/5

A roughly 6× throughput improvement for a top-tier open-weight coding model at sub-dollar input pricing, with direct relevance for agentic software engineering workloads.

kimi moonshot-ai coding moe open-weights china update inference
