#gpu
- SGLang v0.5.11: Speculative Decoding V2 as Default and Eight New Model Architectures tools
- vLLM v0.21.0: Blackwell MLA Backend, HMA KV Offload, Spec Decode for Reasoning Models vLLM Project tools
- vLLM v0.22.0: DeepSeek V4 Production Hardening, Rust Frontend, 28.9% Latency Drop tools
- vLLM v0.20.1: DeepSeek V4 Stabilization on CUDA 13 and PyTorch 2.11 tools