IBM Granite Embedding Multilingual R2: 32K Context and Best Sub-100M Retrieval
IBM
IBM released two new open embedding models: granite-embedding-311m-multilingual-r2 (MTEB Multilingual 65.2) and granite-embedding-97m-multilingual-r2 (60.3, the best score among sub-100M-parameter models). Both support a 32,768-token context window (64x that of R1), 200+ languages, and 9 programming languages. Built on ModernBERT with Flash Attention 2.0. Apache 2.0 license; ONNX and OpenVINO weights included.
Why it matters
The 32K context closes a critical gap for long-document retrieval in RAG pipelines, where long documents previously had to be split into many small chunks before embedding. The sub-100M model's performance makes on-device embedding feasible without sacrificing quality, and the Apache 2.0 license removes licensing barriers to commercial use.
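To make the long-document point concrete, here is a rough sketch of how many chunks a document needs under each context window. The 512-token R1 window is not stated in the release; it is inferred from the 64x figure (32,768 / 64). The 20,000-token document length and the `chunks_needed` helper are hypothetical, chosen only for illustration.

```python
import math

R2_CONTEXT_TOKENS = 32_768                   # from the release summary
R1_CONTEXT_TOKENS = R2_CONTEXT_TOKENS // 64  # 512, inferred from the "64x" claim


def chunks_needed(doc_tokens: int, window: int, overlap: int = 0) -> int:
    """Count the sliding windows needed to cover a document.

    Hypothetical helper: each chunk holds `window` tokens and shares
    `overlap` tokens with the previous chunk.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in the range [0, window)")
    if doc_tokens <= window:
        return 1
    stride = window - overlap
    # One full first window, then enough strides to cover the remainder.
    return 1 + math.ceil((doc_tokens - window) / stride)


doc_tokens = 20_000  # assumed length of one long document

r1_chunks = chunks_needed(doc_tokens, R1_CONTEXT_TOKENS)
r2_chunks = chunks_needed(doc_tokens, R2_CONTEXT_TOKENS)

print(f"R1-sized window: {r1_chunks} chunks")  # 40 chunks
print(f"R2-sized window: {r2_chunks} chunk")   # 1 chunk
```

Under these assumptions, a document that once required 40 separate embeddings fits in a single pass, so the retriever can index one vector per document instead of stitching chunk-level results back together.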
Importance: 3/5
Best-in-class sub-100M multilingual embeddings with a 64x context expansion; fully open under Apache 2.0.