IBM Granite Embedding Multilingual R2: 32K Context and Best Sub-100M Retrieval

IBM

Tools · official · 1 source · ~1 min read

IBM released two new open embedding models: granite-embedding-311m-multilingual-r2 (MTEB Multilingual 65.2) and granite-embedding-97m-multilingual-r2 (60.3, the best score among sub-100M models). Both support a 32,768-token context window (64x that of R1), 200+ languages, and 9 programming languages. Both are built on ModernBERT with Flash Attention 2.0 and released under the Apache 2.0 license, with ONNX and OpenVINO weights included.
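A minimal sketch of how one of these models might be used for retrieval. It assumes the weights load through the sentence-transformers library and are published under the `ibm-granite/` organization on Hugging Face; the item gives only the bare model names, so the full repo id is an assumption. The helper names `embed`, `cosine`, and `rank` are illustrative.

```python
import math

# Assumption: the repo id prefix "ibm-granite/" is not stated in the item above.
MODEL_ID = "ibm-granite/granite-embedding-97m-multilingual-r2"


def embed(texts, model_id=MODEL_ID):
    """Return one embedding (list of floats) per input text."""
    # Imported lazily so the ranking helpers below work without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(texts).tolist()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

In use, something like `vecs = embed([query] + docs)` followed by `rank(vecs[0], vecs[1:])` would give the retrieval order; the first call downloads the weights.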

Why it matters

A 32K context window lets long documents be embedded whole or in a few large chunks, closing a critical gap for long-document retrieval in RAG pipelines. The sub-100M model makes on-device embedding feasible at a modest quality cost (60.3 vs. 65.2 on MTEB Multilingual), and the Apache 2.0 license removes barriers to commercial use.
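To make the long-document point concrete, here is a rough sketch of the chunking arithmetic. The 512-token R1 window is derived from the figures above (32,768 / 64). The function name `chunks_needed` is illustrative, and the count ignores chunk overlap and special tokens.

```python
import math

R2_CONTEXT = 32_768            # tokens, as stated for both R2 models
R1_CONTEXT = R2_CONTEXT // 64  # 512 tokens, derived from the 64x figure


def chunks_needed(num_tokens, window):
    """Minimum number of non-overlapping chunks to cover num_tokens."""
    return max(1, math.ceil(num_tokens / window))


# A hypothetical 20,000-token document:
r1_chunks = chunks_needed(20_000, R1_CONTEXT)  # 40 chunks
r2_chunks = chunks_needed(20_000, R2_CONTEXT)  # 1 chunk
```

Fewer chunks means fewer vectors to index per document and less risk of splitting a passage away from the context it depends on.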

Importance: 3/5

Best-in-class sub-100M multilingual embeddings with a 64x larger context window; fully open under Apache 2.0.

embeddings multilingual rag open-source

Sources

official Granite Embedding Multilingual R2 — HuggingFace Blog