Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality
IBM has released Granite Embedding Multilingual R2, an open-source multilingual embedding model with a 32,000-token context window. The model, available under the Apache 2.0 license, achieves state-of-the-art retrieval quality among models with fewer than 100 million parameters.
The model is designed to handle a wide range of languages efficiently.
Its 32,000-token context window lets it embed much longer passages of text at once. That matters for retrieval because the meaning of a passage often depends on surrounding context that a shorter window would cut off.
Because it is released under the Apache 2.0 license, Granite R2 can be freely used, modified, and redistributed, which lowers the barrier to adoption and to collaborative development within the AI community. IBM presents this open-source approach as part of its commitment to fostering innovation.
Granite R2 sets a new benchmark for retrieval quality among models with fewer than 100 million parameters. That combination of small size and strong retrieval performance makes it a good fit for applications that need high-precision information retrieval in multilingual environments.
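For readers unfamiliar with how an embedding model is used for retrieval, the following is a minimal sketch, not IBM's implementation. It assumes the model has already turned a query and each document into a vector, and it ranks documents by cosine similarity to the query. The three-dimensional vectors, the document ids, and the function names are hypothetical placeholders chosen for illustration; real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document ids ordered from most to least similar to the query."""
    scores = {
        doc_id: cosine_similarity(query_vec, vec)
        for doc_id, vec in doc_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
doc_vecs = {
    "doc_en": [0.9, 0.1, 0.0],
    "doc_de": [0.8, 0.2, 0.1],
    "doc_other": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.0, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
print(ranking)
```

In a multilingual setting, the point of a shared embedding space is that documents in different languages about the same topic should land close to the query vector, so the same ranking step works regardless of language.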
