IBL News | New York
NVIDIA and French startup Mistral AI this week released Mistral NeMo 12B, a new open-source LLM with 12 billion parameters and a 128,000-token context window.
It is intended for developers who customize and deploy enterprise applications for chatbots, multilingual tasks, coding, and summarization, without requiring extensive cloud resources.
The open model license, Apache 2.0, allows enterprises to integrate Mistral NeMo into commercial applications seamlessly.
“We have developed a model with unprecedented accuracy, flexibility, high efficiency, and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment,” said Guillaume Lample, cofounder and chief scientist of Mistral AI.
Mistral NeMo was trained on the NVIDIA DGX Cloud AI platform and comes packaged as an NVIDIA NIM inference microservice. To advance and optimize the process, NVIDIA TensorRT-LLM, for accelerated inference performance, and the NVIDIA NeMo development platform were also used.
Designed to fit in the memory of a single NVIDIA L40S, NVIDIA GeForce RTX 4090, or NVIDIA RTX 4500 GPU, the Mistral NeMo NIM offers high efficiency, low compute cost, and enhanced security and privacy.
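
To make the single-GPU claim concrete, here is a rough back-of-the-envelope sketch of the memory needed for the model weights alone. Only the 12-billion-parameter figure comes from the article. The bytes-per-parameter values, the GPU memory sizes (nominal spec-sheet figures, assuming "RTX 4500" refers to the 24 GB Ada Generation card), and the helper names `weights_gb` and `fits` are illustrative assumptions. The estimate ignores the KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
# Weights-only memory estimate. This is a rough sketch, not a deployment guide:
# it ignores KV cache, activations, and runtime overhead.

PARAMS = 12e9  # parameter count stated in the article

# Assumption: bytes per parameter for two common precisions (not stated in the article).
BYTES_PER_PARAM = {"16-bit": 2, "8-bit": 1}

# Assumption: nominal memory sizes in GB from public spec sheets (not from the article).
GPU_MEMORY_GB = {"L40S": 48, "GeForce RTX 4090": 24, "RTX 4500": 24}


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Return the size of the model weights in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def fits(params: float, bytes_per_param: int, gpu_gb: float) -> bool:
    """Return True if the weights alone leave some memory free on the GPU."""
    return weights_gb(params, bytes_per_param) < gpu_gb


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        size = weights_gb(PARAMS, nbytes)
        for gpu, capacity in GPU_MEMORY_GB.items():
            verdict = "leaves headroom" if fits(PARAMS, nbytes, capacity) else "no headroom"
            print(f"{precision}: {size:.0f} GB of weights on {gpu} ({capacity} GB): {verdict}")
```

Under these assumptions, 16-bit weights alone would take up the full nominal capacity of the 24 GB cards, which suggests that a lower-precision format is what makes deployment on those GPUs practical. The article itself does not state which precision is used.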