NVIDIA Unveils the Nemotron 3 Ultra AI Model, Which Enables Agents to Perform at a Lower Cost
June 2, 2026

IBL News | New York
NVIDIA this week unveiled Nemotron 3 Ultra, a new open-source, 550-billion-parameter mixture-of-experts AI model for enterprise workflows, coding, and research, with “up to 5x faster inference and up to 30% lower cost than open frontier models in its class,” according to the company.
The chip company said the model will be released on June 4 on Hugging Face, ModelScope, OpenRouter, and build.nvidia.com as NVIDIA NIM microservices, as well as through a broad ecosystem of NVIDIA Cloud Partners, inference platforms, and cloud service providers.
Verified NVIDIA agent skills are available in the Claude Code plug-in marketplace and the Hermes Skills Hub. NVIDIA also released a major collection of open-source physical AI libraries, skills, models, and frameworks that let AI agents and developers stand up workflows to accelerate the development of robotics, autonomous vehicles, and industrial systems.
The Nemotron 3 Ultra models work with several orchestration frameworks for deploying and coordinating agents, including Hermes Agent, LangChain Deep Agents, OpenClaw, OpenHands, and OpenCode.
These new models and datasets for always-on agents are developed in collaboration with the NVIDIA Nemotron Coalition.
In the Artificial Analysis intelligence ranking, Nemotron 3 Ultra scores 48 points, well ahead of other open U.S. models such as Gemma 4 31B (39), Nemotron 3 Super (36), and gpt-oss-120b (33). It does not reach the top open models from China, though: Kimi K2.6 scores 54 points. The current strongest closed model, Opus 4.8, reaches 61 points.
On the provider DeepInfra, Nemotron 3 Ultra also delivers more than 300 tokens per second, according to Artificial Analysis. Comparably sized models from DeepSeek or Moonshot currently manage only 50 to 100 tokens per second.
Firms like CrowdStrike and Palantir also use these kinds of agents to process complex data, coordinate tasks, and streamline operations across cybersecurity and enterprise environments.
- “CrowdStrike is using NVIDIA Nemotron models for its specialized agents that continuously identify, prioritize, and remediate vulnerabilities and policy misconfigurations, helping stop adversaries faster while reducing the operational burden on security teams.”
- “Palantir is integrating NVIDIA Nemotron models into its AI FDE (Forward Deployed Engineer) platform to autonomously execute complex tasks, enabling continuous learning from agent interactions to build domain-specific, air-gapped enterprise systems.”
Autonomous agents that write code, generate sub-agents, and remember context across sessions can access local files, learn new tools, and execute advanced workflows with increasing independence. The more capable agents become, the more important it is to have guardrails for them to operate within. The critical layer is a runtime with adjustable privacy and security controls that makes autonomous agents safer to deploy at scale.
Canonical will integrate OpenShell with Ubuntu via supported snaps and rocks (aka OCI-compliant containers) to run autonomous agents on enterprise servers worldwide.
Red Hat is integrating OpenShell into its full-stack Red Hat AI platform to maintain infrastructure-level oversight and policy. The company is also making key contributions to the OpenShell upstream open-source project to help standardize the management of agents on enterprise platforms.
Yesterday’s announcements build on recent integrations by SAP, which is embedding OpenShell into Joule Studio runtime (part of SAP Business AI Platform for enterprise AI agents), and ServiceNow, which secured Project Arc, its enterprise autonomous desktop agent, with OpenShell to add policy-based management for enterprise safety.
OpenShell runs in on-premises, hybrid, and enterprise cloud environments, as well as on local devices such as NVIDIA RTX Spark, NVIDIA DGX Spark™, and GB10 systems from system providers, and on NVIDIA DGX Station™ for Windows and NVIDIA DGX Station GB300 systems from NVIDIA partners.
• Huang’s keynote at NVIDIA GTC Taipei.
IBL News is funded by the New York-based, family-owned company ibl.ai. Our stories adhere to the highest ethical standards in journalism and are available to news syndication agencies.








