Nvidia Unveils NIM: A New Tool for Effortless Deployment of AI Models into Production

Nvidia made a significant announcement at its recent GTC conference: it is launching Nvidia NIM, a software platform built to streamline the deployment of custom and pre-trained AI models into production environments. NIM takes the software work Nvidia has done around inferencing and model optimization and makes it accessible by combining a given model with an optimized inferencing engine and packaging everything into a container, which can then be deployed as a microservice.
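
To make the microservice idea concrete, here is a minimal sketch of querying such a container once it is running. It assumes, purely for illustration, that the container exposes an OpenAI-compatible API on local port 8000 and serves a Llama 3 model; none of these specifics are confirmed by the announcement.

    # Minimal sketch: querying a NIM-style container over an OpenAI-compatible API.
    # The endpoint URL, port, and model name below are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # hypothetical local endpoint
        api_key="not-needed-locally",         # a local container may not check this
    )

    response = client.chat.completions.create(
        model="meta/llama3-8b-instruct",      # illustrative model name
        messages=[{"role": "user", "content": "Summarize what NIM does."}],
        max_tokens=128,
    )
    print(response.choices[0].message.content)

The appeal of this packaging is that the application only ever sees an HTTP API, while the model weights and the optimized inference engine remain an implementation detail inside the container.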

According to Nvidia, building and deploying these kinds of containers on their own can take developers weeks or months, and that assumes a company even has the in-house AI talent to do so. With NIM, Nvidia aims to foster an ecosystem of AI-ready containers that use its hardware as the foundational layer, with these curated microservices as the core software layer for businesses looking to accelerate their AI initiatives.

In its current state, NIM supports models from Nvidia, AI21, Adept, Cohere, Getty Images, and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI. To make these NIM microservices widely accessible, Nvidia is partnering with Amazon, Google, and Microsoft to bring NIM to platforms like SageMaker, Google Kubernetes Engine, and Azure AI. Plans are also in place to integrate NIM into frameworks such as Deepset, LangChain, and LlamaIndex.
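
As a sketch of what such a framework integration could look like, the snippet below calls a NIM-served model through LangChain's Nvidia connector. The langchain_nvidia_ai_endpoints package and its ChatNVIDIA class exist in LangChain's ecosystem, but the endpoint URL and model name remain illustrative assumptions.

    # Illustrative sketch: using a NIM-served model from LangChain.
    # Assumes `pip install langchain-nvidia-ai-endpoints` and a container
    # reachable at the hypothetical base_url below.
    from langchain_nvidia_ai_endpoints import ChatNVIDIA

    llm = ChatNVIDIA(
        model="meta/llama3-8b-instruct",      # illustrative model name
        base_url="http://localhost:8000/v1",  # hypothetical local NIM endpoint
    )

    print(llm.invoke("What does an inference microservice do?").content)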

Image Credits: Nvidia

“We believe that the Nvidia GPU is the best place to run inference of these models on […], and we believe that NVIDIA NIM is the best software package, the best runtime, for developers to build on top of so that they can focus on the enterprise applications — and just let Nvidia do the work to produce these models for them in the most efficient, enterprise-grade manner, so that they can just do the rest of their work,” said Manuvir Das, the head of enterprise computing at Nvidia, during a press conference ahead of today’s announcements.

As for the inference engine, Nvidia plans to use the Triton Inference Server, TensorRT, and TensorRT-LLM. Nvidia microservices that will be available through NIM include Riva for customizing speech and translation models, cuOpt for routing optimizations, and the Earth-2 model for weather and climate simulations.
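
To place these pieces: Triton is the serving layer that NIM wraps, and it can also be addressed directly with Nvidia's tritonclient package. The sketch below, which assumes a Triton server on its default HTTP port and a hypothetical model named my_model with a single FP32 input, shows roughly the plumbing NIM abstracts away.

    # Sketch of talking to Triton Inference Server directly, i.e. the layer
    # NIM packages up. The model name and tensor names/shapes ("my_model",
    # "INPUT0", "OUTPUT0", 1x16) are hypothetical placeholders.
    import numpy as np
    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")
    assert client.is_server_live()  # basic health check

    # Build a request for a model expecting one 1x16 FP32 tensor.
    inp = httpclient.InferInput("INPUT0", [1, 16], "FP32")
    inp.set_data_from_numpy(np.ones((1, 16), dtype=np.float32))

    result = client.infer(model_name="my_model", inputs=[inp])
    print(result.as_numpy("OUTPUT0"))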

Nvidia plans to expand NIM’s capabilities over time. That includes making the Nvidia RAG LLM operator available via NIM, which is expected to simplify the creation of AI chatbots that can pull in custom data.
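
For readers unfamiliar with the pattern, retrieval-augmented generation (RAG) means fetching relevant custom documents at query time and handing them to the model alongside the question. Nvidia has not detailed the operator here, so the following is only a generic sketch of that pattern, reusing the hypothetical local endpoint from the earlier examples.

    # Generic RAG sketch (not Nvidia's RAG LLM operator, whose internals
    # aren't public here). Endpoint and model names are placeholders.
    import numpy as np
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

    docs = [
        "NIM packages a model and an optimized inference engine in a container.",
        "Triton Inference Server serves models over HTTP and gRPC.",
    ]

    def embed(texts):
        out = client.embeddings.create(model="nvidia/embedding-model", input=texts)
        return np.array([d.embedding for d in out.data])

    question = "How does NIM ship models?"
    doc_vecs, q_vec = embed(docs), embed([question])[0]
    best = docs[int(np.argmax(doc_vecs @ q_vec))]  # nearest document by dot product

    reply = client.chat.completions.create(
        model="meta/llama3-8b-instruct",
        messages=[{"role": "user",
                   "content": f"Context: {best}\n\nQuestion: {question}"}],
    )
    print(reply.choices[0].message.content)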

No developer conference would be complete without customer and partner announcements. Companies currently making use of NIM include Box, Cloudera, Cohesity, DataStax, Dropbox, and NetApp.

“Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” said Jensen Huang, founder and CEO of NVIDIA. “Created with our partner ecosystem, these containerized AI microservices are the building blocks for enterprises in every industry to become AI companies.”
