Nvidia launches open-source AI inference framework Dynamo

thehindu.com

Nvidia announced a new project called Dynamo at its GTC 2025 conference on March 18. Dynamo is an open-source inference framework designed to improve how generative AI and reasoning models are deployed, aiming to raise performance while lowering costs for AI applications.

Dynamo achieves this by splitting inference work into separate stages, allowing each graphics processing unit (GPU) to handle more work at the same time. It also uses dynamic scheduling to keep GPUs fully utilized and optimizes data transfer between them, resulting in quicker responses.

In Nvidia's tests, the gains were significant: when serving the DeepSeek-R1 671B reasoning model on the company's GB200 NVL72 platform, Dynamo increased the number of requests served by up to 30 times. This makes it an attractive option for AI companies looking to maximize revenue from their hardware.

Dynamo supports major AI inference tools such as PyTorch and NVIDIA TensorRT-LLM, which makes it easier for developers and researchers to integrate it into their artificial intelligence workflows. For businesses wanting extra support, Nvidia plans to include Dynamo with its NIM microservices, helping companies deploy AI faster while keeping their operations secure and stable.
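The idea of "splitting inference work into separate stages" can be sketched in a few lines of Python. This is a conceptual toy, not Dynamo's actual API or architecture: it assumes a hypothetical scheduler that routes requests through two pools of GPUs, one handling prompt processing ("prefill") and one handling token generation ("decode"), so each pool stays busy with the kind of work it does best.

```python
from dataclasses import dataclass
from collections import deque

# Conceptual sketch only -- NOT Dynamo's API. All class and parameter
# names here (DisaggregatedScheduler, prefill_gpus, decode_gpus) are
# hypothetical, invented for illustration.

@dataclass
class Request:
    request_id: int
    prompt_tokens: int
    tokens_generated: int = 0

class DisaggregatedScheduler:
    """Toy scheduler routing requests through two GPU pools."""

    def __init__(self, prefill_gpus: int, decode_gpus: int):
        self.prefill_gpus = prefill_gpus  # size of the prefill pool
        self.decode_gpus = decode_gpus    # size of the decode pool
        self.prefill_queue: deque[Request] = deque()
        self.decode_queue: deque[Request] = deque()

    def submit(self, req: Request) -> None:
        # New requests first need their prompt processed (prefill).
        self.prefill_queue.append(req)

    def step(self, max_tokens: int = 4) -> list[Request]:
        # One scheduling tick: each prefill GPU finishes one prompt and
        # hands the request to the decode pool; each decode GPU emits
        # one token for the request at the head of its queue.
        for _ in range(min(self.prefill_gpus, len(self.prefill_queue))):
            self.decode_queue.append(self.prefill_queue.popleft())
        finished = []
        for _ in range(min(self.decode_gpus, len(self.decode_queue))):
            req = self.decode_queue.popleft()
            req.tokens_generated += 1
            if req.tokens_generated < max_tokens:
                self.decode_queue.append(req)  # still generating
            else:
                finished.append(req)           # response complete
        return finished
```

Because the two pools are independent, a long prompt arriving in the prefill queue never stalls requests that are already generating tokens, which is one reason disaggregating the stages can raise overall throughput.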
