When the Model, Not the Hardware, Defines Performance

For years, GPUs have dictated how AI systems are built. Teams size hardware first, then force models to fit inside fixed GPU constraints. Memory limits, static allocations, over-provisioning, and constant retuning have become normal, even though none of it is model-driven.

Rapt changes that.

What Is MDG™ (Model Defined GPUs)?

MDG™ (Model Defined GPUs) is a new way of thinking about GPU infrastructure. Instead of sizing, allocating, and tuning GPUs in advance, MDG lets the model itself define how GPU resources are selected, shaped, and orchestrated in real time. The model becomes the source of truth; the GPU adapts. GPU behavior is determined by the needs of the model, not by static hardware configurations. Introduced and used by Rapt, the term reflects how Rapt’s platform observes workloads, understands model behavior, and orchestrates GPU resources accordingly across:

  • Any GPU

  • Any cloud

  • Any on-prem environment

  • Any model architecture

MDG is not a product SKU or a hardware requirement. It is an operating principle for modern AI infrastructure.

Under an MDG™ approach:

  • Models are observed as they run

  • Their real resource requirements are continuously understood

  • GPU resources are dynamically shaped to match those requirements
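The observe–understand–shape loop above can be pictured as a small control loop. The sketch below is purely illustrative: the data class, heuristic, and field names are assumptions for this article, not Rapt's actual API.

```python
from dataclasses import dataclass

@dataclass
class WorkloadSnapshot:
    """Hypothetical metrics observed from a running model."""
    batch_size: int
    avg_tokens: int
    mem_used_gb: float
    sm_utilization: float  # fraction of streaming multiprocessors busy, 0.0-1.0

def required_allocation(snap: WorkloadSnapshot, headroom: float = 0.15) -> dict:
    """Translate observed model behavior into a target GPU allocation.

    Illustrative heuristic: reserve the observed memory plus a safety
    headroom, and scale the SM share to measured utilization.
    """
    return {
        "mem_gb": snap.mem_used_gb * (1 + headroom),
        "sm_fraction": min(1.0, snap.sm_utilization * (1 + headroom)),
    }

# One iteration of the loop: observe, then compute the new shape.
snap = WorkloadSnapshot(batch_size=8, avg_tokens=512,
                        mem_used_gb=21.0, sm_utilization=0.6)
target = required_allocation(snap)
print(target)
```

In a real orchestrator this loop would run continuously, resizing allocations as the snapshot changes; the point is only that the model's observed behavior, not a pre-deployment guess, drives the numbers.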

Rather than asking, “Which GPU should we rent for this model?”, MDG enables organizations to ask, “What does this model actually need right now?” That distinction matters. MDG treats the model as a living workload with changing characteristics:

  • Batch sizes fluctuate

  • Token lengths vary

  • Memory and compute needs shift over time

  • Inference and training behave differently under load

It enables GPU infrastructure to respond to those realities dynamically, rather than forcing models into rigid boxes. This is not about buying different GPUs; it is about operating GPUs differently. Rapt is the only platform built to do that.
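To make the scale of those swings concrete, consider the KV cache of a decoder-only transformer, whose memory footprint grows with both batch size and sequence length. The model-shape parameters below are illustrative defaults, not tied to any specific model or to Rapt's internals.

```python
def kv_cache_gb(batch: int, seq_len: int, n_layers: int = 32,
                n_heads: int = 32, head_dim: int = 128,
                bytes_per_val: int = 2) -> float:
    """Approximate KV-cache footprint in GB for a decoder-only transformer.

    Factor of 2 covers keys and values; fp16 (2 bytes per value) assumed.
    """
    values = 2 * n_layers * n_heads * head_dim * seq_len * batch
    return values * bytes_per_val / 1e9

# The same model's memory demand varies by orders of magnitude with traffic:
light = kv_cache_gb(batch=1, seq_len=256)    # ~0.13 GB
heavy = kv_cache_gb(batch=16, seq_len=4096)  # ~34 GB
```

A static allocation must be sized for the heavy case and sits mostly idle the rest of the time; an allocation shaped to the observed workload can track the actual curve.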

Why Traditional GPU Infrastructure Breaks Down

Most GPU environments still rely on:

  • Manual GPU sizing decisions made before deployment

  • Fixed memory, core, and SM allocations

  • Trial-and-error tuning during inference and training

  • Expensive over-provisioning to avoid failures

This approach leads to predictable outcomes:

  • Idle GPU capacity sitting unused

  • Models failing under real workloads despite “sufficient” hardware

  • Slow time to production

  • Escalating infrastructure costs

The root issue is not the GPU. It is that the GPU is configured without understanding the model. When GPUs are model-defined, organizations can:

  • Run more AI workloads on the same physical infrastructure

  • Reduce waste caused by static GPU allocations

  • Improve inference and training stability by matching resources to actual demand

  • Move models into production faster by removing manual tuning cycles

MDG makes GPU infrastructure adaptive, workload-aware, and continuously optimized, rather than static and reactive. AI models are becoming more dynamic, more expensive, and more critical to the business. Infrastructure strategies that assume static behavior will continue to break under that pressure. MDG represents a shift toward infrastructure that understands and adapts to the thing that actually matters: the model. The model defines performance. The GPU follows.
