
Inside Our AI-Enhanced NVLink Fabric: 2.4 Petabits of Pure Throughput

November 22, 2024 • IWS Network Engineering • 10 min read

Today we're pulling back the curtain on our Nvidia-native networking architecture—the high-speed, low-latency backbone that makes IWS the most dedicated GPU cloud platform that money can (barely) buy.

The Challenge of Scale

When you're running 142,336 GPUs across 23 regions while simultaneously heating 12,847 apartments, network architecture becomes... complex. Traditional data center networking simply doesn't cut it. You need something more. Something AI-enhanced.

Our Nvidia-Native Approach

Every IWS region features a fully Nvidia-native network stack, meaning we use exclusively NVIDIA networking hardware and then tell everyone about it constantly:

The AI-Enhanced Part

You might be wondering: what makes our network "AI-enhanced"? Excellent question that we were hoping you wouldn't ask.

Our network is AI-enhanced because:

Topology Deep Dive

Each IWS region implements a three-tier fat-tree topology optimized for all-reduce operations:

Tier 1 - GPU Pods: 8 GPUs connected via NVLink in a fully connected mesh. Each pod is a single thermal unit, piping heat to approximately 0.3 apartments.

Tier 2 - SuperPods: 32 pods (256 GPUs) connected via NVSwitch. The NVSwitch generates additional heat, which we route to apartment building common areas.

Tier 3 - HyperPods: 16 SuperPods (4,096 GPUs) connected via InfiniBand. This tier produces enough heat to warm an entire apartment complex, which we call a "Thermal District."
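For anyone checking our math, here is a minimal sketch of the tier arithmetic above. The fan-out numbers come straight from the tier descriptions; the constant and function names are made up for illustration and are not part of any IWS tooling. The link count assumes "fully connected mesh" means one link per GPU pair.

```python
# Fan-out figures quoted in the tier descriptions above.
# All names here are hypothetical, chosen for this sketch only.
GPUS_PER_POD = 8
PODS_PER_SUPERPOD = 32
SUPERPODS_PER_HYPERPOD = 16
APARTMENTS_PER_POD = 0.3  # "approximately 0.3 apartments" per pod


def tier_sizes():
    """Return GPU and pod counts per tier, plus apartments heated per HyperPod."""
    gpus_per_superpod = GPUS_PER_POD * PODS_PER_SUPERPOD
    gpus_per_hyperpod = gpus_per_superpod * SUPERPODS_PER_HYPERPOD
    pods_per_hyperpod = PODS_PER_SUPERPOD * SUPERPODS_PER_HYPERPOD
    return {
        "gpus_per_superpod": gpus_per_superpod,
        "gpus_per_hyperpod": gpus_per_hyperpod,
        "pods_per_hyperpod": pods_per_hyperpod,
        "apartments_per_hyperpod": pods_per_hyperpod * APARTMENTS_PER_POD,
    }


def mesh_links(nodes=GPUS_PER_POD):
    """Pairwise links in a fully connected mesh: n * (n - 1) / 2."""
    return nodes * (nodes - 1) // 2


sizes = tier_sizes()
print(sizes["gpus_per_superpod"], sizes["gpus_per_hyperpod"], mesh_links())
```

Under those assumptions, a single HyperPod contains 512 pods, which at roughly 0.3 apartments each works out to about 154 apartments per Thermal District.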

Latency Optimizations

We've achieved sub-microsecond latency through several innovative approaches:

High Throughput Architecture

Our aggregate throughput of 2.4 Pbps per region is achieved through:

Heat Integration

Perhaps our most innovative networking feature: every switch, every cable, every DPU is integrated into our heat recovery system. Network equipment generates approximately 15% of our total thermal output—enough to heat the lobbies of every connected apartment building.

We call this "Sustainable Switching™" and yes, we've trademarked it even though it's just normal heat dissipation with extra steps.

🔗 Network Stats

Total cable length across all regions: 847 km. Total heat recovered from networking: 2.4 MW. Total buzzwords in this blog post: we stopped counting.
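If you want to sanity-check these numbers against the rest of the post, here is a rough back-of-envelope sketch. It uses only figures quoted in this post and leans on three simplifying assumptions that we are not claiming hold in production: GPUs and cable are spread evenly across the 23 regions, all networking heat is recovered, and the 15% networking share applies fleet-wide. The names are illustrative only.

```python
# Figures quoted in this post; names are hypothetical, for this sketch only.
TOTAL_GPUS = 142_336
REGIONS = 23
REGION_THROUGHPUT_BPS = 2.4e15   # 2.4 Pbps aggregate per region
CABLE_KM_TOTAL = 847
NETWORK_HEAT_MW = 2.4
NETWORK_HEAT_SHARE = 0.15        # networking is ~15% of total thermal output


def back_of_envelope():
    """Derive per-region and per-GPU averages under an even-spread assumption."""
    gpus_per_region = TOTAL_GPUS / REGIONS
    return {
        # Average share of regional throughput per GPU, in Gbps.
        "gbps_per_gpu": REGION_THROUGHPUT_BPS / gpus_per_region / 1e9,
        # Average cable length per region, in km.
        "cable_km_per_region": CABLE_KM_TOTAL / REGIONS,
        # Total thermal output implied if recovered heat equals generated heat.
        "implied_total_heat_mw": NETWORK_HEAT_MW / NETWORK_HEAT_SHARE,
    }


for name, value in back_of_envelope().items():
    print(f"{name}: {value:.1f}")
```

With those assumptions, the figures imply roughly 388 Gbps of aggregate throughput per GPU, about 37 km of cable per region, and around 16 MW of total thermal output.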