Infrastructure intelligence

The hardware story behind frontier AI

Compute clusters, data centers, GPU supply, power deals, and capex — the physical layer that decides which labs can train tomorrow's frontier and at what unit cost. Editorial briefs synthesize the data; signals below are the atomic facts they're built on.

Editorial briefs

2 published

Brief · May 20, 2026

The CoWoS bottleneck: why GPU supply still gates 2026 model roadmaps

Every Blackwell, Ironwood, and Trainium2 chip flows through one TSMC packaging line. Capacity doubled in 2026 — and every frontier lab is still constrained.

supply-chain · tsmc · packaging · nvidia

Brief · May 15, 2026

Anthropic's two-cloud bet: why Trainium plus TPU changes the math

Custom silicon on two hyperscalers gives Anthropic the lowest unit cost in the frontier — and the biggest single-vendor risk.

anthropic · silicon-economics · aws · google

Signals

4 events


Compute cluster · reported · Apr 22, 2026
DeepSeek operates ~50k H800 GPU fleet despite export controls
DeepSeek's training fleet, believed to be ~50,000 H800 GPUs at peak, was assembled before the October 2023 extension of US export controls. DeepSeek-V4-Pro's training run reportedly used fewer than 8M GPU-hours in total, less than 10% of GPT-5's estimated budget, relying on the hybrid CSA+HCA attention scheme to cut FLOPs.
Accelerators: 50k · H800
Location: Hangzhou, China
DeepSeek · DeepSeek-V4-Pro · SemiAnalysis estimate
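
As a rough sanity check on the figures in this signal, the sketch below converts a GPU-hour budget into wall-clock time on a fleet of a given size. It assumes the whole fleet runs the job, which the source does not claim; the function name and the `utilization` parameter are illustrative, not taken from the source.

```python
def wall_clock_days(gpu_hours: float, gpus: int, utilization: float = 1.0) -> float:
    """Days needed to spend `gpu_hours` on `gpus` accelerators.

    `utilization` is the assumed fraction of the fleet actually busy
    on the job (hypothetical parameter, 0 < utilization <= 1).
    """
    if gpus <= 0:
        raise ValueError("gpus must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours = gpu_hours / (gpus * utilization)
    return hours / 24


# Upper bound from the signal: under 8M GPU-hours on ~50,000 H800s.
full_fleet_days = wall_clock_days(8_000_000, 50_000)
# Same budget if only half the fleet were assigned to the run (assumption).
half_fleet_days = wall_clock_days(8_000_000, 50_000, utilization=0.5)

print(f"full fleet: {full_fleet_days:.1f} days")
print(f"half fleet: {half_fleet_days:.1f} days")
```

At the reported ceiling the budget works out to 160 hours of full-fleet time, so the run was either short or used only part of the fleet for longer.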

Compute cluster · reported · Mar 10, 2026
xAI Colossus 2: targeting 1M GPUs across Memphis + new Mississippi site
Colossus 2, xAI's expansion target, aims for 550k Blackwell-class GPUs (GB200/GB300) in 2026, scaling toward 1M total accelerators by year-end. A second 2 GW campus in DeSoto County, MS, is under construction to host the bulk of the buildout. Power deals have been announced with the Tennessee Valley Authority and Mississippi Power.
Accelerators: 1M · GB200/GB300
Power: 2 GW
Location: Memphis, TN + DeSoto County, MS
xAI · Grok 4 · Bloomberg / company filings

Compute cluster · reported · Dec 3, 2025
Anthropic's Project Rainier: 400k Trainium2 chips across AWS multi-region
Anthropic's primary training cluster, codenamed Project Rainier, runs on a multi-region Trainium2 fleet that AWS built specifically for it. By the end of 2025 the disclosed configuration was roughly 400,000 Trainium2 chips spanning sites in Indiana, Wyoming, and Mississippi. Trainium2's economics are central to Anthropic's ability to sell Sonnet at ~40% of the price of equivalent-tier rivals.
Accelerators: 400k · Trainium2
Location: St Joseph County, IN + Wyoming + Mississippi
Anthropic · Claude 4.7 Opus · AWS re:Invent / Anthropic announcement

Compute cluster · verified · Feb 17, 2025
xAI Colossus: 100k H100 GPUs in 122 days, doubled to 200k with H200s, at the Memphis ex-Electrolux site
xAI brought up its first Memphis training cluster, "Colossus", in 122 days — a build cadence unprecedented in the industry. The initial buildout was 100k H100s; xAI doubled it to 200k by early 2025 by adding H200s. Power was bridged with on-site mobile gas turbines while Tennessee Valley Authority capacity caught up. Colossus trained Grok 3 and is currently training Grok 4 successors.
Accelerators: 200k · H100/H200
Power: 150 MW
Location: Memphis, TN
xAI · Grok 4 · xAI / Nvidia announcement
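
The Power and Accelerators fields of the two xAI signals can be combined into a rough power-per-accelerator ratio. This is a minimal sketch, assuming the listed power figure covers the listed accelerator count. The source does not say so: the 2 GW campus is described only as hosting "the bulk" of Colossus 2, and Colossus power was bridged with on-site turbines. The helper name is hypothetical.

```python
def watts_per_accelerator(power_watts: float, accelerators: int) -> float:
    """Listed site power divided by listed accelerator count."""
    if accelerators <= 0:
        raise ValueError("accelerators must be positive")
    return power_watts / accelerators


# Figures copied from the signal cards; pairing them is an assumption.
colossus = watts_per_accelerator(150e6, 200_000)    # 150 MW, 200k H100/H200
colossus_2 = watts_per_accelerator(2e9, 1_000_000)  # 2 GW, 1M GB200/GB300 target

print(f"Colossus:   {colossus:.0f} W per accelerator")
print(f"Colossus 2: {colossus_2:.0f} W per accelerator")
```

These are ratios of the listed figures, not measured chip power draw, and the two cards may not define "Power" in the same way.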
