DeepSeek

www.deepseek.com · 4 models · 0 agents

Infrastructure intelligence

Compute deals, data centers, silicon, and capex that shape DeepSeek's training and inference economics.

Compute cluster · Apr 22, 2026 · reported
DeepSeek operates ~50k H800 GPU fleet despite export controls
DeepSeek's training fleet, believed to be ~50,000 H800 GPUs at peak, was assembled before the October 2023 extension of US export controls. The DeepSeek-V4-Pro training run reportedly used fewer than 8M GPU-hours in total, less than 10% of GPT-5's estimated budget, relying on the hybrid CSA+HCA attention scheme to cut FLOPs.
Accelerators: ~50k H800
Location: Hangzhou, China
SemiAnalysis estimate
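
To put the reported figures in proportion, the sketch below converts a GPU-hour budget into wall-clock time on a fleet of a given size. It takes the 8M GPU-hour upper bound and the ~50,000-GPU peak fleet from the entry above. It also assumes the whole fleet was dedicated to the run, which the source does not claim. The function name and the `utilization` parameter are illustrative, not part of any published methodology.

```python
def wall_clock_days(gpu_hours: float, gpu_count: int, utilization: float = 1.0) -> float:
    """Days needed to spend `gpu_hours` on `gpu_count` GPUs.

    `utilization` is the hypothetical fraction of the fleet (0-1] actually
    working on the run; 1.0 means every GPU is busy the whole time.
    """
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours = gpu_hours / (gpu_count * utilization)
    return hours / 24


# Upper bound from the entry: <8M GPU-hours on a ~50k-GPU fleet.
# Assumption: the entire fleet runs the job, which the source does not state.
full_fleet = wall_clock_days(8_000_000, 50_000)
print(f"{full_fleet:.1f} days at full-fleet use")  # 6.7 days

# Same budget if only a quarter of the fleet were assigned to the run.
quarter_fleet = wall_clock_days(8_000_000, 50_000, utilization=0.25)
print(f"{quarter_fleet:.1f} days at 25% of the fleet")  # 26.7 days
```

Under these assumptions the budget is about a week of full-fleet time, so the run would occupy only a small share of a year's fleet capacity.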

Models

DeepSeek-V4-Flash

DeepSeek

released 2026-04-22

Smaller, faster sibling to DeepSeek-V4-Pro, with the same 1M-token context window and a much lighter MoE (284B total parameters, 13B active).

Context: 1,000,000
Params: 284B (13B active)
License: MIT
Source: open

DeepSeek-V4-Pro

DeepSeek

released 2026-04-22

DeepSeek's flagship open-weight MoE. 1.6T parameters with 49B activated, 1M-token context, and a hybrid attention scheme (CSA + HCA) that delivers long-context inference at ~27% of V3.2's FLOPs.

Context: 1,000,000
Params: 1.6T (49B active)
License: MIT
Source: open
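
The "Params" fields for the two V4 models give both total and active parameter counts, so the share of weights used per token can be computed directly. The sketch below uses only the figures listed in the cards above; the helper name is illustrative.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of an MoE model's parameters activated per token."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("need 0 <= active <= total and total > 0")
    return active_params_b / total_params_b


# (total, active) in billions of parameters, as listed in the cards above.
v4_models = {
    "DeepSeek-V4-Flash": (284, 13),
    "DeepSeek-V4-Pro": (1600, 49),  # 1.6T total
}

for name, (total_b, active_b) in v4_models.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
# DeepSeek-V4-Flash: 4.6% of parameters active per token
# DeepSeek-V4-Pro: 3.1% of parameters active per token
```

By this measure V4-Pro is the sparser of the two: it activates a smaller share of its weights per token even though its active count is higher in absolute terms.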

DeepSeek-V3.1

DeepSeek

released 2025-08-21

Large MoE open-weight model. Predecessor to DeepSeek-V4.

Context: 128,000
Params: 671B
License: MIT
Source: open

DeepSeek-R1

DeepSeek

released 2025-01-20

Reasoning-focused open-weight model.

Context: 128,000
Params: 671B
License: MIT
Source: open
