DeepSeek-R1
Reasoning-focused open-weight model.
Specifications
- Context window: 128,000 tokens
- Parameters: 671B
- Modality: text
- License: MIT
- Family: DeepSeek
- Release date: 2025-01-20
Timeline
Released
Initial public availability.
API pricing
No API pricing recorded yet for DeepSeek-R1.
Infrastructure context
Compute, silicon, and capex events that shape DeepSeek-R1's economics.
- Compute cluster
DeepSeek operates ~50k H800 GPU fleet despite export controls
DeepSeek's training fleet, believed to be ~50,000 H800 GPUs at peak, was assembled before the October 2023 extension of US export controls. DeepSeek-V4-Pro's training run reportedly used fewer than 8M GPU-hours in total, less than 10% of GPT-5's estimated budget, relying on the hybrid CSA+HCA attention scheme to reduce FLOPs.
Accelerators: 50k · H800
Location: Hangzhou, China
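To put the quoted figures in proportion, the arithmetic can be sketched as below. This is a rough illustration only: the inputs are the reported estimates from the entry above, the function names are hypothetical, and the wall-clock figure assumes the whole fleet runs one job at perfect utilization, which no real training run achieves.

```python
# Figures quoted in the entry above. All are reported estimates, not confirmed values.
FLEET_GPUS = 50_000             # "~50,000 H800 GPUs at peak"
RUN_GPU_HOURS_MAX = 8_000_000   # upper bound on the reported training run
SHARE_OF_REFERENCE_MAX = 0.10   # "less than 10%" of the comparison budget


def wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Days the job would take if every GPU worked on it in parallel.

    Idealized: ignores failures, restarts, and sub-100% utilization.
    """
    return gpu_hours / gpus / 24


def implied_reference_budget(gpu_hours: float, share: float) -> float:
    """Reference budget at which `gpu_hours` would equal `share` of it."""
    return gpu_hours / share


if __name__ == "__main__":
    days = wall_clock_days(RUN_GPU_HOURS_MAX, FLEET_GPUS)
    budget = implied_reference_budget(RUN_GPU_HOURS_MAX, SHARE_OF_REFERENCE_MAX)
    print(f"Idealized wall-clock time on the full fleet: {days:.1f} days")
    print(f"Reference budget if the run sat at the ceiling: {budget:,.0f} GPU-hours")
```

At the 8M GPU-hour ceiling, the run works out to 160 hours of full-fleet time, a little under a week. The second figure is only what the 10% comparison would imply if the run were taken at that ceiling; since the run is reported as below 8M GPU-hours, it is not a bound on the comparison budget itself.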