DeepSeek-R1

DeepSeek · released 2025-01-20 · open source · updated 24 days ago

Reasoning-focused open-weight model.

Specifications

Context window
128,000 tokens
Parameters
671B
Modality
text
License
MIT
Family
DeepSeek
Release date
2025-01-20

Provider status

DeepSeek API: All Systems Operational
Last incident: [Resolved] DeepSeek Web/API Service Not Available, about 2 months ago
30-day uptime: 100.00%

Timeline

  1. Released

    Initial public availability.

API pricing

No API pricing recorded yet for DeepSeek-R1.

Benchmarks

Benchmark   Category   Score   Setting   Measured     Source
AIME 2024   math       79.8%   CoT       2025-01-20   source ↗

Infrastructure context

Compute, silicon, and capex events that shape DeepSeek-R1's economics.

  • Compute cluster

    DeepSeek operates a ~50k H800 GPU fleet despite export controls

    DeepSeek's training fleet, believed to be ~50,000 H800 GPUs at peak, was assembled before the October 2023 extension of US export controls. DeepSeek-V4-Pro's training run reportedly used <8M GPU-hours in total, less than 10% of GPT-5's estimated budget, relying on the hybrid CSA+HCA attention scheme to reduce FLOPs.

    Accelerators: 50k · H800
    Location: Hangzhou, China
    SemiAnalysis estimate
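
To put the two figures in the entry above on a common scale, the sketch below converts a GPU-hour budget into wall-clock time at a given fleet size. It is a rough illustration only: the helper name `wall_clock_days` and the `utilization` parameter are hypothetical, the inputs are the reported "<8M GPU-hours" upper bound and the estimated "~50,000" peak fleet, and it assumes the whole fleet runs the job in parallel, which the entry does not state.

```python
def wall_clock_days(gpu_hours: float, num_gpus: int, utilization: float = 1.0) -> float:
    """Days needed to spend `gpu_hours` on `num_gpus` accelerators running in parallel.

    `utilization` is the fraction of the fleet actually busy on the job
    (hypothetical parameter; 1.0 means every GPU is used the whole time).
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hours = gpu_hours / (num_gpus * utilization)
    return hours / 24


# Figures taken from the entry above; both are estimates, not confirmed values.
GPU_HOURS_UPPER_BOUND = 8_000_000  # "<8M GPU-hours" (upper bound)
FLEET_SIZE = 50_000                # "~50,000 H800 GPUs at peak"

days = wall_clock_days(GPU_HOURS_UPPER_BOUND, FLEET_SIZE)
print(f"{days:.1f} days")  # prints "6.7 days"
```

Under these assumptions, the reported budget corresponds to at most about a week of the full fleet's time. A smaller share of the fleet scales the figure up proportionally, for example `wall_clock_days(8_000_000, 50_000, utilization=0.5)` gives roughly 13.3 days.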
