Compare models

Pick up to 4 models. Specs render side-by-side. Share the URL; the page is stateless, so the link carries the full selection.
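One way a stateless, shareable comparison like this can work is to keep the whole selection in the query string. The sketch below is only an illustration: the base URL, the `models` parameter name, and the model slugs are all hypothetical, since the page does not document its actual URL scheme. Only the limit of 4 comes from the page.

```python
from urllib.parse import parse_qs, urlencode, urlparse

MAX_MODELS = 4  # limit stated on the page


def build_compare_url(base_url: str, model_ids: list[str]) -> str:
    """Encode the selection into the URL (hypothetical `models` parameter)."""
    if len(model_ids) > MAX_MODELS:
        raise ValueError(f"pick up to {MAX_MODELS} models, got {len(model_ids)}")
    query = urlencode({"models": ",".join(model_ids)})
    return f"{base_url}?{query}"


def parse_compare_url(url: str) -> list[str]:
    """Recover the selection from the URL alone; no server-side session needed."""
    params = parse_qs(urlparse(url).query)
    raw = params.get("models", [""])[0]
    ids = [m for m in raw.split(",") if m]
    return ids[:MAX_MODELS]


if __name__ == "__main__":
    # Slugs are made up for illustration.
    selection = ["deepseek-v4-pro", "deepseek-r1", "deepseek-v3.1", "mistral-large-2"]
    url = build_compare_url("https://example.com/compare", selection)
    print(url)
    print(parse_compare_url(url))
```

Because the URL is the only place the selection lives, anyone opening the link sees the same comparison without an account or saved state.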

Selected: DeepSeek-V4-Pro, DeepSeek-R1, DeepSeek-V3.1, Mistral Large 2

Spec	DeepSeek-V4-Pro	DeepSeek-R1	DeepSeek-V3.1	Mistral Large 2
Vendor	DeepSeek	DeepSeek	DeepSeek	Mistral
Family	DeepSeek	DeepSeek	DeepSeek	Mistral
Release date	2026-04-22	2025-01-20	2025-08-21	2024-07-24
Context window	1,000,000 tokens	128,000 tokens	128,000 tokens	128,000 tokens
Parameters	1.6T (49B active)	671B	671B	123B
Modality	text	text	text	text
License	MIT	MIT	MIT	Mistral Research License
Source	open weights	open weights	open weights	open weights
Description	DeepSeek's flagship open-weight MoE. 1.6T parameters with 49B activated, 1M-token context, and a hybrid attention scheme (CSA + HCA) that delivers long-context inference at ~27% of V3.2's FLOPs.	Reasoning-focused open-weight model.	Large MoE open-weight model. Predecessor to DeepSeek-V4.	Large text-only Mistral model with a 128K context window and 123B parameters, tuned for strong instruction following and long-context reasoning. Mistral's flagship open-weights release in the Large 2 line.
Benchmarks (GPQA-D: GPQA Diamond; AIME-25: AIME 2025; LiveCB: LiveCodeBench)
MMLU-Pro	84.2%	—	—	—
GPQA-D	82.4%	—	—	—
HumanEval	95.1%	—	—	89.0%
Aider	80.1%	—	—	—
AIME-25	88.6%	79.8%	—	—
LiveCB	72.4%	—	—	—
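
Most benchmark cells above are empty, so only a few rows support a direct comparison. The sketch below shows one way to read the rows while treating the dash as "no score listed" rather than zero, and to compute point differences only where both models report a score. The function names are illustrative, and the rows are copied from the table above.

```python
from typing import Optional

MODELS = ["DeepSeek-V4-Pro", "DeepSeek-R1", "DeepSeek-V3.1", "Mistral Large 2"]

# Benchmark rows as shown in the table; "—" marks a missing score.
ROWS = """\
MMLU-Pro\t84.2%\t—\t—\t—
GPQA-D\t82.4%\t—\t—\t—
HumanEval\t95.1%\t—\t—\t89.0%
Aider\t80.1%\t—\t—\t—
AIME-25\t88.6%\t79.8%\t—\t—
LiveCB\t72.4%\t—\t—\t—
"""


def parse_score(cell: str) -> Optional[float]:
    """Convert '84.2%' to 84.2; return None for a missing-score marker."""
    cell = cell.strip()
    if cell in {"—", "-", ""}:
        return None
    return float(cell.rstrip("%"))


def parse_table(text: str) -> dict[str, dict[str, Optional[float]]]:
    """Map benchmark name -> {model name -> score or None}."""
    table = {}
    for line in text.strip().splitlines():
        name, *cells = line.split("\t")
        table[name] = dict(zip(MODELS, (parse_score(c) for c in cells)))
    return table


def shared_deltas(table, model_a: str, model_b: str) -> dict[str, float]:
    """Score differences (a minus b) on benchmarks both models report."""
    deltas = {}
    for bench, scores in table.items():
        a, b = scores[model_a], scores[model_b]
        if a is not None and b is not None:
            deltas[bench] = round(a - b, 1)
    return deltas


if __name__ == "__main__":
    table = parse_table(ROWS)
    for other in MODELS[1:]:
        print(other, shared_deltas(table, "DeepSeek-V4-Pro", other))
```

With the data above, DeepSeek-V4-Pro overlaps with DeepSeek-R1 only on AIME-25, with Mistral Large 2 only on HumanEval, and with DeepSeek-V3.1 on nothing, so any overall ranking drawn from this table would rest on very few shared data points.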