
Compare models

Pick up to 4 models. Specs render side-by-side. Share the URL; it's stateless, so the link alone reproduces the comparison.
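A minimal sketch of what that stateless sharing could look like, assuming the selection is carried in a "models" query parameter; the parameter name and the model slugs below are illustrative assumptions, not the site's actual URL scheme:

```ts
// Illustrative sketch only: the "models" query parameter and slugs are
// assumptions, not gpt.buzz's real URL scheme.
const MAX_MODELS = 4;

// Build a shareable URL that carries the whole selection in its query string.
function encodeSelection(baseUrl: string, slugs: string[]): string {
  const url = new URL(baseUrl);
  url.searchParams.set("models", slugs.slice(0, MAX_MODELS).join(","));
  return url.toString();
}

// Recover the selection from a shared link; no server-side state is needed.
function decodeSelection(href: string): string[] {
  const raw = new URL(href).searchParams.get("models") ?? "";
  return raw.split(",").filter(Boolean).slice(0, MAX_MODELS);
}

const link = encodeSelection("https://gpt.buzz/compare", [
  "deepseek-v4-pro",
  "deepseek-v4-flash",
  "mistral-large-2",
  "deepseek-v3.1",
]);
console.log(link);                  // the selection is embedded in the query string
console.log(decodeSelection(link)); // ["deepseek-v4-pro", "deepseek-v4-flash", "mistral-large-2", "deepseek-v3.1"]
```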

| | DeepSeek-V4-Pro | DeepSeek-V4-Flash | Mistral Large 2 | DeepSeek-V3.1 |
| --- | --- | --- | --- | --- |
| Vendor | DeepSeek | DeepSeek | Mistral | DeepSeek |
| Family | DeepSeek | DeepSeek | Mistral | DeepSeek |
| Release date | 2026-04-22 | 2026-04-22 | 2024-07-24 | 2025-08-21 |
| Context window | 1,000,000 tokens | 1,000,000 tokens | 128,000 tokens | 128,000 tokens |
| Parameters | 1.6T (49B active) | 284B (13B active) | 123B | 671B |
| Modality | text | text | text | text |
| License | MIT | MIT | Mistral Research License | MIT |
| Source | open weights | open weights | open weights | open weights |
| Description | DeepSeek's flagship open-weight MoE. 1.6T parameters with 49B activated, 1M-token context, and a hybrid attention scheme (CSA + HCA) that delivers long-context inference at ~27% of V3.2's FLOPs. | Smaller, faster sibling to DeepSeek-V4-Pro. Same 1M context window with a much lighter 284B / 13B-active MoE. | | Large MoE open-weight model. Predecessor to DeepSeek-V4. |
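A note on the Parameters row: for the MoE entries, only the "active" fraction of the total weights is exercised for any one token. A quick back-of-the-envelope check on the figures quoted above (illustrative arithmetic only):

```ts
// Illustrative arithmetic on the MoE counts quoted above; "active" parameters
// are the share of total weights used per token.
const activeShare = (active: number, total: number): number => active / total;

console.log(activeShare(49e9, 1.6e12)); // DeepSeek-V4-Pro:   ~0.031, about 3% of its 1.6T weights per token
console.log(activeShare(13e9, 284e9));  // DeepSeek-V4-Flash: ~0.046, about 4.6% of its 284B weights per token
```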