Compare models

Pick up to 4 models. Specs render side by side. Share the URL; it's stateless.
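"Stateless" presumably means the selection lives entirely in the URL, so anyone who opens the link sees the same comparison. A minimal sketch of that idea follows. The `models` query parameter, the base URL, and the model slugs are all hypothetical and not taken from the actual site.

```python
from urllib.parse import urlencode, urlparse, parse_qs

MAX_MODELS = 4  # the page allows up to 4 models per comparison


def build_compare_url(base, model_ids):
    """Encode the selection in the query string (hypothetical 'models' parameter)."""
    if not 1 <= len(model_ids) <= MAX_MODELS:
        raise ValueError(f"select between 1 and {MAX_MODELS} models")
    return f"{base}?{urlencode({'models': ','.join(model_ids)})}"


def parse_compare_url(url):
    """Recover the selection from the URL alone; no server-side session is needed."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("models", [""])[0]
    return [m for m in raw.split(",") if m][:MAX_MODELS]


# Hypothetical slugs and a placeholder base URL, for illustration only.
selection = ["mistral-large-2", "deepseek-v4-flash", "deepseek-v4-pro", "qwen3-235b"]
url = build_compare_url("https://example.com/compare", selection)
print(url)
print(parse_compare_url(url))
```

Because the round trip depends only on the URL string, a shared link reproduces the same comparison without any stored session.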

Spec	Mistral Large 2	DeepSeek-V4-Flash	DeepSeek-V4-Pro	Qwen3 235B
Vendor	Mistral	DeepSeek	DeepSeek	Alibaba
Family	Mistral	DeepSeek	DeepSeek	Qwen
Release date	2024-07-24	2026-04-22	2026-04-22	2025-04-29
Context window	128,000 tokens	1,000,000 tokens	1,000,000 tokens	128,000 tokens
Parameters	123B	284B (13B active)	1.6T (49B active)	235B
Modality	text	text	text	text
License	Mistral Research License	MIT	MIT	Apache-2.0
Source	open weights	open weights	open weights	open weights
Description	Large text-only Mistral model with a 128K context window and 123B parameters, tuned for strong instruction following and long-context reasoning. Mistral's flagship open-weights release in the Large 2 line.	Smaller, faster sibling to DeepSeek-V4-Pro. Same 1M context window with a much lighter 284B / 13B-active MoE.	DeepSeek's flagship open-weight MoE. 1.6T parameters with 49B activated, 1M-token context, and a hybrid attention scheme (CSA + HCA) that delivers long-context inference at ~27% of V3.2's FLOPs.	Predecessor to the Qwen3.6 family.
Benchmarks
MMLU-Pro	—	—	84.2%	—
GPQA Diamond	—	—	82.4%	—
HumanEval	89.0%	—	95.1%	—
Aider	—	—	80.1%	—
AIME 2025	—	—	88.6%	—
LiveCodeBench	—	—	72.4%	—
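
The Parameters row in the table above lists total and active counts for the two DeepSeek mixture-of-experts models. As a quick worked calculation using only those figures, the share of parameters activated per token can be computed as follows (the helper name is illustrative):

```python
# Total and active parameter counts from the Parameters row, in billions.
moe_models = {
    "DeepSeek-V4-Flash": (284, 13),
    "DeepSeek-V4-Pro": (1600, 49),  # 1.6T total
}


def active_fraction(total_b, active_b):
    """Fraction of the model's parameters that are active for a given token."""
    return active_b / total_b


for name, (total, active) in moe_models.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
# DeepSeek-V4-Flash: 4.6% of parameters active
# DeepSeek-V4-Pro: 3.1% of parameters active
```

Both models activate under 5% of their weights per token, which is why the total parameter count alone is a poor guide to inference cost.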