Compare models

Pick up to 4 models. Specs render side by side. Share the URL: the page is stateless, so the link carries your selection.
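A stateless share link usually means the selection lives in the URL itself rather than in a server-side session. The sketch below illustrates that idea only. The `models` query parameter, the comma-separated encoding, and the `example.com` base URL are assumptions for illustration; the page's real URL scheme is not shown here.

```python
from urllib.parse import parse_qs, urlencode, urlparse

MAX_MODELS = 4  # the page allows up to 4 models


def build_compare_url(base_url, models):
    """Encode the selection in the query string (hypothetical `models` parameter)."""
    if len(models) > MAX_MODELS:
        raise ValueError(f"pick at most {MAX_MODELS} models")
    return f"{base_url}?{urlencode({'models': ','.join(models)})}"


def parse_compare_url(url):
    """Recover the selection from the URL alone, with no stored session."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("models", [""])[0]
    return [name for name in raw.split(",") if name][:MAX_MODELS]


# Placeholder base URL; the selection round-trips through the link.
link = build_compare_url(
    "https://example.com/compare",
    ["Qwen3.7-Max", "Gemini Omni", "Mistral Large 2", "DeepSeek-V4-Flash"],
)
selection = parse_compare_url(link)
```

Because everything needed to rebuild the view is in the link, anyone who opens it should see the same four columns without signing in or sharing cookies.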

Selected: Qwen3.7-Max, Gemini Omni, Mistral Large 2, DeepSeek-V4-Flash

Spec	Qwen3.7-Max	Gemini Omni	Mistral Large 2	DeepSeek-V4-Flash
Vendor	Alibaba	Google	Mistral	DeepSeek
Family	Qwen	Gemini	Mistral	DeepSeek
Release date	2026-05-20	2026-05-20	2024-07-24	2026-04-22
Context window	1,000,000 tokens	1,000,000 tokens	128,000 tokens	1,000,000 tokens
Parameters	—	—	123B	284B (13B active)
Modality	text	text, vision, audio, video	text	text
License	proprietary	proprietary	Mistral Research License	MIT
Source	proprietary	proprietary	open weights	open weights
Description	Alibaba's flagship agent model — 1M-token context, extended-thinking mode, 56.6 on the Artificial Analysis Intelligence Index v4.0 (5th overall, #1 Chinese). 50.8% on Terminal-Bench Hard. Designed for long-horizon agent workloads (hundreds-to-thousands of steps). Closed-weight, $2.50/$7.50 per 1M tokens.	Multimodal Gemini variant introduced at Google I/O 2026 — unified text, image, audio, and video processing in a single model.	Large text-only Mistral model with a 128K context window and 123B parameters, tuned for strong instruction following and long-context reasoning. Mistral's flagship open-weights release in the Large 2 line.	Smaller, faster sibling to DeepSeek-V4-Pro. Same 1M context window with a much lighter 284B / 13B-active MoE.
Links	vendor	vendor	vendor, weights	weights
Benchmarks
MMLU-Pro	83.7%	—	—	—
GPQA-D	83.0%	—	—	—
HumanEval	93.9%	—	89.0%	—
Aider	78.4%	—	—	—
AIME-25	90.4%	—	—	—
LiveCB	71.0%	—	—	—
MMMU	—	82.5%	—	—
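The Qwen3.7-Max description quotes a price of $2.50/$7.50 per 1M tokens. A rough cost estimate can be sketched as below, assuming the first figure is the input price and the second is the output price. That reading follows the common convention but is not confirmed by the table, and the function name is hypothetical.

```python
# Prices quoted in the Qwen3.7-Max description, in USD per 1M tokens.
# Assumption: the first figure is input, the second is output.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 7.50


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated USD cost of one request under the assumed prices."""
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    )


# Example: a long-context call with 200,000 input and 20,000 output tokens.
cost = estimate_cost(200_000, 20_000)
```

For agent workloads that run hundreds of steps, multiplying a per-step estimate like this by the step count gives a quick upper bound before committing to a model.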