All models
9 of 20 tracked models · open-source
DeepSeek-V4-Flash
released 2026-04-22
Smaller, faster sibling to DeepSeek-V4-Pro. Same 1M-token context window, with a much lighter MoE: 284B total parameters, 13B active.
- Context: 1,000,000 tokens
- Params: 284B (13B active)
- License: MIT
- Source: open
Qwen3.6-27B
released 2026-04-22
Alibaba's first dense open-weight model in the 3.6 family. Strong agentic-coding scores: 77.2 on SWE-bench Verified, and parity with Claude Opus 4.5 on Terminal-Bench 2.0. Supports 201 languages and multimodal text/image/video input.
- Context: 262,144 tokens
- Params: 27B (dense)
- License: Apache-2.0
- Source: open
DeepSeek-V4-Pro
released 2026-04-22
DeepSeek's flagship open-weight MoE. 1.6T parameters with 49B activated, a 1M-token context, and a hybrid attention scheme (CSA + HCA) that delivers long-context inference at roughly 27% of DeepSeek-V3.2's FLOPs.
- Context: 1,000,000 tokens
- Params: 1.6T (49B active)
- License: MIT
- Source: open
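The "total (active)" parameter figures listed for the two DeepSeek-V4 models can be compared as a ratio. The following is a minimal sketch that uses only the numbers shown in this listing; the helper name `active_fraction` and the `MOE_MODELS` table are hypothetical, introduced here for illustration, and are not part of any model's tooling.

```python
# Parameter counts in billions, copied from the listing: (total, active).
# 1.6T is written as 1600B. Names below are illustrative, not an official API.
MOE_MODELS = {
    "DeepSeek-V4-Flash": (284, 13),
    "DeepSeek-V4-Pro": (1600, 49),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of a model's parameters that are active."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in MOE_MODELS.items():
        print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
    # DeepSeek-V4-Flash: 4.6% of parameters active
    # DeepSeek-V4-Pro: 3.1% of parameters active
```

By these figures, both models activate under 5% of their weights, which is why the listing quotes the active count alongside the total.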
DeepSeek-V3.1
released 2025-08-21
Large MoE open-weight model. Predecessor to DeepSeek-V4.
- Context: 128,000 tokens
- Params: 671B
- License: MIT
- Source: open
Qwen3 235B
released 2025-04-29
Predecessor to the Qwen3.6 family.
- Context: 128,000 tokens
- Params: 235B
- License: Apache-2.0
- Source: open
Llama 4 Scout
released 2025-04-05
- Context: 10,000,000 tokens
- Params: 109B
- License: Llama 4 Community License
- Source: open
Llama 4 Maverick
released 2025-04-05
Mixture-of-experts open-weight model from Meta.
- Context: 1,000,000 tokens
- Params: 400B
- License: Llama 4 Community License
- Source: open
DeepSeek-R1
released 2025-01-20
Reasoning-focused open-weight model.
- Context: 128,000 tokens
- Params: 671B
- License: MIT
- Source: open
Mistral Large 2
released 2024-07-24
- Context: 128,000 tokens
- Params: 123B
- License: Mistral Research License
- Source: open