gpt.buzz

Curated model list

The best AI models for coding in 2026

Coding is the most-measured LLM capability today. Frontier models are stratifying along three dimensions: SWE-bench Verified score, terminal-agent capability, and pricing per million tokens. Here are the ones that actually ship working code.

01

Claude 4.7 Opus

Anthropic

Top SWE-bench Verified score. Anthropic's coding-tuned flagship, best when correctness on long-running, multi-file changes matters.

02

GPT-5.5

OpenAI

Strong all-rounder for codegen. Pairs well with the Codex CLI agent.

03

Gemini 3 Pro

Google

Massive context window pays off in monorepo-scale codebases. Best for "read this whole repo and refactor X" prompts.

04

Qwen3.6-27B

Alibaba

Open-weight dense model matching Claude 4.5 Opus on Terminal-Bench 2.0. Self-hostable.

open source

05

DeepSeek-V4-Pro

DeepSeek

1.6T-parameter MoE with a 1M-token context window. Open-weight and MIT-licensed; best for self-hosted agent stacks.

open source

06

Claude 4.6 Sonnet

Anthropic

Anthropic's budget option, with a capability profile similar to Opus at lower API cost.

Want the rest? Browse the full model catalog, or build a side-by-side comparison.