2026 LLM Comparison Dataset: 50+ Models with Specs & Pricing
Reviewed: June 4, 2026
Last updated: May 2026
This comprehensive dataset covers 50+ large language models available in 2026, including specs, pricing, benchmarks, and deployment options. Use the interactive tool to filter and compare models.
Frontier Proprietary Models
| Model | Provider | Parameters | Context | Input $/1M | Output $/1M | Modalities | MMLU | License |
|---|---|---|---|---|---|---|---|---|
| GPT-4o | OpenAI | ~1.8T (est.) | 128K | $2.50 | $10.00 | Text, Image | 88.7% | Proprietary |
| GPT-4o-mini | OpenAI | ~8B (est.) | 128K | $0.15 | $0.60 | Text, Image | 82.0% | Proprietary |
| GPT-4.5 | OpenAI | ~5T (est.) | 128K | $75.00 | $150.00 | Text, Image | 91.2% | Proprietary |
| o3 | OpenAI | Undisclosed | 200K | $2.00 | $8.00 | Text, Image | 89.1% | Proprietary |
| o4-mini | OpenAI | Undisclosed | 200K | $1.10 | $4.40 | Text, Image | 85.3% | Proprietary |
| Claude 3.5 Sonnet | Anthropic | ~100B (est.) | 200K | $3.00 | $15.00 | Text, Image | 88.3% | Proprietary |
| Claude 3.5 Haiku | Anthropic | ~20B (est.) | 200K | $0.25 | $1.25 | Text, Image | 75.2% | Proprietary |
| Claude 3.7 Sonnet | Anthropic | ~200B (est.) | 200K | $3.00 | $15.00 | Text, Image | 90.8% | Proprietary |
| Claude 3 Opus | Anthropic | ~500B (est.) | 200K | $15.00 | $75.00 | Text, Image | 86.8% | Proprietary |
| Gemini 2.0 Flash | Google | ~100B (est.) | 1M | $0.10 | $0.40 | Text, Image, Audio, Video | 86.5% | Proprietary |
| Gemini 2.0 Pro | Google | ~500B (est.) | 1M | $1.25 | $5.00 | Text, Image, Audio, Video | 89.8% | Proprietary |
| Gemini 1.5 Pro | Google | ~1.5T (est.) | 1M | $1.25 | $5.00 | Text, Image, Audio, Video | 85.9% | Proprietary |
| Grok-2 | xAI | ~300B (est.) | 128K | $2.00 | $10.00 | Text, Image | 87.5% | Proprietary |
| Grok-3 | xAI | ~1T (est.) | 128K | $3.00 | $15.00 | Text, Image | 90.1% | Proprietary |
| Command R+ | Cohere | ~104B | 128K | $2.50 | $10.00 | Text | 75.7% | Proprietary |
| Command A | Cohere | ~111B | 256K | $2.50 | $10.00 | Text | 78.2% | Proprietary |
| Nova Micro | Amazon | ~13B (est.) | 128K | $0.04 | $0.14 | Text | 73.8% | Proprietary |
| Nova Lite | Amazon | ~30B (est.) | 300K | $0.06 | $0.24 | Text, Image | 78.5% | Proprietary |
| Nova Pro | Amazon | ~100B (est.) | 300K | $0.80 | $3.20 | Text, Image | 84.2% | Proprietary |
| Jamba 1.5 Large | AI21 | 398B (MoE) | 256K | $2.00 | $8.00 | Text | 81.2% | Proprietary |
| Mistral Large 2 | Mistral | ~123B | 128K | $2.00 | $6.00 | Text | 84.0% | Proprietary |
| Pixtral Large | Mistral | ~124B | 128K | $2.00 | $6.00 | Text, Image | 83.5% | Proprietary |
| Groq Llama 3.3 | Groq/Meta | 70B | 128K | $0.59 | $0.79 | Text | 86.8% | Proprietary |
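The two price columns translate into a workload cost with simple arithmetic: multiply each token count by its per-1M price and divide by one million. The sketch below illustrates this for a few rows copied from the table above. The function name `estimate_cost` and the sample workload are illustrative assumptions, not part of the dataset or of any provider's API, and real bills may differ because of caching discounts, batch pricing, or reasoning tokens.

```python
# Per-1M-token prices in USD as (input, output), copied from the table above.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o-mini": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload for `model`.

    Hypothetical helper: it applies list prices only and ignores
    discounts, free tiers, and minimum charges.
    """
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 10M input tokens and 2M output tokens per month.
    for name in PRICES:
        cost = estimate_cost(name, 10_000_000, 2_000_000)
        print(f"{name}: ${cost:.2f}")
```

Because output tokens cost several times more than input tokens for most models in the table, the input/output mix of a workload matters as much as the headline price.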
Open-Source Models
| Model | Provider | Parameters | Context | MMLU | Modalities | License | GPU Min |
|---|---|---|---|---|---|---|---|
| Llama 4 Scout | Meta | 109B (16 experts) | 10M | 87.2% | Text, Image | Llama 4 | 2x A100 |
| Llama 4 Maverick | Meta | 400B (128 experts) | 1M | 89.5% | Text, Image | Llama 4 | 8x H100 |
| Llama 3.3 70B | Meta | 70B | 128K | 86.8% | Text | Llama 3.3 | 2x A100 |
| Llama 3.1 405B | Meta | 405B | 128K | 88.6% | Text | Llama 3.1 | 8x H100 |
| Qwen 3 235B | Alibaba | 235B (MoE) | 128K | 89.1% | Text | Apache 2.0 | 4x H100 |
| Qwen 2.5 72B | Alibaba | 72B | 128K | 86.1% | Text | Qwen License | 2x A100 |
| Qwen-VL-Max | Alibaba | ~7B | 32K | 78.5% | Text, Image | Qwen License | 1x A10G |
| DeepSeek V3 | DeepSeek | 671B (MoE) | 128K | 88.5% | Text | MIT | 8x H100 |
| DeepSeek R1 | DeepSeek | 671B (MoE) | 128K | 90.2% | Text | MIT | 8x H100 |
| DeepSeek-V2 | DeepSeek | 236B (MoE) | 128K | 84.0% | Text | MIT | 4x A100 |
| Yi-Large 2 | 01.AI | ~34B | 32K | 82.5% | Text | Apache 2.0 | 2x A100 |
| Mixtral 8x22B | Mistral | 141B (MoE) | 64K | 82.3% | Text | Apache 2.0 | 4x A100 |
| DBRX | Databricks | 132B (MoE) | 32K | 78.5% | Text | Databricks | 4x A100 |
| Command R+ | Cohere | 104B | 128K | 75.7% | Text | CC-BY-NC | 4x A100 |
| Gemma 3 27B | Google | 27B | 128K | 78.2% | Text, Image | Gemma | 1x A100 |
| Gemma 3 12B | Google | 12B | 128K | 72.5% | Text, Image | Gemma | 1x A10G |
| Phi-4 | Microsoft | 14B | 128K | 78.9% | Text | MIT | 1x A10G |
| Phi-3-Vision | Microsoft | 4.2B | 128K | 68.5% | Text, Image | MIT | 1x RTX 4080 |
| InternVL-2 76B | Shanghai AI Lab | 76B | 32K | 82.1% | Text, Image, Video | MIT | 2x A100 |
| LLaVA-1.6 34B | Berkeley | 34B | 4K | 75.3% | Text, Image | Apache 2.0 | 1x A100 |
| Molmo 72B | AI2 | 72B | 4K | 80.5% | Text, Image | Apache 2.0 | 2x A100 |
| Idefics3 8B | Hugging Face | 8B | 8K | 65.2% | Text, Image | Apache 2.0 | 1x RTX 4080 |
| Falcon 180B | TII | 180B | 2K | 70.4% | Text | TII Falcon | 4x A100 |
| Jamba Large | AI21 | 398B | 256K | 81.2% | Text | Apache 2.0 | 4x A100 |
| OLMo 2 7B | AI2 | 7B | 4K | 68.5% | Text | Apache 2.0 | 1x A10G |
| Nemotron 4 340B | NVIDIA | 340B | 128K | 87.8% | Text | NVIDIA Open | 8x H100 |
| Mistral NeMo | Mistral/NVIDIA | 12B | 128K | 72.1% | Text | Apache 2.0 | 1x A10G |
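For readers working from the table rather than the interactive tool, the same kind of filtering can be sketched in a few lines of Python. The example below uses five rows copied from the table above. The `OpenModel` class, `context_tokens`, and `find_models` are hypothetical names chosen for illustration, and treating "K" as 1,000 tokens (rather than 1,024) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenModel:
    name: str
    context: str   # context label as written in the table, e.g. "128K"
    mmlu: float    # MMLU score in percent
    license: str
    gpu_min: str


def context_tokens(label: str) -> int:
    """Convert a label such as '128K' or '10M' to a token count.

    Assumption: K = 1,000 and M = 1,000,000 tokens.
    """
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(label[:-1]) * units[label[-1]])


# A subset of rows copied from the table above.
MODELS = [
    OpenModel("Llama 4 Scout", "10M", 87.2, "Llama 4", "2x A100"),
    OpenModel("Qwen 3 235B", "128K", 89.1, "Apache 2.0", "4x H100"),
    OpenModel("DeepSeek R1", "128K", 90.2, "MIT", "8x H100"),
    OpenModel("Phi-4", "128K", 78.9, "MIT", "1x A10G"),
    OpenModel("OLMo 2 7B", "4K", 68.5, "Apache 2.0", "1x A10G"),
]


def find_models(models, min_context=0, min_mmlu=0.0, licenses=None):
    """Return models that meet every threshold, highest MMLU first."""
    matches = [
        m for m in models
        if context_tokens(m.context) >= min_context
        and m.mmlu >= min_mmlu
        and (licenses is None or m.license in licenses)
    ]
    return sorted(matches, key=lambda m: m.mmlu, reverse=True)


if __name__ == "__main__":
    # Permissively licensed models with long context and strong MMLU.
    for m in find_models(MODELS, min_context=100_000, min_mmlu=85.0,
                         licenses={"MIT", "Apache 2.0"}):
        print(m.name, m.context, f"{m.mmlu}%", m.gpu_min)
```

Filtering on the license column first is often the practical starting point, since several models in the table ship under custom or non-commercial terms.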
Key Trends in 2026
- MoE dominates: Mixture-of-Experts architectures (DeepSeek, Llama 4, Qwen 3) deliver frontier quality at roughly one-third the inference cost, since only a subset of experts is active per token
- Open-source catching up: DeepSeek R1 matches GPT-4o on reasoning benchmarks; Llama 4 Maverick rivals Gemini Pro
- Multimodal goes mainstream: Most new models ship with vision capabilities as standard
- Context windows explode: Llama 4 Scout supports a 10M-token context; the Gemini models support 1M
- Price war: API prices dropped 50-80% year-over-year; GPT-4o-mini costs $0.15 per 1M input tokens
- Reasoning models: OpenAI o-series and DeepSeek R1 popularize chain-of-thought reasoning as a product feature
Data compiled from official sources, Papers with Code, and independent benchmarks. Prices are as of May 2026 and subject to change.
