# LLM Model Comparison
Compare pricing and context windows across all major LLM providers. Sorted by input price (cheapest first). Prices are per 1 million tokens.
| Model | Provider | Input $/1M | Output $/1M | Context | Category |
|---|---|---|---|---|---|
| GPT-4.1-nano | OpenAI | $0.10 | $0.40 | 1M | budget |
| GPT-4o-mini | OpenAI | $0.15 | $0.60 | 128k | fast |
| Gemini 2.5 Flash | Google | $0.15 | $0.60 | 1M | fast |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 64k | budget |
| GPT-4.1-mini | OpenAI | $0.40 | $1.60 | 1M | fast |
| Llama 3.3 70B (Groq) | Groq | $0.59 | $0.79 | 128k | fast |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | 200k | fast |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M | flagship |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | 1M | flagship |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128k | flagship |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200k | flagship |
## Cost optimization tips
Input tokens are consistently cheaper than output tokens: 4x to 8x cheaper for most models in the table above. To cut costs, keep prompts short, move reusable instructions into the system prompt, and cache common prefixes where the provider supports it.
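Because input and output tokens are priced separately, estimating a request's cost is a small calculation. The sketch below shows the arithmetic using a few prices copied from the table; the model names and the `request_cost` helper are illustrative, not any provider's API.

```python
# Per-million-token prices (input, output) taken from the table above.
# Illustrative subset only; check current provider pricing before relying on it.
PRICES = {
    "gpt-4.1-nano": (0.10, 0.40),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-sonnet-4": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request (prices are per 1M tokens)."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 2,000-token prompt with a 500-token reply on GPT-4o-mini:
print(f"${request_cost('gpt-4o-mini', 2000, 500):.6f}")  # $0.000600
```

Note how the output side dominates even at a 4:1 token ratio here: the 500 output tokens cost as much as the 2,000 input tokens.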
Budget models — Great for classification, extraction, and simple tasks. GPT-4.1-nano and Gemini 2.5 Flash offer excellent price/performance.
Flagship models — Use for complex reasoning, code generation, and tasks where quality matters more than cost.
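The budget/flagship split above lends itself to simple routing: classify the task first, then dispatch to the cheapest adequate tier. This is a minimal sketch of that idea; the task categories and chosen models are assumptions for illustration, not a recommendation for any specific stack.

```python
# Hypothetical tiered router: simple tasks go to a budget model,
# everything else to a flagship. Task names are illustrative.
BUDGET_TASKS = {"classification", "extraction"}

def pick_model(task_type: str) -> str:
    """Route a task to a model tier based on its type."""
    if task_type in BUDGET_TASKS:
        return "gpt-4.1-nano"    # budget tier: cheap, fine for simple tasks
    return "claude-sonnet-4"     # flagship tier: quality over cost

print(pick_model("extraction"))       # gpt-4.1-nano
print(pick_model("code-generation"))  # claude-sonnet-4
```

In practice the routing decision itself can be made by a budget model, so the flagship is only paid for when the task genuinely needs it.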
Note: Prices reflect standard API pricing as of March 2026. Volume discounts, committed use contracts, and cached token pricing may reduce costs further.