LLM Model Comparison

Compare pricing and context windows across major LLM providers. The table is sorted by input price (cheapest first); all prices are per 1 million tokens.

| Model                 | Provider  | Input $/1M | Output $/1M | Context | Category |
|-----------------------|-----------|------------|-------------|---------|----------|
| GPT-4.1-nano          | OpenAI    | $0.10      | $0.40       | 1M      | budget   |
| GPT-4o-mini           | OpenAI    | $0.15      | $0.60       | 128k    | fast     |
| Gemini 2.5 Flash      | Google    | $0.15      | $0.60       | 1M      | fast     |
| DeepSeek V3           | DeepSeek  | $0.27      | $1.10       | 64k     | budget   |
| GPT-4.1-mini          | OpenAI    | $0.40      | $1.60      | 1M      | fast     |
| Llama 3.3 70B (Groq)  | Groq      | $0.59      | $0.79       | 128k    | fast     |
| Claude Haiku 3.5      | Anthropic | $0.80      | $4.00       | 200k    | fast     |
| Gemini 2.5 Pro        | Google    | $1.25      | $10.00      | 1M      | flagship |
| GPT-4.1               | OpenAI    | $2.00      | $8.00       | 1M      | flagship |
| GPT-4o                | OpenAI    | $2.50      | $10.00      | 128k    | flagship |
| Claude Sonnet 4       | Anthropic | $3.00      | $15.00      | 200k    | flagship |
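To see what these rates mean per request, the cost of a single call is simply (input tokens × input price + output tokens × output price) / 1,000,000. A minimal sketch using a subset of the table's prices (the model selection and token counts below are illustrative):

```python
# Per-1M-token prices from the comparison table: (input $, output $).
PRICES = {
    "GPT-4.1-nano": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.15, 0.60),
    "Claude Sonnet 4": (3.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the table's per-1M-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2000, 500):.6f}")
```

At these volumes the per-request numbers look tiny, but they compound quickly: the same workload run a million times a month differs by tens of dollars on a budget model versus thousands on a flagship.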

Cost optimization tips

Output tokens cost more than input tokens for every model listed, roughly 1.3x to 8x. To optimize costs, keep prompts concise, move reusable instructions into a shared system prompt, constrain output length, and cache common prompt prefixes where your provider supports it.

Budget models — Great for classification, extraction, and simple tasks. GPT-4.1-nano and Gemini 2.5 Flash offer excellent price/performance.

Flagship models — Use for complex reasoning, code generation, and tasks where quality matters more than cost.

Note: Prices reflect standard API pricing as of March 2026. Volume discounts, committed use contracts, and cached token pricing may reduce costs further.