AI and LLM Cost Calculator
Compare GPT-4.1, Claude Sonnet 4.6, Gemini 2.5, DeepSeek, Mistral, and self-hosted Llama. Find the right model for your budget and use case.
Enter your usage and see the monthly cost for every major model.
Quick start: pick a use case
Cheapest: Llama 3.1 8B (Groq)
$1.33/month · $0.044 per 1K requests
API Models
Sorted by monthly cost · no infrastructure to manage · ~ = estimated price
| Provider | Tier | Model | Notes | Monthly | Per 1K reqs |
|---|---|---|---|---|---|
| Groq | Budget | Llama 3.1 8B | Lowest-cost API option. Ultra-fast inference via Groq. | $1.33 | $0.044 |
| Mistral | Budget | Mistral Small 3 | European provider. Excellent value for structured tasks. | $4.35 | $0.145 |
| DeepSeek | Balanced | DeepSeek V3 | GPT-4 class quality at a fraction of the cost. | $4.41 | $0.147 |
| Groq | Balanced | Llama 3.3 70B | Open-source via Groq. Extremely fast, great value. | $13.91 | $0.464 |
| OpenAI | Budget | GPT-4.1 mini | 1M token context. Practical replacement for GPT-4o mini. | $22.20 | $0.740 |
| Mistral | Balanced | Mistral Medium 3 | Strong GDPR compliance story. Good mid-tier option for EU teams. | $27.00 | $0.900 |
| DeepSeek | Reasoning | DeepSeek R1 | o1-level reasoning. Open weights, ~96% cheaper than o1. | $30.41 | $1.01 |
| Google | Budget | Gemini 2.5 Flash | 1M token context. Fast and cost-effective at scale. | $32.25 | $1.07 |
| OpenAI | Reasoning | o4-mini | Fast, affordable reasoning. Best value in the o-series. | $61.05 | $2.04 |
| Anthropic | Budget | Claude Haiku 4.5 | Fastest Claude. Great for high-volume, latency-sensitive tasks. | $67.50 | $2.25 |
| OpenAI | Balanced | GPT-4.1 | 1M token context window. Best for long document tasks. | $111 | $3.70 |
| OpenAI | Reasoning | o3 | Full o3 reasoning, significantly repriced from launch. | $111 | $3.70 |
| OpenAI | Powerful | GPT-5 | OpenAI's frontier model. Cheaper to prompt than GPT-4.1. | $129 | $4.31 |
| Google | Balanced | Gemini 2.5 Pro | Best-in-class for long context, up to 1M tokens. | $129 | $4.31 |
| Anthropic | Balanced | Claude Sonnet 4.6 | Top choice for coding, analysis, and agentic tasks. | $203 | $6.75 |
| Anthropic | Powerful | Claude Opus 4.6 | Anthropic's most capable model. 67% cheaper than Claude 3 Opus. | $338 | $11.25 |
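The two cost columns differ only by request volume. As a rough sketch, the monthly figures above are consistent with about 30,000 requests per month (for example, $0.145 per 1K requests × 30 = $4.35). That volume is inferred from the table rather than stated by the calculator, and the `monthly_cost_from_per_1k` helper is a hypothetical name used for illustration.

```python
def monthly_cost_from_per_1k(cost_per_1k_requests: float, requests_per_month: int) -> float:
    """Scale a cost per 1,000 requests up to a monthly bill in USD."""
    return cost_per_1k_requests * requests_per_month / 1000


# Per-1K-request figures copied from the table above.
per_1k = {
    "Mistral Small 3": 0.145,
    "DeepSeek V3": 0.147,
    "GPT-4.1": 3.70,
    "Claude Sonnet 4.6": 6.75,
}

# Assumption: 30,000 requests per month (inferred from the table, not stated by the source).
REQUESTS_PER_MONTH = 30_000

monthly = {
    model: round(monthly_cost_from_per_1k(cost, REQUESTS_PER_MONTH), 2)
    for model, cost in per_1k.items()
}

# Print cheapest first, mirroring the table's sort order.
for model, cost in sorted(monthly.items(), key=lambda kv: kv[1]):
    print(f"{model}: ${cost:.2f}/month")
```

Changing `REQUESTS_PER_MONTH` rescales every row by the same factor, so the ranking of API models by this measure does not depend on volume.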
Self-hosted (Open Source)
Fixed monthly GPU cost. Per-token cost drops as volume grows.
| Model | Hardware | Fixed cost / month | Effective $/1M tokens | Monthly tokens | GPU utilization | Break-even vs APIs | Notes |
|---|---|---|---|---|---|---|---|
| Llama 3.1 8B | 1x A10G (24 GB VRAM) | $1,100 | $56.41 | 19.5M | 0.9% | 824.0x current volume | Cheapest self-hosted option. Good for classification, simple Q&A. ⚠ GPU would be mostly idle. Self-hosting only makes sense at much higher volume. |
| Llama 3.3 70B | 2x A100 80GB | $5,200 | $267 | 19.5M | 3.7% | 3895.1x current volume | GPT-4 class quality. Becomes cost-effective at high volume. ⚠ GPU would be mostly idle. Self-hosting only makes sense at much higher volume. |
| DeepSeek R1 (self-hosted) | 8x H100 80GB | $32,000 | $1,641 | 19.5M | 24.7% | 23970.0x current volume | Full reasoning model on your own infra. Only viable at massive scale. |
| Llama 3.1 405B | 8x A100 80GB | $21,000 | $1,077 | 19.5M | 14.8% | 15730.3x current volume | Frontier-class open model. Only viable at very large scale. |
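The self-hosted figures can be reproduced with simple arithmetic. The sketch below assumes that the effective rate is the fixed monthly cost divided by monthly tokens (in millions), and that the break-even multiple is the fixed cost divided by the cheapest API bill at current volume. Both formulas are inferred from the listed numbers rather than documented by the calculator, and the function names are hypothetical.

```python
def effective_cost_per_million(fixed_monthly_usd: float, monthly_tokens: float) -> float:
    """Fixed GPU cost spread over the tokens actually served, in USD per 1M tokens."""
    return fixed_monthly_usd / (monthly_tokens / 1_000_000)


def break_even_multiple(fixed_monthly_usd: float, api_monthly_usd: float) -> float:
    """How many times current volume is needed before the fixed cost matches the API bill."""
    return fixed_monthly_usd / api_monthly_usd


# Llama 3.1 8B on 1x A10G, using the figures listed above.
rate = effective_cost_per_million(1100, 19_500_000)
print(f"${rate:.2f} per 1M tokens")

# Using the rounded $1.33 API figure gives a multiple near the listed 824.0x;
# the small gap presumably comes from rounding in the displayed price.
multiple = break_even_multiple(1100, 1.33)
print(f"{multiple:.0f}x current volume")

# Because the GPU cost is fixed, the effective rate falls as volume grows.
for factor in (1, 10, 100):
    scaled = effective_cost_per_million(1100, 19_500_000 * factor)
    print(f"{factor:>3}x volume: ${scaled:.2f} per 1M tokens")
```

The sketch ignores GPU capacity limits: past 100% utilization, more hardware is needed and the fixed cost steps up.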
Prices are published rates as of April 2026. Verify current rates on each provider's pricing page before committing spend. Self-hosted costs use AWS GPU instance pricing (on-demand, 24/7).
How it works
Describe your usage
Enter how many requests your app makes per day, and the average input and output token counts per request.
Pick a preset or enter custom values
Use presets for common use cases (chatbot, RAG, code assistant) or enter your own numbers.
Compare models side by side
See monthly cost, cost per 1K requests, and context window across every major model. Self-hosted GPU costs are included.
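The calculation behind these steps can be sketched as follows. This is an illustration, not the calculator's actual code: the `Usage` and `ModelPrice` types, the 30-day month, the placeholder model names, and the example prices are all assumptions. Check each provider's pricing page for real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    requests_per_day: int
    input_tokens: int   # average input tokens per request
    output_tokens: int  # average output tokens per request


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def cost_per_request(usage: Usage, price: ModelPrice) -> float:
    """USD cost of one average request."""
    return (
        usage.input_tokens * price.input_per_million
        + usage.output_tokens * price.output_per_million
    ) / 1_000_000


def monthly_cost(usage: Usage, price: ModelPrice, days: int = 30) -> float:
    """USD cost per month, assuming a fixed number of days (30 by default)."""
    return cost_per_request(usage, price) * usage.requests_per_day * days


# Hypothetical usage profile and placeholder prices, for illustration only.
usage = Usage(requests_per_day=1000, input_tokens=250, output_tokens=400)
models = [
    ModelPrice("example-large", input_per_million=2.00, output_per_million=8.00),
    ModelPrice("example-small", input_per_million=0.40, output_per_million=1.60),
]

# Rank cheapest first and report both metrics shown in the comparison.
ranked = sorted(models, key=lambda m: monthly_cost(usage, m))
for m in ranked:
    per_1k = cost_per_request(usage, m) * 1000
    print(f"{m.name}: ${monthly_cost(usage, m):.2f}/month, ${per_1k:.3f} per 1K requests")
```

Input and output tokens are priced separately because providers typically charge more for output, so a workload with long completions can cost far more than its total token count suggests.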
Models covered
OpenAI
GPT-4.1 mini, GPT-4.1, GPT-5, o4-mini, o3
Anthropic
Claude Haiku 4.5, Claude Sonnet 4.6, Claude Opus 4.6
Google
Gemini 2.5 Flash, Gemini 2.5 Pro
Mistral
Mistral Small 3, Mistral Medium 3
DeepSeek
DeepSeek V3, DeepSeek R1
Groq (open source)
Llama 3.1 8B, Llama 3.3 70B via Groq inference