LLM API Pricing & Cost Calculator
Compare API costs across OpenAI, Anthropic, Google, and DeepSeek, and model the effect of caching discounts and batch requests.
Model Specs (per 1M tokens)
| Model | Input | Cached Input | Output |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $0.10 | $5.00 |
| Claude Opus 4.7 | $5.00 | $0.50 | $25.00 |
| Claude Sonnet 4.6 | $3.00 | $0.30 | $15.00 |
| Gemini 3.1 Pro (<=200k) | $2.00 | $0.20 | $12.00 |
| Gemini 3.5 Flash | $0.50 | $0.25 | $3.00 |
| GPT-5.4 | $2.50 | $0.25 | $15.00 |
| GPT-5.5 | $5.00 | $0.50 | $30.00 |
About this tool
The LLM API Pricing Calculator helps developers and startups estimate the cost of integrating major AI models such as GPT-5.4, Claude Sonnet 4.6, or DeepSeek V4. You can adjust parameters such as input and output token counts, the share of input that is cached, and batch processing to see how your monthly bill changes.
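As a rough illustration of the arithmetic behind such an estimate, here is a minimal sketch. The function name, the token counts, and the request volume are assumptions chosen for the example; the prices are the Claude Sonnet 4.6 rates from the table above. It is not the calculator's actual implementation.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Dollar cost of one request; prices are quoted per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Claude Sonnet 4.6 rates from the table: $3.00 input, $15.00 output per 1M tokens.
# The 3,000 / 1,000 token split and 100,000 requests per month are assumed values.
per_request = request_cost(3_000, 1_000, input_price=3.00, output_price=15.00)
monthly = per_request * 100_000

print(f"per request: ${per_request:.4f}")
print(f"per month:   ${monthly:,.2f}")
```

With these assumed numbers a single request costs a little over two cents, and output tokens account for most of it even though they are a quarter of the volume.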
Unlike static pricing tables, this calculator models the compounding effect of multi-turn agentic workflows. By toggling 'Agentic Loop', you can see how Anthropic and Google's aggressive caching discounts flip the economics of running autonomous agents.
Prompt and Context Caching
Caching is often the single largest lever on API cost. If you send the same 100K-token system prompt repeatedly, caching means you pay full price for it roughly once and a reduced rate (10% to 50% of the standard input price for the models listed above) on subsequent cache hits. The calculator lets you estimate what percentage of your input tokens will be cached.
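The effect of a cache-hit percentage on input cost can be sketched as a blended rate. The function below is a hypothetical helper, not the calculator's code, and it deliberately ignores any cache-write surcharge or storage fee a provider may add; the prices are the Claude Sonnet 4.6 rates from the table above.

```python
def blended_input_cost(input_tokens, cached_fraction, input_price, cached_price):
    """Dollar cost of the input side of one request; prices per 1M tokens.

    cached_fraction is the share of input tokens served from cache (0.0 to 1.0).
    Assumption: cache writes cost the same as normal input tokens.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    return (fresh * input_price + cached * cached_price) / 1_000_000


# A 100K-token prompt at $3.00 input / $0.30 cached input per 1M tokens.
uncached = blended_input_cost(100_000, 0.0, 3.00, 0.30)
mostly_cached = blended_input_cost(100_000, 0.9, 3.00, 0.30)

print(f"no caching: ${uncached:.3f}")
print(f"90% cached: ${mostly_cached:.3f}")
```

Under these assumptions the 90% cached request costs about a fifth of the uncached one, which is why the cached share is worth estimating carefully.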
Batch API Mode
If your application does not require real-time responses, you can route requests through a Batch API. Most major platforms offer a discount of about 50% in exchange for asynchronous processing, typically with a turnaround window of up to 24 hours.
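A batch toggle can be modeled as a flat multiplier on the bill. The sketch below assumes a 50% discount applied uniformly to input and output; the function name and the $2,400 starting figure are illustrative only, and actual discounts should be checked against each provider's pricing page.

```python
def batch_cost(standard_cost, use_batch, discount=0.50):
    """Apply an assumed flat batch discount to a cost computed at standard rates."""
    return standard_cost * (1 - discount) if use_batch else standard_cost


# Hypothetical monthly bill at standard rates.
standard_monthly = 2_400.00
batched_monthly = batch_cost(standard_monthly, use_batch=True)

print(f"standard: ${standard_monthly:,.2f}")
print(f"batched:  ${batched_monthly:,.2f}")
```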
Standard Chat vs Agentic Workflows
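A standard chat request has a fairly predictable input-to-output ratio, around 3:1. Autonomous agents, by contrast, accumulate context: each turn appends the previous reasoning and tool output to the prompt, so the prompt grows with every turn and the total input billed across the loop grows roughly quadratically with the number of turns. Caching keeps these loops from destroying your API budget, because the repeated prefix is billed at the cached rate.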
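To make the accumulation concrete, the sketch below simulates the input side of an agent loop in which every turn resends the whole history plus a fixed amount of new context, so the total billed input grows faster than the number of turns. All names and sizes are assumptions for illustration: a 10,000-token system prompt, 2,000 new tokens per turn, 20 turns, and the idealized case where the entire previous prompt is a cache hit. Prices are the Claude Sonnet 4.6 rates from the table above, and output tokens are left out.

```python
def loop_input_cost(system_tokens, tokens_per_turn, turns,
                    input_price, cached_price, use_cache):
    """Dollar cost of input tokens across an agent loop; prices per 1M tokens.

    Assumption: with caching on, everything sent on the previous turn is
    a cache hit, and only the newly appended tokens are billed at full price.
    """
    total = 0.0
    previous_prompt = 0
    for turn in range(turns):
        prompt = system_tokens + turn * tokens_per_turn  # history grows each turn
        cached = previous_prompt if use_cache else 0
        fresh = prompt - cached
        total += fresh * input_price + cached * cached_price
        previous_prompt = prompt
    return total / 1_000_000


without_cache = loop_input_cost(10_000, 2_000, 20, 3.00, 0.30, use_cache=False)
with_cache = loop_input_cost(10_000, 2_000, 20, 3.00, 0.30, use_cache=True)

print(f"20 turns, no caching: ${without_cache:.4f}")
print(f"20 turns, caching:    ${with_cache:.4f}")
```

In this idealized setup the loop sends 580,000 input tokens in total, but only 48,000 of them are new, so the cached run costs a small fraction of the uncached one. Real cache hit rates are lower, and some providers charge extra for cache writes.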
Popular model pricing
Pre-computed API pricing calculators for the most heavily debated AI models.
Frequently asked questions
Does OpenAI support prompt caching?
What is Batch API and when should I use it?
Why is prompt caching so important for Agentic workflows?
Are input and output tokens billed differently?
Related tools
LLM VRAM Calculator
Calculate the VRAM needed to run or fine-tune any LLM at any quantization.
Power Cost Estimator
Estimate annual electricity costs for your PC, server, or TV.
Data Transfer Calculator
Estimate transfer times for files over USB, WiFi, Ethernet, and more.
JSON Formatter
Validate, format, and minify JSON data with syntax highlighting.