Budget before launch
Estimate support chatbot, RAG, coding agent, summarization, and classification costs before you commit to a provider or model tier.
Estimate the cost of GPT, Claude, Gemini, Grok, DeepSeek, Llama, and Mistral API workloads. Compare input, output, cached prompt, batch, monthly, yearly, and blended token costs in one place.
| Model | Input | Output | Request | Day | Month | Year | Blended / 1M | Cache saved | Context used |
|---|---|---|---|---|---|---|---|---|---|
Input and output tokens are usually priced differently. Blended cost shows the effective price for your actual input-to-output ratio, not the model's headline per-token price.
Repeated system prompts, policies, examples, and reference material can qualify for cached input rates on some providers. Use the cache slider to test the savings.
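The cache savings behind the slider reduce to simple arithmetic. This is a minimal sketch, assuming illustrative prices in USD per 1M tokens; the function name and parameters are hypothetical, not any provider's API:

```python
def input_cost_with_cache(total_input_tokens: int, cached_fraction: float,
                          fresh_price_per_m: float, cached_price_per_m: float) -> float:
    """Cost in USD of input tokens when a fraction hits the cached rate.

    Prices are USD per 1M tokens. Assumes the provider bills cached tokens
    at a flat discounted rate, which varies by provider in practice.
    """
    cached_tokens = total_input_tokens * cached_fraction
    fresh_tokens = total_input_tokens - cached_tokens
    return (fresh_tokens * fresh_price_per_m
            + cached_tokens * cached_price_per_m) / 1_000_000

# Example: 2M input tokens/month, 60% cacheable, $3.00 fresh vs $0.30 cached per 1M
print(input_cost_with_cache(2_000_000, 0.6, 3.00, 0.30))  # 2.76
```

Without caching the same workload would cost $6.00, so a 60% cache hit rate cuts the input bill by more than half at a 10x cache discount.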
The table flags how much of each model's context window your request uses, so you can avoid oversized prompts or choose a larger-context model.
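The context-used figure is the ratio of your request size to the model's window. A minimal sketch with a made-up 200K-token window; remember to count both the prompt and the output tokens you reserve:

```python
def context_used(prompt_tokens: int, max_output_tokens: int,
                 context_window: int) -> float:
    """Fraction of a model's context window one request consumes.

    Includes reserved output, since most APIs require prompt + max output
    to fit inside the window together.
    """
    return (prompt_tokens + max_output_tokens) / context_window

# 120K-token prompt plus 8K reserved output against a 200K window
print(f"{context_used(120_000, 8_000, 200_000):.0%}")  # 64%
```

A value near 100% is a signal to trim the prompt or move to a larger-context model before truncation errors appear.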
These calculations use local pricing metadata so the page works offline. Always verify the exact model, region, caching rules, batch discounts, and long-context thresholds on official provider pages before relying on estimates.
A token is the billing unit most LLM APIs use. It can be a word, word fragment, punctuation mark, or code symbol.
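For rough budgeting before you have real traffic, a common rule of thumb is about four characters per token for English text. This is only a planning heuristic; real tokenizers vary by model, so use the provider's tokenizer for accurate counts:

```python
def rough_token_estimate(text: str) -> int:
    """Very rough planning estimate: ~4 characters per token for English.

    Actual tokenization is model-specific (BPE, SentencePiece, etc.) and can
    differ substantially for code, non-English text, and dense punctuation.
    """
    return max(1, len(text) // 4)

print(rough_token_estimate("Summarize this support ticket in two sentences."))
```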
Output tokens are generated one at a time while the model runs, so serving them typically costs the provider more than reading input, and output rates are priced accordingly.
Cached input pricing is a lower rate for repeated prompt content that the provider can reuse. It is useful for long system prompts, examples, policy text, and repeated reference material.
Blended cost combines input, cached input, and output token prices into one effective price for your entered workload.
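The blended figure is a token-weighted average of the three rates. A minimal sketch with illustrative prices; the function name is hypothetical:

```python
def blended_price_per_m(fresh_in: int, cached_in: int, out: int,
                        fresh_price: float, cached_price: float,
                        out_price: float) -> float:
    """Effective USD per 1M tokens across fresh input, cached input, and output.

    All prices are USD per 1M tokens of their respective type.
    """
    total_cost = (fresh_in * fresh_price
                  + cached_in * cached_price
                  + out * out_price) / 1_000_000
    total_tokens = fresh_in + cached_in + out
    return total_cost / total_tokens * 1_000_000

# 700K fresh input at $3, 200K cached at $0.30, 100K output at $15
print(blended_price_per_m(700_000, 200_000, 100_000, 3.00, 0.30, 15.00))  # 3.66
```

Here the blended rate of $3.66 per 1M tokens sits well above the $3.00 input price because expensive output tokens pull the average up, even at a 7:1 input-to-output ratio.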
The formulas are exact for the local prices, but provider pricing changes. Treat results as planning estimates and verify official pricing before production decisions.