OpenAI Pricing Explained (2026 Update)

February 20, 2026

OpenAI's pricing can be confusing with multiple models, token types, and special rates. This complete breakdown explains exactly how much you'll pay for GPT-4o, GPT-4 Turbo, GPT-3.5, embeddings, fine-tuning, and more in 2026.

💡 Quick Reference

  • GPT-4o: $2.50 input / $10 output per 1M tokens
  • GPT-4 Turbo: $10 input / $30 output per 1M tokens
  • GPT-3.5 Turbo: $0.50 input / $1.50 output per 1M tokens

Language Models Pricing

Model                 Input (per 1M)   Output (per 1M)   Context
GPT-4o                $2.50            $10.00            128K
GPT-4o (Batch API)    $1.25            $5.00             128K
GPT-4 Turbo           $10.00           $30.00            128K
GPT-3.5 Turbo         $0.50            $1.50             16K
GPT-3.5 Turbo (16K)   $3.00            $4.00             16K

Understanding Tokens

Pricing is per token, not per word. On average:

  • 1 token ≈ 0.75 words (English)
  • 100 tokens ≈ 75 words
  • 1,000 tokens ≈ 750 words

Example: A 500-word article = ~670 tokens. With GPT-4o at $2.50/1M input tokens, that's $0.001675 to process.
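This word-to-token arithmetic is easy to script. A minimal sketch, using the ~0.75 words-per-token ratio and the GPT-4o input rate from above (the helper names are illustrative; for exact counts use a real tokenizer such as tiktoken):

```python
def words_to_tokens(words: int) -> int:
    """Approximate token count for English text (1 token ≈ 0.75 words)."""
    return round(words / 0.75)

def input_cost(tokens: int, rate_per_1m: float) -> float:
    """Dollar cost for `tokens` input tokens at `rate_per_1m` $ per 1M tokens."""
    return tokens * rate_per_1m / 1_000_000

tokens = words_to_tokens(500)                 # ~667 tokens for a 500-word article
print(tokens, round(input_cost(tokens, 2.50), 6))
```

The same two functions work for any model: swap in the output rate and the response length to price a completion.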

Cost Examples

Example 1: Customer Support Chatbot

Usage: 10,000 conversations/day

  • Avg input: 200 tokens (customer question + context)
  • Avg output: 100 tokens (bot response)

With GPT-4o:

Input: 10K × 200 tokens × 30 days = 60M tokens/month

Cost: 60M × $2.50/1M = $150/month

Output: 10K × 100 tokens × 30 days = 30M tokens/month

Cost: 30M × $10/1M = $300/month

Total: $450/month
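The chatbot estimate above can be sketched as a small function (rates default to GPT-4o's $2.50/$10 per 1M; the function name is illustrative):

```python
def monthly_cost(convs_per_day: int, in_tokens: int, out_tokens: int,
                 in_rate: float = 2.50, out_rate: float = 10.00,
                 days: int = 30) -> float:
    """Monthly dollars: daily conversations × tokens × days, at per-1M rates."""
    input_total = convs_per_day * in_tokens * days    # input tokens per month
    output_total = convs_per_day * out_tokens * days  # output tokens per month
    return (input_total * in_rate + output_total * out_rate) / 1_000_000

print(monthly_cost(10_000, 200, 100))  # 450.0
```

The same arithmetic reproduces Example 2 below by passing 1,000 "conversations" with `days=1`, since that workload is already stated per month.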

Example 2: Content Generation

Usage: 1,000 articles/month

  • Avg input: 500 tokens (prompt + outline)
  • Avg output: 2,000 tokens (1,500-word article)

With GPT-4o:

Input: 1K × 500 tokens = 500K tokens

Cost: 0.5M × $2.50/1M = $1.25

Output: 1K × 2,000 tokens = 2M tokens

Cost: 2M × $10/1M = $20

Total: $21.25/month

Embeddings Pricing

Model                    Price per 1M tokens
text-embedding-3-small   $0.02
text-embedding-3-large   $0.13
ada v2                   $0.10

Fine-Tuning Costs

Fine-tuning allows you to customize models on your data. Costs split into training and usage:

Model           Training    Input       Output
GPT-3.5 Turbo   $8.00/1M    $3.00/1M    $6.00/1M
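A rough way to budget a fine-tune is a one-time training charge (tokens in your dataset × epochs) plus ongoing inference at the fine-tuned rates. A sketch at the GPT-3.5 Turbo rates above; the dataset size, epoch count, and monthly volumes are illustrative assumptions:

```python
def finetune_cost(train_tokens: int, epochs: int,
                  monthly_in: int, monthly_out: int,
                  train_rate: float = 8.00, in_rate: float = 3.00,
                  out_rate: float = 6.00) -> tuple[float, float]:
    """Return (one-time training dollars, monthly inference dollars)."""
    # Training is billed on every token seen, so epochs multiply the cost.
    training = train_tokens * epochs * train_rate / 1_000_000
    inference = (monthly_in * in_rate + monthly_out * out_rate) / 1_000_000
    return training, inference

# Hypothetical: 2M-token dataset, 3 epochs, 10M input + 5M output tokens/month.
training, inference = finetune_cost(2_000_000, 3, 10_000_000, 5_000_000)
print(training, inference)  # 48.0 60.0
```

Note the crossover: fine-tuned GPT-3.5 inference costs several times the base model's, so a fine-tune only pays off when it lets you shorten prompts or replace a pricier model.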

How to Save on OpenAI Costs

  1. Use Batch API: 50% discount for non-real-time tasks
  2. Optimize prompts: Shorter prompts = fewer tokens
  3. Use cheaper models: GPT-3.5 for simple tasks (95% cheaper than GPT-4 Turbo)
  4. Set max_tokens: Prevent unlimited output generation
  5. Cache responses: Don't re-process identical requests
  6. Track spending: Use AI Cost Monitor to identify cost drivers

Track Your OpenAI Spending

Real-time cost monitoring, budget alerts, and optimization insights.
