# OpenAI Pricing Explained (2026 Update)
OpenAI's pricing can be confusing: multiple models, separate rates for input and output tokens, and special discounts. This breakdown explains exactly how much you'll pay for GPT-4o, GPT-4 Turbo, GPT-3.5, embeddings, fine-tuning, and more in 2026.
## 💡 Quick Reference
- GPT-4o: $2.50 input / $10 output per 1M tokens
- GPT-4 Turbo: $10 input / $30 output per 1M tokens
- GPT-3.5 Turbo: $0.50 input / $1.50 output per 1M tokens
## Language Model Pricing
| Model | Input (per 1M) | Output (per 1M) | Context |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | 128K |
| GPT-4o (Batch API) | $1.25 | $5.00 | 128K |
| GPT-4 Turbo | $10.00 | $30.00 | 128K |
| GPT-3.5 Turbo | $0.50 | $1.50 | 16K |
| GPT-3.5 Turbo 16K (legacy) | $3.00 | $4.00 | 16K |
## Understanding Tokens
Pricing is per token, not per word. On average:
- 1 token ≈ 0.75 words (English)
- 100 tokens ≈ 75 words
- 1,000 tokens ≈ 750 words
Example: A 500-word article = ~670 tokens. With GPT-4o at $2.50/1M input tokens, that's $0.001675 to process.
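That estimate can be reproduced with a couple of lines of Python. This is a rough sketch using the ~0.75 words-per-token rule of thumb above (actual token counts depend on the tokenizer); the function names are illustrative:

```python
# Rough token and cost estimate from an English word count,
# using the ~0.75 words-per-token rule of thumb.

def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Approximate token count for English text."""
    return round(word_count / words_per_token)

def input_cost_usd(tokens: int, rate_per_1m: float = 2.50) -> float:
    """Cost of processing input tokens at rate_per_1m USD per 1M tokens."""
    return tokens / 1_000_000 * rate_per_1m

tokens = estimate_tokens(500)          # ~667 tokens for a 500-word article
print(tokens, input_cost_usd(tokens))  # fractions of a cent at GPT-4o rates
```

For exact counts, OpenAI's `tiktoken` library tokenizes text with the same encoding the models use.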
## Cost Examples

### Example 1: Customer Support Chatbot
Usage: 10,000 conversations/day
- Avg input: 200 tokens (customer question + context)
- Avg output: 100 tokens (bot response)
With GPT-4o:
- Input: 10,000 × 200 tokens × 30 days = 60M tokens/month → 60M × $2.50 per 1M = $150/month
- Output: 10,000 × 100 tokens × 30 days = 30M tokens/month → 30M × $10.00 per 1M = $300/month
- Total: $450/month
### Example 2: Content Generation
Usage: 1,000 articles/month
- Avg input: 500 tokens (prompt + outline)
- Avg output: 2,000 tokens (1,500-word article)
With GPT-4o:
- Input: 1,000 × 500 tokens = 500K tokens → 0.5M × $2.50 per 1M = $1.25
- Output: 1,000 × 2,000 tokens = 2M tokens → 2M × $10.00 per 1M = $20.00
- Total: $21.25/month
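Both examples follow the same formula: total tokens in each direction, times the per-1M rate. A minimal sketch (rates default to GPT-4o; the `monthly_cost` helper is illustrative, not an SDK function):

```python
# Monthly API cost for a recurring workload, mirroring the two worked
# examples above. Default rates: GPT-4o at $2.50 in / $10.00 out per 1M.

def monthly_cost(requests_per_day: int, days: int,
                 in_tokens: int, out_tokens: int,
                 in_rate: float = 2.50, out_rate: float = 10.00) -> float:
    """Return monthly USD cost for a fixed per-request token profile."""
    total_in = requests_per_day * days * in_tokens
    total_out = requests_per_day * days * out_tokens
    return (total_in * in_rate + total_out * out_rate) / 1_000_000

# Example 1: support chatbot — 10K conversations/day for 30 days
print(monthly_cost(10_000, 30, in_tokens=200, out_tokens=100))  # 450.0

# Example 2: content generation — 1,000 articles over the month
print(monthly_cost(1_000, 1, in_tokens=500, out_tokens=2_000))  # 21.25
```

Swapping in GPT-3.5 Turbo rates (`in_rate=0.50, out_rate=1.50`) shows the same chatbot workload dropping from $450 to $75/month.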
## Embeddings Pricing
| Model | Price per 1M tokens |
|---|---|
| text-embedding-3-small | $0.02 |
| text-embedding-3-large | $0.13 |
| text-embedding-ada-002 (legacy) | $0.10 |
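Embeddings are billed on input tokens only, so corpus cost is a single multiplication. A quick sketch using the rates from the table (the dictionary and function are illustrative):

```python
# Cost to embed a corpus with each embedding model from the table above.
RATES_PER_1M = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def embedding_cost(tokens: int, model: str) -> float:
    """USD cost to embed `tokens` tokens with the given model."""
    return tokens / 1_000_000 * RATES_PER_1M[model]

# Embedding a 10M-token document corpus:
for model, rate in RATES_PER_1M.items():
    print(f"{model}: ${embedding_cost(10_000_000, model):.2f}")
```

Even a fairly large corpus stays cheap: 10M tokens costs $0.20 with the small model and $1.30 with the large one.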
## Fine-Tuning Costs

Fine-tuning lets you customize a model on your own data. Costs split into a one-time training charge and ongoing usage rates (which are higher than the base model's):
| Model | Training | Input | Output |
|---|---|---|---|
| GPT-3.5 Turbo | $8.00/1M | $3.00/1M | $6.00/1M |
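Putting the two cost components together: a one-time training charge plus monthly usage. A sketch with hypothetical workload numbers, assuming training tokens are billed once per epoch:

```python
# Fine-tuned GPT-3.5 Turbo: one-off training cost plus monthly usage,
# using the rates from the table above. Workload numbers are hypothetical.

TRAIN_RATE, IN_RATE, OUT_RATE = 8.00, 3.00, 6.00  # USD per 1M tokens

def finetune_cost(train_tokens: int, epochs: int,
                  monthly_in: int, monthly_out: int) -> tuple[float, float]:
    """Return (one-time training cost, monthly usage cost) in USD."""
    training = train_tokens * epochs * TRAIN_RATE / 1_000_000
    usage = (monthly_in * IN_RATE + monthly_out * OUT_RATE) / 1_000_000
    return training, usage

# e.g. 2M training tokens over 3 epochs, then 10M in / 5M out per month
training, usage = finetune_cost(2_000_000, 3, 10_000_000, 5_000_000)
print(training, usage)  # 48.0 60.0
```

Note the trade-off: fine-tuned usage costs roughly 4-6x base GPT-3.5 Turbo rates, so fine-tuning pays off mainly when it lets you use shorter prompts or a cheaper model tier.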
## How to Save on OpenAI Costs
- Use Batch API: 50% discount for non-real-time tasks
- Optimize prompts: Shorter prompts = fewer tokens
- Use cheaper models: GPT-3.5 for simple tasks (95% cheaper than GPT-4 Turbo)
- Set max_tokens: Cap output length to bound per-request cost
- Cache responses: Don't re-process identical requests
- Track spending: Use AI Cost Monitor to identify cost drivers
## Track Your OpenAI Spending
Real-time cost monitoring, budget alerts, and optimization insights.
Start Free →