LLM Costs per MTok

This list is deprecated. Try out these lists instead:
 
| Provider | Model | Input ($/MTok) | Output ($/MTok) |
|---|---|---|---|
| Google | gemini 1.5 flash-8b | $0.0375 | $0.15 |
| Replicate / Meta | llama-3-8b | $0.05 | $0.25 |
| Google | gemini 1.5 flash | $0.075 | $0.30 |
| DeepSeek | deepseek-chat | $0.14 | $0.28 |
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| Together / Meta | llama-3-8b | $0.20 | $0.20 |
| OpenAI | fine-tuned gpt-4o-mini | $0.30 | $1.20 |
| Together / Mistral | mixtral-8x7b | $0.60 | $0.60 |
| Together / Meta | llama-3-70b | $0.90 | $0.90 |
| Anthropic | claude-3.5 haiku | $1 | $5 |
| Google | gemini 1.5 pro | $1.25 | $5 |
| Fireworks AI / Meta | llama-3.1-405b | $3 | $3 |
| OpenAI | o1-mini | $3 | $12 |
| Anthropic | claude-3.5 sonnet | $3 | $15 |
| OpenAI | gpt-4o | $5 | $15 |
| OpenAI | gpt-4-turbo | $10 | $30 |
| OpenAI | o1-preview | $15 | $60 |
| Anthropic | claude-3 opus | $15 | $75 |
Last updated: Jan 8, 2025

FAQ

Q: What is an MTok?
A: 1 million tokens.
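
Since prices are quoted per million tokens, the cost of a single request is just a linear combination of input and output token counts. A minimal sketch (the helper name `request_cost` is my own; prices in the example are the gpt-4o-mini row from the table above):

```python
MTOK = 1_000_000  # prices in the table are per 1 million tokens

def request_cost(input_tokens: int, output_tokens: int,
                 input_per_mtok: float, output_per_mtok: float) -> float:
    """Dollar cost of one request, given per-MTok prices."""
    return (input_tokens / MTOK) * input_per_mtok \
         + (output_tokens / MTOK) * output_per_mtok

# A 2,000-token prompt with a 500-token reply on gpt-4o-mini ($0.15 / $0.60):
cost = request_cost(2_000, 500, 0.15, 0.60)
print(f"${cost:.6f}")  # prints $0.000600
```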
Q: Are tokens the same across providers? Does this comparison make sense?
A: No, tokenizers differ between providers, but for English text they are close enough for a rough comparison.
  • Comparing Anthropic vs OpenAI: Anthropic has no public tokenizer, but states its tokens average ~3.5 English characters, versus OpenAI's claimed ~4. So the Anthropic prices in the table above may understate per-character cost, and should perhaps be increased by ~14% (4 / 3.5 ≈ 1.14) for a fair comparison.
  • Comparing Mixtral vs OpenAI: the Mixtral vocabulary has 32,000 tokens versus OpenAI's 100,256, so Mixtral likely splits the same text into more tokens. This might mean Mixtral costs are also underestimated, but I'm not entirely sure how big the difference is.