Llama 4 Behemoth pricing
Llama 4 Behemoth is a premium-tier LLM from Meta. Here's the full token-price breakdown and an estimate of what it costs per month at three typical coding workloads.
Token pricing
| Item | Value |
|---|---|
| Input tokens | $3 / 1M tokens |
| Output tokens | $15 / 1M tokens |
| Cached input | $0.30 / 1M tokens |
| Price tier | Premium |
What it costs per month
Estimated API cost at three typical AI-coding workloads, with prompt caching off (bills with caching enabled are usually lower):
| Workload | Volume | Est. cost |
|---|---|---|
| Light (hobby) | 2M in / 0.5M out | $13.50/mo |
| Daily driver | 15M in / 4M out | $105/mo |
| Heavy / agentic | 80M in / 20M out | $540/mo |
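Estimates assume the listed input and output prices ($3 and $15 per 1M tokens) with no cached input. Prompt caching bills cached input at $0.30/1M, a tenth of the standard input rate, so it can cut input cost substantially on repeated context.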
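The estimates above can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not an official calculator: the prices are copied from the token pricing table, while the function name `monthly_cost` and the `cached_fraction` parameter (the share of input tokens billed at the cached rate) are assumptions made for this example. It also assumes cached tokens carry no extra write or storage fee, which may not match actual billing.

```python
# Prices in USD per 1M tokens, copied from the token pricing table.
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00
CACHED_INPUT_PRICE = 0.30


def monthly_cost(input_m, output_m, cached_fraction=0.0):
    """Estimate monthly USD cost for token volumes given in millions.

    cached_fraction is a hypothetical knob: the share of input tokens
    billed at the cached rate. 0.0 reproduces the "caching off" table.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_m = input_m * cached_fraction
    fresh_m = input_m - cached_m
    return (
        fresh_m * INPUT_PRICE
        + cached_m * CACHED_INPUT_PRICE
        + output_m * OUTPUT_PRICE
    )


# Workload volumes (millions of input tokens, millions of output tokens)
# taken from the monthly cost table.
WORKLOADS = {
    "Light (hobby)": (2, 0.5),
    "Daily driver": (15, 4),
    "Heavy / agentic": (80, 20),
}

for name, (inp, out) in WORKLOADS.items():
    uncached = monthly_cost(inp, out)
    half_cached = monthly_cost(inp, out, cached_fraction=0.5)
    print(f"{name}: ${uncached:.2f}/mo uncached, "
          f"${half_cached:.2f}/mo with 50% of input cached")
```

With `cached_fraction=0.0` the function returns the table's figures ($13.50, $105, $540). The 50% setting is only an illustrative choice; the real cache hit rate depends on how much context is repeated between requests.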
Cheaper alternatives
- Llama 3 (Ollama/Groq) — $0/1M input (28% SWE-bench)
- Qwen 3 (235B) — $0/1M input (53% SWE-bench)
- Qwen 3.6 — $0/1M input (57% SWE-bench)
Pair Llama 4 Behemoth with the right tools — Flowpicker flags model/IDE compatibility before you spend.