Hermes 4 pricing
Hermes 4 is a mid-tier LLM from Nous Research. Here's the full token-price breakdown and what it actually costs per month at real coding workloads.
Token pricing
| Item | Value |
|---|---|
| Input tokens | $0.90 / 1M tokens |
| Output tokens | $2.70 / 1M tokens |
| Cached input | $0.10 / 1M tokens |
| Price tier | Mid |
What it costs per month
Estimated API cost at three typical AI-coding workloads, with caching off (bills are usually lower once prompt caching applies):
| Workload | Volume | Est. cost |
|---|---|---|
| Light (hobby) | 2M in / 0.5M out | $3.2/mo |
| Daily driver | 15M in / 4M out | $24.3/mo |
| Heavy / agentic | 80M in / 20M out | $126/mo |
Estimates assume the listed input and output prices with no caching. Prompt caching ($0.10 / 1M cached input tokens) can cut input cost substantially when the same context is sent repeatedly.
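To make the arithmetic behind these estimates easy to check or adapt, here is a minimal Python sketch. The prices are copied from the token pricing table. The `monthly_cost` function and its `cache_hit_rate` parameter are hypothetical names introduced for illustration. The sketch assumes cached input tokens are simply billed at the cached rate in place of the regular input rate, and the 50% hit rate is an arbitrary example, not a measured figure.

```python
# Prices in USD per 1M tokens, copied from the token pricing table.
INPUT_PRICE = 0.90
OUTPUT_PRICE = 2.70
CACHED_INPUT_PRICE = 0.10


def monthly_cost(input_m: float, output_m: float, cache_hit_rate: float = 0.0) -> float:
    """Estimate monthly cost in USD for token volumes given in millions.

    cache_hit_rate is a hypothetical parameter: the fraction of input tokens
    assumed to be billed at the cached rate. 0.0 reproduces the caching-off
    estimates in the table.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached_m = input_m * cache_hit_rate
    fresh_m = input_m - cached_m
    return (
        fresh_m * INPUT_PRICE
        + cached_m * CACHED_INPUT_PRICE
        + output_m * OUTPUT_PRICE
    )


# Workload volumes (millions of input tokens, millions of output tokens).
workloads = {
    "Light (hobby)": (2, 0.5),
    "Daily driver": (15, 4),
    "Heavy / agentic": (80, 20),
}

for name, (inp, out) in workloads.items():
    uncached = monthly_cost(inp, out)
    half_cached = monthly_cost(inp, out, cache_hit_rate=0.5)
    print(f"{name}: ${uncached:.2f}/mo uncached, ${half_cached:.2f}/mo at 50% cache hits")
```

With caching off, the light workload comes to $3.15, which the table rounds to $3.2. Under the assumed 50% hit rate, the heavy workload drops from $126 to $94, since half of the 80M input tokens are billed at $0.10 instead of $0.90 per 1M.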
Cheaper alternatives
- Llama 3 (Ollama/Groq) — $0/1M input (28% SWE-bench)
- Qwen 3 (235B) — $0/1M input (53% SWE-bench)
- Qwen 3.6 — $0/1M input (57% SWE-bench)