DeepSeek V4 Flash pricing

DeepSeek V4 Flash is a budget-tier LLM from DeepSeek. Here's the full token-price breakdown and what it actually costs per month at real coding workloads.

Token pricing

Input tokens	$0.14 / 1M tokens
Output tokens	$0.28 / 1M tokens
Cached input	$0.003 / 1M tokens
Price tier	Budget

What it costs per month

Estimated monthly API cost at three typical AI-coding workloads, with prompt caching off (real bills are usually lower when caching applies):

Workload	Volume (tokens/mo)	Est. cost
Light (hobby)	2M in / 0.5M out	$0.42/mo
Daily driver	15M in / 4M out	$3.22/mo
Heavy / agentic	80M in / 20M out	$16.80/mo

Estimates assume the listed input and output prices with no cached tokens. Prompt caching ($0.003 / 1M cached input tokens) can cut input cost substantially when the same context is repeated across requests.
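To make the arithmetic behind these estimates explicit, here is a minimal Python sketch that recomputes them from the listed per-token prices. The function name `monthly_cost` and the `cached_fraction` parameter are illustrative assumptions, not part of any DeepSeek SDK. The 50% cache-hit figure is a made-up scenario chosen only to show the effect of the cached-input rate.

```python
# Prices in USD per 1M tokens, copied from the Token pricing table.
INPUT_PER_M = 0.14
OUTPUT_PER_M = 0.28
CACHED_INPUT_PER_M = 0.003


def monthly_cost(input_m, output_m, cached_fraction=0.0):
    """Estimate monthly cost in USD.

    input_m and output_m are token volumes in millions per month.
    cached_fraction is the assumed share of input tokens billed at the
    cached rate (hypothetical parameter; 0.0 means caching off).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_m = input_m * cached_fraction
    fresh_m = input_m - cached_m
    return (
        fresh_m * INPUT_PER_M
        + cached_m * CACHED_INPUT_PER_M
        + output_m * OUTPUT_PER_M
    )


# Workload volumes (millions of input tokens, millions of output tokens).
workloads = {
    "Light (hobby)": (2, 0.5),
    "Daily driver": (15, 4),
    "Heavy / agentic": (80, 20),
}

for name, (inp, out) in workloads.items():
    uncached = monthly_cost(inp, out)
    half_cached = monthly_cost(inp, out, cached_fraction=0.5)
    print(f"{name}: ${uncached:.2f}/mo uncached, "
          f"${half_cached:.2f}/mo at 50% cache hits")
```

With caching off, this gives $0.42, $3.22, and $16.80 per month for the three workloads. Under the assumed 50% cache-hit share, the heavy workload drops to roughly $11.32, because output tokens are unaffected by caching and still account for $5.60 of the bill.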

Cheaper alternatives

Llama 3 (Ollama/Groq) — $0/1M input (28% SWE-bench)
Qwen 3 (235B) — $0/1M input (53% SWE-bench)
Qwen 3.6 — $0/1M input (57% SWE-bench)
