Claude Haiku 4.5 (Fast)

LLM Provider / Model · High-throughput coding sidekick, autocomplete, batch jobs, and low-latency Claude Code use

At a glance

Input price	$1
Output price	$5
Cache price	$0.10
Price tier	Budget
Context window	200K
Max output	32K
Context tier	128K-500K
Speed tier	Fast
Latency	low
Knowledge cutoff	Oct 2025
Modality	Multimodal (vision)
Model ID	claude-haiku-4-5-fast
Provider	Anthropic
HumanEval	88%
MMLU	82%
SWE-Bench	55%
Benchmark	Best-in-class latency at Haiku-tier intelligence
Released	Oct 2025
Hosting	Closed/API
Capabilities	Vision, Tool use, Streaming, Structured output, Prompt caching, Low-latency mode

Vision, Tool use, Streaming, Structured output, Prompt caching, Low-latency mode

High-throughput coding sidekick, autocomplete, batch jobs, and low-latency Claude Code use

OpenAI Codex (VS Code) is tied to OpenAI's GPT-5.1 Codex family — Claude Haiku 4.5 (Fast) is not selectable inside this integration.
Gemini CLI is locked to Google Gemini models — your LLM is Claude Haiku 4.5 (Fast).
Devin uses Cognition's proprietary model internally — your selected Claude Haiku 4.5 (Fast) will be ignored.
OpenAI Codex (Cloud) runs GPT-5.1 Codex internally — your selected Claude Haiku 4.5 (Fast) will be ignored.
Jules runs Gemini models internally — your selected Claude Haiku 4.5 (Fast) will be ignored.
Lovable uses a fixed internal model — your selected Claude Haiku 4.5 (Fast) will be ignored.
Replit Agent 3 uses its own model selection internally — your selected Claude Haiku 4.5 (Fast) will be ignored.

Build a full stack around Claude Haiku 4.5 (Fast) — Flowpicker shows compatibility warnings before you commit.