
Llama 4 Behemoth

LLM Provider / Model · Self-hosted frontier reasoning, complex agentic coding, multimodal analysis

At a glance

Input price: $3
Output price: $15
Cache price: $0.30
Price tier: Premium
Context window: 1M+
Max output: 32K
Context tier: 500K+
Speed tier: Slow/Reasoning
Latency: High
Knowledge cutoff: Aug 2025
Modality: Multimodal (vision)
Model ID: llama-4-behemoth
Provider: Meta
HumanEval: 95%
MMLU: 91%
SWE-Bench: 74%
Benchmark: Frontier open-weight model competitive with closed labs
Released: Nov 2025
Hosting: Open/Self-host
Capabilities: Vision, Tool use, MoE, Function calling, Streaming, 2T params
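
To get a rough feel for what the input, output, and cache prices listed above mean per request, a minimal sketch follows. The listing does not state the billing unit, so the per-million-token unit is an assumption, as is the rule that cached input tokens are billed at the cache price instead of the input price. The function name `estimate_cost` and its parameters are hypothetical, chosen for illustration only.

```python
# Prices copied from the listing above (USD).
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00
CACHE_PRICE = 0.30

# ASSUMPTION: prices are per one million tokens; the listing gives no unit.
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Return an estimated USD cost for one request.

    ASSUMPTION: cached input tokens are billed at CACHE_PRICE
    in place of INPUT_PRICE, not in addition to it.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE
        + cached_tokens * CACHE_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200K-token prompt with 150K tokens served from cache, 10K tokens out.
    print(f"${estimate_cost(200_000, 10_000, cached_tokens=150_000):.3f}")
```

Under these assumptions, output tokens dominate the bill for generation-heavy workloads, while caching mainly helps long, repeated prompts. For self-hosted deployments the listed prices may not apply at all.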

What Llama 4 Behemoth does

Vision, Tool use, MoE, Function calling, Streaming, 2T params

Best for

Self-hosted frontier reasoning, complex agentic coding, multimodal analysis

Works well with

Conflicts & caveats

Build a full stack around Llama 4 Behemoth: Flowpicker shows compatibility warnings before you commit.
