Gemini 2.5 Flash

LLM Provider / Model · Fast cost-efficient inference with huge context, real-time applications

At a glance

Input price	$0.30
Output price	$2.50
Price tier	Budget
Context window	1M+
Max output	65K
Context tier	500K+
Speed tier	Fast
Latency	fast
Knowledge cutoff	Jan 2025
Modality	Multimodal (vision + audio)
Model ID	gemini-2.5-flash
Provider	Google
HumanEval	89%
MMLU	86%
Benchmark	strong for speed/cost
Released	May 2025
Hosting	Closed/API
Capabilities	Vision, Audio, Tool use, Streaming, Structured output, Long context

Vision, Audio, Tool use, Streaming, Structured output, Long context

Fast cost-efficient inference with huge context, real-time applications

Trae's built-in model menu may not include Gemini 2.5 Flash — Trae primarily ships with Claude/GPT/Gemini/DeepSeek.
Kiro's spec mode is tuned for Claude models — Gemini 2.5 Flash may work but spec-driven flows are best with Claude.
OpenAI Codex (VS Code) is tied to OpenAI's GPT-5.1 Codex family — Gemini 2.5 Flash is not selectable inside this integration.
Claude Code only works with Claude models — your LLM is Gemini 2.5 Flash.
Devin uses Cognition's proprietary model internally — your selected Gemini 2.5 Flash will be ignored.
OpenAI Codex (Cloud) runs GPT-5.1 Codex internally — your selected Gemini 2.5 Flash will be ignored.
The Claude Code SDK is tuned for Claude models — Gemini 2.5 Flash may work via the model adapter but loses Claude-specific features (thinking, prompt caching, sub-agents).
Lovable uses a fixed internal model — your selected Gemini 2.5 Flash will be ignored.
Replit Agent 3 uses its own model selection internally — your selected Gemini 2.5 Flash will be ignored.

Build a full stack around Gemini 2.5 Flash — Flowpicker shows compatibility warnings before you commit.