LLM Provider / Model for AI coding
Every LLM provider / model option Flowpicker tracks, with pricing, setup notes, and compatibility for each.
- Claude Sonnet 4.6 — Day-to-day coding, fast agentic loops, balanced cost/quality
- Claude Opus 4.7 — Complex refactors, agentic coding, hard debugging, deep reasoning
- GPT-4o — Multimodal tasks, fast chat, broad general use
- Gemini 2.x — Huge documents, video/audio understanding, long-context retrieval
- Llama 3 (Ollama/Groq) — Local/offline use, privacy-sensitive work, no-cost experimentation
- DeepSeek — Cheap high-quality coding, bulk classification, self-host for privacy
- Claude Haiku 4.5 — High-volume quick tasks, cost-sensitive agentic loops, inline completions
- DeepSeek V4 Flash — Ultra-cheap high-quality coding, bulk classification, context-heavy tasks
- DeepSeek V4 Pro — Complex reasoning, agentic coding, hard debugging with long context
- Grok 4.3 — Fast general-purpose coding with native web and X search agent capabilities
- Grok 4.20 — Deep reasoning, multi-step agentic coding, massive context tasks
- Grok 4-1 Fast — Ultra-cheap fast reasoning for bulk agentic coding and large context retrieval
- Mistral Large 3 — Top open-weight multipurpose model, multilingual coding, self-hosting with frontier quality
- Codestral — Specialized code completion and generation, FIM-aware coding, fast IDE completions
- Devstral 2 — Open-source coding agent, SWE-bench tasks, complex multi-file refactors
- Ministral 3 14B — Local coding assistant on a laptop, edge deployments, quick completions
- Mistral Medium 3.5 — Frontier-class agentic coding and reasoning at lower cost than Opus/GPT top tiers
- Qwen 3 (235B) — Powerful self-hosted reasoning, multilingual coding, agentic workflows with MCP
- Qwen 3.6 — Agentic coding with sustained multi-turn reasoning, frontend generation, local development
- Qwen 3 Coder — Open-source coding specialist, long-context code generation, FIM completions on local hardware
- Qwen 3 Coder Next — Next-gen open-source coding model optimized for agentic coding and local dev workflows
- Gemma 4 31B — Frontier open-weight model on a workstation, agentic coding, reasoning, local multimodal tasks
- Gemini 3 Flash — Fast frontier intelligence, near real-time coding assistance, multimodal agentic loops
- Command R+ — Enterprise RAG, multilingual coding, document understanding with tool use
- Command R — Cost-effective enterprise RAG, multilingual understanding at budget pricing
- Llama 4 Maverick — Latest open-weight model from Meta, large context, self-hosted coding with vision
- Amazon Nova Pro — AWS-integrated coding, enterprise Bedrock deployments, multimodal tasks
- Amazon Nova Lite — Cheapest AWS-native model for high-volume coding tasks, Bedrock deployments
- Amazon Nova Micro — Ultra-low-cost AWS-native model for simple coding, routing, high-throughput tasks
- Phi-4 — Extremely small self-hosted coding model, edge devices, resource-constrained environments
- o3 — Hardest coding problems, complex multi-step reasoning, advanced debugging
- o4-mini — Cheap fast reasoning, agentic coding loops, high-volume tasks
- GPT-4.1 — Long-context coding, instruction-following, broad general use at mid price
- Gemini 2.5 Pro — Advanced reasoning, multimodal workflows, massive context tasks, agentic coding
- Gemini 2.5 Flash — Fast cost-efficient inference with huge context, real-time applications
- GPT-OSS 120B — Best open-weight model for production coding, configurable reasoning for quality/speed trade-offs, fine-tunable on a single H100, agentic workflows
- GPT-OSS 20B — Local development, consumer hardware, fast reasoning loops, cost-effective agentic coding, laptop-friendly open-weight model
- GLM 5.1 — Top-tier agentic engineering, complex multi-step workflows, sustained long-horizon coding, self-improving agent loops
- Laguna XS.2 — Local agentic coding on Mac/laptop (runs on 36GB), SWE-bench tasks, long-horizon autonomous coding, Zed/JetBrains integration via ACP
- MiniMax M2.7 — Professional software engineering, SRE incident response, multi-agent collaboration, self-improving coding workflows, 100+ round autonomous optimization
- MiMo V2.5 Pro — Highest open-weight coding performance, 1M context agentic tasks, complex multi-step engineering, long-context reasoning
- Ling 2.6 (1T) — Open-source SOTA on execution-heavy tasks, enterprise agent workflows, production coding with optimized token efficiency, AIME-level reasoning
- Granite 4.1 30B — Enterprise coding with tool calling, RAG workflows, multilingual development, FIM code completions, IBM ecosystem, governed deployments
- GPT-5.5 — Frontier reasoning, agentic coding, long-context refactors, multimodal analysis, replaces GPT-5.4 as default flagship
- GPT-5.5 Pro — Hardest reasoning problems, math olympiad, research-grade analysis, mission-critical coding tasks where cost is no object
- GPT-5.4 — Production agentic coding, multi-step tool use, balanced cost/quality on long-context tasks, GPT-5.5 alternative at lower cost
- Kimi K2.6 — Long-horizon coding, UI/UX generation from prompts, multi-agent orchestration, cost-optimized frontier-class workloads (~80% cheaper than GPT-5.5)
- Doubao Seed 2.0 Pro — Cost-effective frontier coding, Codeforces-level competitive programming (3020 rating), AIME math (98.3%), production agentic workflows
- Doubao Seed 2.0 Code — Cheapest frontier-class coding model on the market, high-throughput code completion, CI-driven agent loops, bulk refactors at minimal cost
- Command A — Enterprise RAG, citation-grounded responses, multilingual support, agentic workflows with strict tool-use accuracy, regulated industries
- Hy3 Preview — Agent-led tasks in development environments, CodeBuddy/WorkBuddy workflows, low-latency production coding (54% TTFT reduction), open-weight frontier alternative
- Step 3.5 Flash — High-throughput low-cost reasoning, real-time agent loops, budget-tier production deployments, fastest open-weight reasoning model in its price class
- Nemotron 3 Super — Top-tier open-weight agentic coding, 1M-context refactors, GPU-rich self-hosted deployments, NVIDIA ecosystem (NIM/NeMo), governed enterprise environments
- Nemotron 3 Nano Omni — Local omni-modal workflows (video, audio, document), edge deployment on consumer GPUs, multimodal RAG, on-device assistants
- Qwen 3.5 397B-A17B — Frontier open-weight reasoning, native multimodal tasks (text/image/video co-trained), commercial use under Apache 2.0, Alibaba Cloud Model Studio (1M ctx hosted)
- Qwen 3.6 27B — Single-GPU agentic coding (fits on 1x H100), workstation deployment, beats much larger MoE models on agentic tasks, Apache 2.0 commercial use
- DeepSeek R2 — Hard reasoning at 1/20th the cost of o3, math (AIME 83.2%, MATH 98.1%), competitive programming, research-grade analysis on a budget
- Jamba Mini 2 — Cost-efficient long-context workflows, document Q&A, summarization at 256K, enterprise on-prem deployment, 2.5x throughput vs Transformer-only
- Jamba Large 2 — Enterprise long-document workflows, regulated industries needing on-prem deployment, secure RAG at 256K, financial/legal analysis, beats Llama 3.1 405B on Arena Hard
- Gemini 3 Pro — Long-horizon agentic tasks, generative UI, multimodal reasoning, Antigravity-driven workflows
- Gemini 3 Deep Think — Hardest reasoning, research analysis, math olympiad and competitive programming
- GPT-5.1 — Default daily-driver coding agent with adaptive reasoning and warmer chat tone
- GPT-5.1 Codex — Codex CLI and long-horizon coding agents; engineered for terminal-driven workflows
- GPT-5.1 Codex Max — Multi-hour, cross-file engineering tasks where context compaction matters; enterprise Codex CLI use
- Claude Haiku 4.5 (Fast) — High-throughput coding sidekick, autocomplete, batch jobs, and low-latency Claude Code use
- Grok 5 — Real-time web/X context, long-context analysis, and agentic browsing
- Grok Code Fast 2 — High-volume agentic coding where latency and cost trump max intelligence
- DeepSeek V4 — Low-cost coding LLM with self-host option; strong English + Chinese coding capabilities
- Qwen 3 Max — Long-context coding, multilingual codebases, China-region deployments
- Llama 4 Behemoth — Self-hosted frontier reasoning, complex agentic coding, multimodal analysis
- Llama 4 Scout — On-prem 10M-token context analysis, doc/codebase RAG without external chunking
- Mistral Large 4 — EU-residency coding workloads, multilingual engineering, GDPR-sensitive deployments
- Codestral 2 — In-IDE completions, fill-in-the-middle, code refactoring, BYO-model integrations
- Kimi K3 — Agentic coding at low cost, ultra-long context, China-region deployments
- GLM 5 Air — Cheap self-hosted coding agent, autocomplete, batch inference at scale
- Hermes 4 — Self-hosted assistants with steerable persona and explicit chain-of-thought support
- Magistral 2 — Reasoning workloads where chain-of-thought transparency and EU residency matter
- Command R 2 — RAG and tool-using agents in regulated enterprises with on-prem deployment options
- Jamba 1.7 — Long-context RAG with low memory; cost-conscious enterprise deployments
- Phi-5 — On-device AI, edge inference, mobile/embedded coding sidekicks
- Amazon Nova Premier 2 — AWS-native enterprise agents, Bedrock integrations, long-context analysis
- Yi-3 Lightning — Cheap, fast bilingual coding tasks and high-volume Chinese-English mixed workloads
- Inflection 3 — Enterprise deployments requiring on-prem AI with productivity tuning