Vespa
Context / RAG · Production-grade hybrid search (vector + BM25 + ranking) at billions-of-docs scale
At a glance
| Setup effort | High |
| Released | 2017 |
| Open source | Yes |
| Hosting | Both |
| Privacy | Configurable |
| Update mode | Real-time |
| Staleness | Manual |
| Index type | Embeddings + BM25 |
| Index limit | Very large |
| Capabilities | Embedding search, BM25, Hybrid ranking, Tensor expressions, Real-time indexing, Self-host or Vespa Cloud |
What Vespa does
Vespa provides embedding search, BM25 text ranking, hybrid ranking that combines the two, tensor expressions for custom ranking, and real-time indexing. It can be self-hosted or run on Vespa Cloud.
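As a rough illustration of hybrid retrieval, the sketch below builds a JSON body for Vespa's search endpoint that combines a text match (`userQuery()`) with an approximate nearest-neighbor match (`nearestNeighbor`). The field name `embedding`, the rank profile name `hybrid`, and the query tensor name `q` are assumptions for this example; they would have to match whatever your own application schema defines. The helper `build_hybrid_query` is hypothetical and not part of any Vespa client library.

```python
import json


def build_hybrid_query(text, query_vector, hits=10, target_hits=100,
                       embedding_field="embedding", rank_profile="hybrid"):
    """Build a request body that retrieves by text match OR vector similarity.

    embedding_field and rank_profile are assumed names; they must exist
    in the schema of the Vespa application being queried.
    """
    # YQL: match documents via the user's text query, or via ANN search
    # over the embedding field against the query tensor named "q".
    yql = (
        "select * from sources * where userQuery() or "
        f"({{targetHits:{target_hits}}}nearestNeighbor({embedding_field}, q))"
    )
    return {
        "yql": yql,
        "query": text,                            # consumed by userQuery()
        "ranking.profile": rank_profile,          # profile that blends BM25 and closeness
        "input.query(q)": list(query_vector),     # query embedding
        "hits": hits,
    }


# Example: print the body that would be POSTed to the search endpoint.
body = build_hybrid_query("billing api timeout", [0.12, -0.03, 0.88])
print(json.dumps(body, indent=2))
```

The body would typically be sent as an HTTP POST to the application's `/search/` path. How the BM25 and vector scores are combined is decided by the rank profile in the schema, not by the query itself.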
Best for
Production-grade hybrid search (vector + BM25 + ranking) at billions-of-docs scale
Works well with
LLM Provider / Model
Integration
Agent / Orchestration
Conflicts & caveats
- ⚠️ SWE-agent with on-demand context "Vespa" may act on stale code; prefer real-time context (Cursor @codebase, Greptile, GitHub Copilot indexing, Augment Context, CocoIndex, turbopuffer).