turbopuffer
Context / RAG · Cheap, fast object-storage-backed vector + full-text search for AI agents at scale
At a glance
| Setup effort | Low |
| Released | 2024 |
| Open source | No |
| Hosting | Cloud |
| Privacy | Cloud index |
| Update mode | On-demand |
| Staleness | Manual |
| Index type | Embeddings + Hybrid |
| Index limit | Very large |
| Capabilities | Embedding search, Full-text BM25, Hybrid search, Namespaces, Metadata filtering, Serverless, Object-store backed |
What turbopuffer does
turbopuffer is a serverless, object-store-backed search service. It offers embedding search, full-text BM25 search, and hybrid search that combines the two, with namespaces and metadata filtering.
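To make "hybrid search" concrete, here is a minimal sketch of one common way to merge an embedding-search ranking with a BM25 ranking: reciprocal rank fusion (RRF). This is a general technique, not turbopuffer's API and not necessarily how turbopuffer fuses results; the function name, the document IDs, and the constant `k=60` are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Hypothetical helper for illustration only. Each document earns
    1 / (k + rank) from every list it appears in (rank starts at 1).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists standing in for the two search modes.
vector_hits = ["doc_a", "doc_b", "doc_c"]  # embedding search order
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # full-text BM25 order

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why hybrid search can help when keyword matches and semantic matches disagree.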
Best for
Cheap, fast object-storage-backed vector + full-text search for AI agents at scale
Works well with
LLM Provider / Model
Integration
Agent / Orchestration
Conflicts & caveats
- Privacy conflict: pairing self-hosted Llama 3 (Ollama/Groq) with turbopuffer still sends your code to turbopuffer's cloud index. For true privacy, use a local context store instead (Continue indexing, ChromaDB, LanceDB, pgvector, or self-hosted Vespa).