Hybrid RAG: dense + keyword (BM25) + rerank, with contextual chunking
Confidence: high
Your content is mostly text and the main need is semantic search. The strong, simple baseline is hybrid retrieval plus reranking, with contextual chunking. It beats naive vector-only retrieval and is cheaper than a graph-based approach.
Why (from your canvas)
- text-first, semantic search
Alternatives
- RAPTOR (if documents are long and questions need the big picture)
- GraphRAG / HippoRAG (if you add entities + relationships)
- Vision-native (if documents are visually rich)
Always-good baseline
- Hybrid retrieval: dense (semantic) + keyword (BM25), then rerank.
- Contextual or late chunking so passages keep their context.
- Matryoshka embeddings, which can be truncated to fewer dimensions, so you can trade accuracy for speed and storage.
- Keep a source trail (provenance) on every fact for citations.
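
A minimal sketch of the baseline above, under stated assumptions. The card does not say how the dense and keyword rankings are combined; reciprocal rank fusion (RRF) is used here as one common choice. The names `Chunk`, `bm25_rank`, `truncate`, `dense_rank`, and `rrf_fuse` are hypothetical, the BM25 scorer uses naive whitespace tokenization with a Lucene-style IDF, and the vectors are made-up numbers rather than output from a real Matryoshka-trained model. The final rerank step (typically a cross-encoder over the fused top results) is left out.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    source: str  # provenance: kept on every chunk so answers can cite it


def bm25_rank(query, chunks, k1=1.5, b=0.75):
    """Return chunk ids ordered by Okapi BM25 score (naive whitespace tokens)."""
    docs = [c.text.lower().split() for c in chunks]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = {}
    for chunk, doc in zip(chunks, docs):
        tf = Counter(doc)
        s = 0.0
        for term in query.lower().split():
            f = tf.get(term, 0)
            if f == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
        scores[chunk.chunk_id] = s
    # Ties are broken by chunk id so the ordering is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def truncate(vec, dim):
    """Matryoshka-style shortening: keep the first `dim` values, re-normalize."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]


def dense_rank(query_vec, chunk_vecs, dim):
    """Return chunk ids ordered by cosine similarity on truncated vectors."""
    q = truncate(query_vec, dim)
    sims = {
        cid: sum(a * b for a, b in zip(q, truncate(v, dim)))
        for cid, v in chunk_vecs.items()
    }
    return sorted(sims, key=lambda cid: (-sims[cid], cid))


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over every ranked list."""
    fused = {}
    for ranking in rankings:
        for rank, cid in enumerate(ranking, start=1):
            fused[cid] = fused.get(cid, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda cid: (-fused[cid], cid))


# Toy corpus; texts, sources, and vectors are invented for illustration.
chunks = [
    Chunk("c1", "BM25 is a keyword ranking function", "ir-notes.md"),
    Chunk("c2", "dense embeddings capture semantic similarity", "embeddings.md"),
    Chunk("c3", "rerankers reorder the top candidates", "rerank.md"),
]
chunk_vecs = {
    "c1": [0.9, 0.1, 0.0, 0.3],
    "c2": [0.8, 0.2, 0.9, 0.0],
    "c3": [0.0, 1.0, 0.1, 0.0],
}
query_vec = [1.0, 0.0, 0.2, 0.1]

by_id = {c.chunk_id: c for c in chunks}
fused = rrf_fuse([
    bm25_rank("keyword ranking", chunks),
    dense_rank(query_vec, chunk_vecs, dim=2),  # use only the first 2 dimensions
])
for cid in fused:
    print(cid, by_id[cid].source)  # each hit keeps its source for citation
```

RRF works on ranks rather than raw scores, so the BM25 and cosine scales never need to be normalized against each other. Lowering `dim` shrinks storage and speeds up similarity at some cost in accuracy, which is the trade-off the Matryoshka bullet describes.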