Document Processing
- Docling / Unstructured / PyMuPDF / LlamaParse / Azure Document Intelligence
Chunking + Metadata
- LangChain / LlamaIndex / Chonkie / Docling chunkers
- GLiNER for metadata extraction
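As a rough illustration of what the chunkers above do, here is a minimal sketch of fixed-size chunking with overlap, where each chunk carries its own metadata. It does not use any of the listed libraries; the names `Chunk` and `chunk_words`, the word-based window, and the metadata keys are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_words(text, source, chunk_size=200, overlap=40):
    """Split text into overlapping word windows and tag each with metadata."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append(Chunk(
            text=" ".join(window),
            metadata={
                "source": source,            # hypothetical key: originating document
                "chunk_index": len(chunks),  # position of the chunk in the document
                "start_word": start,
                "end_word": start + len(window),
            },
        ))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

Extracted entities (for example from GLiNER) could be merged into the same `metadata` dict so they are available later as retrieval filters.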
Embeddings
- BGE-M3 / Voyage / Cohere Embed v3
- text-embedding-3-small / text-embedding-3-large (OpenAI)
Vector Database
- Milvus / Zilliz (HNSW / IVF)
- Qdrant / Weaviate
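For context on what these databases compute, the sketch below does exact top-k search by cosine similarity over an in-memory dict. Indexes such as HNSW and IVF approximate this result without scanning every vector. The function names and the `id -> vector` layout are assumptions for illustration, not any database's API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, k=3):
    """Exact nearest neighbours; vectors maps doc id -> list of floats."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this is also a handy ground truth when measuring the recall of an approximate index.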
Hybrid Retrieval
- BM25 (sparse)
- SPLADE++ (learned sparse)
- Dense embeddings
- Late interaction (ColBERT)
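The retrievers above each return their own ranked list, so a hybrid setup needs a fusion step. One common option, not named in this outline and shown here only as an assumption, is reciprocal rank fusion (RRF), which combines rankings by rank position rather than by raw score. The constant `k=60` is a commonly used default; the function name is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


sparse_hits = ["d1", "d2", "d3"]  # e.g. from BM25
dense_hits = ["d3", "d1", "d4"]   # e.g. from a dense embedding search
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
```

Because RRF ignores raw scores, it avoids having to normalise BM25 scores against cosine similarities.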
Reranking
- BGE-reranker-v2
- Cohere rerank
- ColBERTv2
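To show how late-interaction scoring differs from a single-vector comparison, here is a minimal sketch of the MaxSim operator used by ColBERT-style models: each query token vector is matched to its best document token vector, and those maxima are summed. The toy vectors, the function names, and the assumption that every document has at least one token vector are all illustrative; a real model would supply the token embeddings.

```python
def maxsim_score(query_tokens, doc_tokens):
    """Sum over query token vectors of the best dot product with any doc token vector."""
    return sum(
        max(sum(q * d for q, d in zip(q_vec, d_vec)) for d_vec in doc_tokens)
        for q_vec in query_tokens
    )


def rerank(query_tokens, candidates):
    """candidates maps doc id -> list of token vectors; returns doc ids, best first."""
    return sorted(
        candidates,
        key=lambda doc_id: maxsim_score(query_tokens, candidates[doc_id]),
        reverse=True,
    )
```

Cross-encoder rerankers such as the BGE and Cohere models instead score each query and document pair jointly, which is typically slower per candidate.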
LLM Serving
- vLLM / Ollama
- TGI / OpenAI API
Orchestration
- LangChain / LlamaIndex
- Haystack / DSPy
Production Ops
- Eval: RAGAS, DeepEval, Opik
- Logging: LangSmith, Phoenix, W&B
- RBAC: FastAPI + Auth0/Cognito
- Backups: Vector DB snapshots + S3
- Deploy: AWS (ECS/Lambda) / Modal
- Monitoring: Prometheus + Grafana
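For the Eval item above, the listed frameworks provide richer, LLM-assisted metrics. As a framework-independent baseline, retrieval quality can be tracked with recall@k and reciprocal rank. This is a minimal sketch under the assumption that each test query has a labelled set of relevant document ids; it does not use the API of RAGAS, DeepEval, or Opik.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc ids that appear in the top k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant doc id, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0
```

Averaging `reciprocal_rank` over a query set gives mean reciprocal rank (MRR), which could be exported as a gauge to the monitoring stack.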