@surister
Last active August 9, 2024 20:59
Several talks at past Haystack conferences have already covered hybrid search:
- EU 2023: [Mastering Hybrid Search: Blending Classic Ranking Functions with Vector Search for Superior Search Relevance](https://haystackconf.com/eu2023/talk-10/)
- EU 2023: [Reciprocal Rank Fusion (RRF) or How to Stop Worrying about Boosting](https://haystackconf.com/eu2023/talk-2/)
- US 2024: [All Vector Search is Hybrid Search](https://haystackconf.com/us2024/talk-1/)
- US 2024: Better Semantic Search with Hybrid (Sparse-Dense) Search
# Doing hybrid search on your real-time data in pure SQL with CrateDB's index-all strategy.
Points to highlight:
- Hybrid search (convex combination or RRF) in pure SQL.
- Do the whole thing on live data (constantly ingesting new rows).
- Showcase the index-all strategy
Description
-----------
There are many ways to do hybrid search, but what if I told you that you could do it in pure SQL,
with high resilience and fast queries, without giving up flexibility or scalability?
CrateDB is a powerful distributed SQL database built on Apache Lucene. With it, we can implement hybrid search
in pure SQL, using either convex combination or RRF as the reranking method, directly on your live data and
with fast ad-hoc queries thanks to its index-all-by-default strategy.
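For readers unfamiliar with the two reranking methods named above, here is a minimal Python sketch of both. It is only an illustration of the math, not the talk's SQL implementation: the document IDs, the scores, the `k=60` constant, the min-max normalization, and the `alpha` weight are all assumptions chosen for the example.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def min_max(scores):
    """Rescale raw scores to [0, 1] so the two sources become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def convex(bm25_scores, vector_scores, alpha=0.5):
    """Convex combination: alpha * vector + (1 - alpha) * bm25, on normalized scores."""
    bm25_n, vec_n = min_max(bm25_scores), min_max(vector_scores)
    fused = {
        doc_id: alpha * vec_n.get(doc_id, 0.0) + (1 - alpha) * bm25_n.get(doc_id, 0.0)
        for doc_id in set(bm25_n) | set(vec_n)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores from a full-text search and a vector search.
bm25_scores = {"a": 12.0, "b": 7.5, "c": 3.0}
vector_scores = {"b": 0.91, "d": 0.88, "a": 0.40}

# RRF only needs the order of each result list, not the raw scores.
bm25_ranking = sorted(bm25_scores, key=bm25_scores.get, reverse=True)
vector_ranking = sorted(vector_scores, key=vector_scores.get, reverse=True)

rrf_result = rrf([bm25_ranking, vector_ranking])
convex_result = convex(bm25_scores, vector_scores, alpha=0.5)

print(rrf_result)
print(convex_result)
```

RRF works on ranks alone, so it needs no score normalization. The convex combination keeps score magnitudes, but then depends on how the two score scales are normalized and on the choice of `alpha`.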
In this talk we will:
* Have a quick overview of CrateDB and some of its interesting capabilities.
* Look at how we index every column by default.
* Show how to implement hybrid search using only SQL.
* Show a demo project where we apply hybrid search to our real-time data (constantly ingesting new rows).
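To give a rough idea of what "hybrid search using only SQL" can look like, the sketch below runs an RRF query built from CTEs and window functions. It makes several assumptions: SQLite (via Python's `sqlite3`, version 3.25 or newer for window functions) stands in for CrateDB so the example is runnable anywhere, and the `bm25_hits` and `vector_hits` tables, their contents, and the constant 60 are hypothetical. In CrateDB the two candidate lists would come from its own full-text and vector search queries, which are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical pre-computed result lists; stand-ins for the output of a
# full-text query and a vector similarity query.
conn.executescript("""
CREATE TABLE bm25_hits (doc_id TEXT, score REAL);
CREATE TABLE vector_hits (doc_id TEXT, score REAL);
INSERT INTO bm25_hits VALUES ('a', 12.0), ('b', 7.5), ('c', 3.0);
INSERT INTO vector_hits VALUES ('b', 0.91), ('d', 0.88), ('a', 0.40);
""")

RRF_SQL = """
WITH bm25 AS (
    -- Rank the full-text hits, best score first.
    SELECT doc_id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rnk
    FROM bm25_hits
),
vec AS (
    -- Rank the vector hits, best score first.
    SELECT doc_id, ROW_NUMBER() OVER (ORDER BY score DESC) AS rnk
    FROM vector_hits
),
ranked AS (
    SELECT doc_id, rnk FROM bm25
    UNION ALL
    SELECT doc_id, rnk FROM vec
)
-- Reciprocal Rank Fusion: sum 1 / (k + rank) per document, with k = 60.
SELECT doc_id, SUM(1.0 / (60 + rnk)) AS rrf_score
FROM ranked
GROUP BY doc_id
ORDER BY rrf_score DESC
"""

rows = conn.execute(RRF_SQL).fetchall()
for doc_id, rrf_score in rows:
    print(doc_id, round(rrf_score, 5))
```

Using `UNION ALL` plus `GROUP BY` instead of a full outer join keeps documents that appear in only one of the two lists, and extends naturally to more than two rankings.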