Advanced Retrieval
Hybrid Search
Combine vector search with keyword search!
Why Hybrid?
Vector: Semantic similarity
Keyword: Exact matches
Together: Best of both worlds
import numpy as np
import pandas as pd
from typing import List, Dict, Tuple
import json
import os
from pathlib import Path
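The section above does not say how the two result lists should be merged. As a minimal sketch, the example below assumes reciprocal rank fusion (RRF), one common choice, in which each document scores the sum of 1 / (k + rank) over the rankings it appears in. The function names, the toy 2-d embeddings, the word-overlap keyword scorer, and k=60 are all illustrative assumptions, not part of this notebook; a real setup would likely use an embedding model and a proper keyword index such as BM25.

```python
import numpy as np

def cosine_rank(query_vec, doc_vecs):
    """Return document indices ordered by cosine similarity, best first."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return [int(i) for i in np.argsort(-(d @ q), kind="stable")]

def keyword_rank(query, docs):
    """Return document indices ordered by number of shared query terms, best first."""
    terms = set(query.lower().split())
    overlap = np.array([len(terms & set(doc.lower().split())) for doc in docs])
    return [int(i) for i in np.argsort(-overlap, kind="stable")]

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several rankings: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

docs = [
    "error code E42 in the payment service",
    "how to fix billing failures",
    "payment gateway timeout troubleshooting",
]
# Toy 2-d embeddings standing in for a real embedding model (made-up values)
doc_vecs = np.array([[0.2, 0.9], [0.9, 0.1], [0.9, 0.4]])
query = "payment error E42"
query_vec = np.array([1.0, 0.4])

vector_ranking = cosine_rank(query_vec, doc_vecs)   # semantic order
keyword_ranking = keyword_rank(query, docs)         # exact-term order
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
print(fused)
```

In this toy data the vector ranking favors document 2 while the keyword ranking favors document 0 (it contains the exact token E42). The fused list places document 2 first because it ranks well in both lists, which is the "best of both worlds" behavior hybrid search aims for.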
Re-ranking
Improving Retrieval Quality with a Second Pass
Initial vector search is fast but can be imprecise: each document is compressed into a single embedding ahead of time, and relevance is reduced to one dot product between that embedding and the query's. Re-ranking adds a second stage in which a cross-encoder model scores each (query, document) pair jointly, so attention can model the full interaction between query and document tokens. Cross-encoders are generally more accurate than bi-encoders, but they need one forward pass per pair and are therefore too slow to run over the entire corpus. The standard pattern is to first retrieve a broad candidate set (e.g., top 20) with fast vector search, then re-rank that set with the cross-encoder and return the refined top-\(k\). This two-stage approach combines the speed of vector search with the precision of cross-attention.
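# Re-ranking sketch; cross_encoder is any object with a predict(pairs) method,
# e.g. sentence_transformers.CrossEncoder
def rerank(query, initial_results, cross_encoder):
    # No candidates, nothing to score
    if not initial_results:
        return []
    # Score each (query, document) pair with the cross-encoder
    scores = cross_encoder.predict([(query, doc) for doc in initial_results])
    # Sort by the new scores, highest first; returns (document, score) pairs
    reranked = sorted(zip(initial_results, scores), key=lambda x: x[1], reverse=True)
    return reranked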
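To make the two-stage pattern concrete, here is a small end-to-end sketch that runs without downloading a model. `OverlapScorer` is a hypothetical stand-in for a cross-encoder (it only counts shared words), and `two_stage_search`, the toy embeddings, and the parameter names are assumptions made for illustration. In practice the scorer would be a real cross-encoder exposing a similar `predict` method.

```python
import numpy as np

class OverlapScorer:
    """Hypothetical stand-in for a cross-encoder: scores a pair by shared-word count."""
    def predict(self, pairs):
        return np.array([
            float(len(set(q.lower().split()) & set(d.lower().split())))
            for q, d in pairs
        ])

def two_stage_search(query, query_vec, docs, doc_vecs, scorer, n_candidates=20, top_k=3):
    # Stage 1: cheap dot-product search over the whole corpus
    sims = doc_vecs @ query_vec
    candidate_ids = np.argsort(-sims, kind="stable")[:n_candidates]
    candidates = [docs[i] for i in candidate_ids]
    # Stage 2: more expensive pairwise scoring, only on the candidates
    scores = scorer.predict([(query, doc) for doc in candidates])
    order = np.argsort(-scores, kind="stable")[:top_k]
    return [(candidates[i], float(scores[i])) for i in order]

docs = [
    "reset your password from the login page",
    "password rules and complexity requirements",
    "reset the router to factory settings",
    "billing and invoices overview",
]
# Toy 2-d embeddings (made-up values); the router doc is deliberately closest
doc_vecs = np.array([[0.8, 0.2], [0.7, 0.3], [0.9, 0.1], [0.1, 0.9]])
query, query_vec = "how do I reset my password", np.array([1.0, 0.0])

results = two_stage_search(
    query, query_vec, docs, doc_vecs, OverlapScorer(), n_candidates=3, top_k=2
)
for doc, score in results:
    print(f"{score:.1f}  {doc}")
```

Stage 1 ranks the router document first on embedding similarity alone; stage 2 promotes the password-reset document because the pairwise scorer sees both query terms in it. The `n_candidates` setting trades recall against second-stage cost: a document that stage 1 misses can never be recovered by the re-ranker.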