Sparse Encoders: SPLADE and Learned Sparse Representations
What: Load a SPLADE sparse encoder model and generate sparse embeddings where most dimensions are zero, then compute similarity and measure sparsity statistics.
Why: Dense embeddings (like those from Sentence Transformers) represent text as compact vectors in which nearly every dimension is non-zero. Sparse embeddings take the opposite approach: they produce vectors with the same dimensionality as the vocabulary (e.g., 30,522 for BERT's WordPiece vocabulary) but with 99%+ of the values being zero. Each non-zero dimension corresponds to a vocabulary term that the model considers relevant to the input, weighted by learned importance. This combines the semantic understanding of neural models with the interpretability and efficiency of traditional keyword-based methods like BM25.
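To make the sparsity concrete, here is a minimal sketch with made-up numbers (the term ids and weights are hypothetical, not real model output): a vocabulary-sized vector where only a handful of entries are non-zero.

```python
import numpy as np

vocab_size = 30522  # BERT's WordPiece vocabulary size
embedding = np.zeros(vocab_size)

# Hypothetical non-zero entries: term id -> learned importance weight
for term_id, weight in {2000: 1.8, 3500: 0.9, 7777: 2.3, 15000: 0.4}.items():
    embedding[term_id] = weight

active = np.count_nonzero(embedding)
sparsity = 1.0 - active / vocab_size
print(f"Active dims: {active}, sparsity: {sparsity:.4%}")
# Active dims: 4, sparsity: 99.9869% zeros
```

Storing only the (index, weight) pairs instead of the full 30,522-dimensional vector is what makes sparse embeddings cheap to index with inverted-index infrastructure.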
How: SPLADE (SParse Lexical AnD Expansion) applies a log-saturated activation to the MLM logits: \(w_j = \log(1 + \text{ReLU}(z_j))\), where \(z_j\) is the logit for vocabulary term \(j\), pooled over the input tokens. This produces a sparse vector in which only relevant terms have non-zero weights. The model also performs term expansion: it can assign non-zero weight to related terms that do not appear in the original text (e.g., "automobile" for an input containing "car").
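The activation itself is easy to sketch on toy logits (toy numbers; a real model produces one logit per vocabulary term per input token and then pools over tokens):

```python
import numpy as np

logits = np.array([4.0, 0.5, -2.0, -0.1, 7.5])   # toy MLM logits z_j
weights = np.log1p(np.maximum(logits, 0.0))      # w_j = log(1 + ReLU(z_j))

print(weights)
# Negative logits map to exactly zero -- this is where the sparsity comes from.
# The log saturates large logits, so no single term can dominate the vector.
```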
Connection: Sparse encoders are increasingly used alongside dense embeddings in hybrid search systems. The sparse component provides keyword-level precision (exact term matching) while the dense component provides semantic recall (meaning-based matching). Retrieval stacks that pair a sparse model like SPLADE with a dense or late-interaction model like ColBERT, as in Pinecone's hybrid search, combine both for state-of-the-art retrieval.
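One common way to combine the two signals is a convex combination of the sparse and dense scores. A minimal sketch for a single query/document pair follows; the mixing weight `alpha` is a hypothetical tuning parameter, and real systems may instead use reciprocal rank fusion or a learned combination:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dense vectors, unit-normalized so the dot product is cosine similarity
q_dense, d_dense = rng.normal(size=64), rng.normal(size=64)
q_dense /= np.linalg.norm(q_dense)
d_dense /= np.linalg.norm(d_dense)

# Toy sparse term-weight vectors; only term id 101 overlaps
q_sparse = np.zeros(30522); q_sparse[[101, 2050]] = [1.2, 0.7]
d_sparse = np.zeros(30522); d_sparse[[101, 9999]] = [0.9, 1.5]

dense_score = float(q_dense @ d_dense)     # semantic (meaning-based) match
sparse_score = float(q_sparse @ d_sparse)  # exact term overlap: 1.2 * 0.9

alpha = 0.5  # hypothetical mixing weight between the two signals
hybrid_score = alpha * dense_score + (1 - alpha) * sparse_score
print(hybrid_score)
```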
# https://sbert.net/docs/quickstart.html#sparse-encoder
from sentence_transformers import SparseEncoder
# 1. Load a pretrained SparseEncoder model
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")
# The sentences to encode
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium.",
]
# 2. Calculate sparse embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 30522] - sparse representation with vocabulary size dimensions
# 3. Calculate the embedding similarities (using dot product by default)
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 35.629, 9.154, 0.098],
# [ 9.154, 27.478, 0.019],
# [ 0.098, 0.019, 29.553]])
# 4. Check sparsity statistics
stats = SparseEncoder.sparsity(embeddings)
print(f"Sparsity: {stats['sparsity_ratio']:.2%}") # Typically >99% zeros
print(f"Avg non-zero dimensions per embedding: {stats['active_dims']:.2f}")
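Because each dimension indexes a vocabulary term, a sparse embedding can be read directly as a weighted bag of words. The vocabulary and weights below are made up for illustration; with a real model you would map indices through its tokenizer's vocabulary (sentence-transformers also ships decoding helpers for this):

```python
import numpy as np

vocab = ["[PAD]", "weather", "sunny", "lovely", "rain", "stadium"]  # toy vocab
embedding = np.array([0.0, 2.1, 0.0, 1.4, 0.6, 0.0])               # toy weights

# Keep only non-zero dimensions and sort by descending weight
nonzero = np.flatnonzero(embedding)
decoded = sorted(((vocab[i], float(embedding[i])) for i in nonzero),
                 key=lambda t: -t[1])
print(decoded)
# [('weather', 2.1), ('lovely', 1.4), ('rain', 0.6)]
```

This interpretability is a key advantage over dense embeddings, whose individual dimensions have no human-readable meaning.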