Sentence Transformers Quickstart
What: Load the all-MiniLM-L6-v2 model, encode three sentences into 384-dimensional embeddings, and compute a pairwise similarity matrix.
Why: The sentence-transformers library is the most popular open-source framework for generating sentence and text embeddings. It wraps Hugging Face Transformers with a high-level encode() API and provides 100+ pretrained models optimized for different tasks (similarity, retrieval, clustering). The all-MiniLM-L6-v2 model is the go-to general-purpose model: 384 dimensions, 22M parameters, and strong performance across English benchmarks.
How: model.encode(sentences) tokenizes each sentence, runs it through the 6-layer MiniLM transformer, applies mean pooling over token embeddings, and returns a NumPy array of shape (num_sentences, 384). The model.similarity() method computes the full cosine similarity matrix between two sets of embeddings, returning a tensor where entry \((i, j)\) is the similarity between sentence \(i\) and sentence \(j\).
Connection: Sentence Transformers is the backbone of most open-source semantic search, RAG, and clustering implementations. Libraries like LangChain, LlamaIndex, and Haystack all integrate with it for embedding generation.
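The mean-pooling step described above can be sketched in plain NumPy, independent of the model. This is a hypothetical illustration, not the library's internal code: given per-token embeddings and a 0/1 attention mask, it averages only the non-padding token vectors to produce a single sentence vector.

```python
import numpy as np

# Hypothetical sketch of mean pooling over token embeddings.
# token_embeddings: (num_tokens, dim); attention_mask: (num_tokens,) of 0/1.
def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    mask = attention_mask[:, None]                 # (num_tokens, 1), broadcasts over dim
    summed = (token_embeddings * mask).sum(axis=0)  # sum of real (non-padding) tokens
    count = mask.sum()                              # number of real tokens
    return summed / count

# Toy 2-dimensional token embeddings; the last row stands in for a padding token.
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # -> [2. 3.]
```

The mask matters because batched inputs are padded to a common length; without it, padding tokens would drag the average toward zero.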
# https://sbert.net/docs/quickstart.html#sentence-transformer
from sentence_transformers import SentenceTransformer
# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")
# The sentences to encode
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium.",
]
# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
# [0.6660, 1.0000, 0.1411],
# [0.1046, 0.1411, 1.0000]])
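To make the similarity step concrete, here is a minimal NumPy sketch of what a cosine similarity matrix is: normalize each row of the embedding matrix, then take the matrix product with its transpose. The helper name and the toy vectors are this sketch's own, not part of the library; on real embeddings the result should agree with model.similarity() up to floating-point error.

```python
import numpy as np

# Hypothetical helper: pairwise cosine similarity for an (n, d) embedding matrix.
def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)  # (n, 1) row norms
    normalized = embeddings / norms                            # unit-length rows
    return normalized @ normalized.T                           # entry (i, j) = cos(v_i, v_j)

# Toy 2-dimensional vectors so the sketch runs without downloading a model.
toy = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
print(np.round(cosine_similarity_matrix(toy), 4))
```

As with model.similarity(), the diagonal is 1.0 (every vector is maximally similar to itself) and the matrix is symmetric.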