Sentence Transformers Quickstart
What: Load the all-MiniLM-L6-v2 model, encode three sentences into 384-dimensional embeddings, and compute a pairwise similarity matrix.
Why: The sentence-transformers library is the most popular open-source framework for generating sentence and text embeddings. It wraps Hugging Face Transformers with a high-level encode() API and provides 100+ pretrained models optimized for different tasks (similarity, retrieval, clustering). The all-MiniLM-L6-v2 model is the go-to general-purpose model: 384 dimensions, 22M parameters, and strong performance across English benchmarks.
How: model.encode(sentences) tokenizes each sentence, runs it through the 6-layer MiniLM transformer, applies mean pooling over token embeddings, and returns a NumPy array of shape (num_sentences, 384). The model.similarity() method computes the full cosine similarity matrix between two sets of embeddings, returning a tensor where entry \((i, j)\) is the similarity between sentence \(i\) and sentence \(j\).
Connection: Sentence Transformers is the backbone of most open-source semantic search, RAG, and clustering implementations. Libraries like LangChain, LlamaIndex, and Haystack all integrate with it for embedding generation.
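The mean-pooling step described above can be sketched in plain NumPy, independent of the model. This is a hypothetical illustration, not the library's internal code: given per-token embeddings and a 0/1 attention mask, it averages only the non-padding token vectors to produce a single sentence vector.

```python
import numpy as np

# Hypothetical sketch of mean pooling over token embeddings.
# token_embeddings: (num_tokens, dim); attention_mask: (num_tokens,) of 0/1.
def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    mask = attention_mask[:, None]                 # (num_tokens, 1), broadcasts over dim
    summed = (token_embeddings * mask).sum(axis=0)  # sum of real (non-padding) tokens
    count = mask.sum()                              # number of real tokens
    return summed / count

# Toy 2-dimensional token embeddings; the last row stands in for a padding token.
tokens = np.array([[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]])
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))  # -> [2. 3.]
```

The mask matters because batched inputs are padded to a common length; without it, padding tokens would drag the average toward zero.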
# https://sbert.net/docs/quickstart.html#sentence-transformer
from sentence_transformers import SentenceTransformer
# 1. Load a pretrained Sentence Transformer model
model = SentenceTransformer("all-MiniLM-L6-v2")
# The sentences to encode
sentences = [
"The weather is lovely today.",
"It's so sunny outside!",
"He drove to the stadium.",
]
# 2. Calculate embeddings by calling model.encode()
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# 3. Calculate the embedding similarities
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6660, 0.1046],
# [0.6660, 1.0000, 0.1411],
# [0.1046, 0.1411, 1.0000]])
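To make the similarity step concrete, here is a minimal NumPy sketch of what a cosine similarity matrix is: normalize each row of the embedding matrix, then take the matrix product with its transpose. The helper name and the toy vectors are this sketch's own, not part of the library; on real embeddings the result should agree with model.similarity() up to floating-point error.

```python
import numpy as np

# Hypothetical helper: pairwise cosine similarity for an (n, d) embedding matrix.
def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)  # (n, 1) row norms
    normalized = embeddings / norms                            # unit-length rows
    return normalized @ normalized.T                           # entry (i, j) = cos(v_i, v_j)

# Toy 2-dimensional vectors so the sketch runs without downloading a model.
toy = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
print(np.round(cosine_similarity_matrix(toy), 4))
```

As with model.similarity(), the diagonal is 1.0 (every vector is maximally similar to itself) and the matrix is symmetric.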