Milvus: Large-Scale Vector Database
Billion-Scale Similarity Search
Milvus is an open-source vector database engineered for massive scale, capable of indexing and searching billions of vectors with sub-second latency. It uses a distributed architecture with storage and compute separation, supporting multiple index types (IVF_FLAT, IVF_PQ, HNSW, ANNOY) that offer different trade-offs between recall, speed, and memory usage. Milvus is the go-to choice when your dataset outgrows single-node solutions; it powers production search and recommendation systems at companies handling hundreds of millions of embeddings.
Installation
The pymilvus package is the official Python SDK for Milvus. It communicates with a Milvus server over gRPC. For local development, you can run Milvus Lite (in-process) or spin up a standalone instance via Docker. The SDK provides high-level collection management, data insertion, index building, and search APIs.
# !pip install pymilvus
from pymilvus import connections, utility, FieldSchema, CollectionSchema, DataType, Collection
import numpy as np
print('✅ Imports successful')
1. Connect to Milvus
connections.connect() establishes a gRPC connection to a running Milvus instance. The alias parameter lets you manage multiple connections. The default address is localhost:19530 for a standalone Docker deployment. For Milvus Lite (no Docker), you can use MilvusClient("./milvus_local.db") which runs entirely in-process with local file storage.
connections.connect(
alias="default",
host='localhost',
port='19530'
)
print("✅ Connected to Milvus")
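For the no-Docker path mentioned above, the same steps can run against Milvus Lite instead. A minimal connection sketch, purely as configuration (the ./milvus_local.db path is just an example, and this assumes a recent pymilvus with Milvus Lite bundled):

```python
from pymilvus import MilvusClient

# Milvus Lite: runs in-process and persists to a local file, no server required
client = MilvusClient("./milvus_local.db")  # example path, created on first use
```

The rest of the tutorial uses the connections/Collection API against a standalone server; MilvusClient exposes equivalent collection, insert, and search operations through a single client object.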
2. Create Collection
Milvus requires an explicit schema with typed fields. Every collection must have a primary key field (INT64 or VARCHAR), at least one FLOAT_VECTOR field with a specified dimensionality, and optionally scalar fields for metadata. The auto_id=True option lets Milvus generate unique primary keys automatically. This schema-based approach ensures data integrity and enables efficient columnar storage under the hood.
fields = [
FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=500),
FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=384)
]
schema = CollectionSchema(fields=fields, description="Documents")
collection = Collection(name="documents", schema=schema)
print("✅ Collection created")
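One consequence of the schema above worth internalizing: the dim=384 declared on the embedding field is enforced on every insert. A pure-Python sketch of the same length check (check_dim is an illustrative helper, not a pymilvus API):

```python
EMBEDDING_DIM = 384  # must match the FLOAT_VECTOR field's declared dim

def check_dim(vec, dim=EMBEDDING_DIM):
    # Milvus rejects inserts whose vectors differ from the declared dimension
    return len(vec) == dim

print(check_dim([0.0] * 384))  # True: matches the schema
print(check_dim([0.0] * 100))  # False: such an insert would be rejected
```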
3. Insert Data
Data is inserted as a list of column arrays (one array per field in schema order, excluding the auto-generated primary key). Milvus stores vectors in a columnar format optimized for batch operations. After insertion, data first lands in a "growing segment"; calling collection.flush() seals the segment and persists it to storage. Primary keys are returned so you can reference specific records for updates or deletes.
entities = [
["Machine learning", "Deep learning", "NLP"],
[np.random.random(384).tolist() for _ in range(3)]
]
insert_result = collection.insert(entities)
print(f"✅ Inserted {len(insert_result.primary_keys)} entities")
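Because insert() takes one column per non-auto field, row-oriented data has to be transposed into that layout first. A pure-Python sketch with made-up records (the rows variable and its toy 3-dimensional vectors are illustrative only; real embeddings here would be 384-dimensional):

```python
# Hypothetical row-oriented records: (text, embedding) pairs
rows = [
    ("Machine learning", [0.1, 0.2, 0.3]),
    ("Deep learning", [0.4, 0.5, 0.6]),
    ("NLP", [0.7, 0.8, 0.9]),
]

# Transpose rows into the column-per-field layout insert() expects
texts, embeddings = (list(col) for col in zip(*rows))

print(texts)       # VARCHAR column
print(embeddings)  # FLOAT_VECTOR column
```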
4. Create Index and Search
Before searching, Milvus requires you to build an index on the vector field. The index type determines the search algorithm: IVF_FLAT partitions vectors into nlist clusters and searches only the nprobe nearest clusters at query time, trading a small amount of recall for dramatic speed improvements. After building the index, you must call collection.load() to load the index into memory. The search method then returns the top limit nearest neighbors for each query vector, along with distances.
index_params = {
"metric_type": "COSINE",
"index_type": "IVF_FLAT",
"params": {"nlist": 128}
}
collection.create_index(field_name="embedding", index_params=index_params)
collection.load()
search_params = {"metric_type": "COSINE", "params": {"nprobe": 10}}
query_vector = [np.random.random(384).tolist()]
results = collection.search(
data=query_vector,
anns_field="embedding",
param=search_params,
limit=3
)
print("✅ Search complete")
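As a sanity check on the COSINE metric used above: Milvus scores each candidate by the cosine of the angle between the query and the stored vector. A self-contained sketch of the same computation (cosine_similarity is an illustrative helper, not part of pymilvus):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|): 1.0 means identical direction, 0.0 orthogonal
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because cosine similarity ignores vector magnitude, it is a common choice for text embeddings, where direction carries the semantic signal.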