1. Connection Setup

2. Enable pgvector Extension

The pgvector extension adds a native vector data type and similarity-search operators to PostgreSQL. It must be enabled once per database with CREATE EXTENSION IF NOT EXISTS vector. On Aurora PostgreSQL, pgvector is pre-installed but not activated by default; running the command below activates it. This is the same pattern used for other PostgreSQL extensions like PostGIS or hstore.

def get_connection():
    """Create database connection"""
    return psycopg2.connect(**DB_CONFIG)

# Connect and enable pgvector
conn = get_connection()
cur = conn.cursor()

# Enable pgvector extension
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.commit()

# Verify extension
cur.execute("""
    SELECT extname, extversion 
    FROM pg_extension 
    WHERE extname = 'vector';
""")
result = cur.fetchone()

if result:
    print(f"✅ pgvector extension enabled: version {result[1]}")
else:
    print("❌ pgvector not found. Install it on your Aurora instance.")

cur.close()
conn.close()

3. Create Tables for Embeddings

We'll create two example tables:

  1. documents - Store text documents with embeddings

  2. products - Store product data with embeddings

conn = get_connection()
cur = conn.cursor()

# Create documents table
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        content TEXT NOT NULL,
        metadata JSONB,
        embedding vector(384),  -- all-MiniLM-L6-v2 dimension
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
""")

# Create products table
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id SERIAL PRIMARY KEY,
        name VARCHAR(255) NOT NULL,
        description TEXT,
        category VARCHAR(100),
        price DECIMAL(10, 2),
        embedding vector(384),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
""")

# Create indexes for better performance
# HNSW index (recommended for most use cases)
cur.execute("""
    CREATE INDEX IF NOT EXISTS documents_embedding_idx 
    ON documents 
    USING hnsw (embedding vector_cosine_ops);
""")

cur.execute("""
    CREATE INDEX IF NOT EXISTS products_embedding_idx 
    ON products 
    USING hnsw (embedding vector_cosine_ops);
""")

# Create regular indexes for filtering
cur.execute("CREATE INDEX IF NOT EXISTS products_category_idx ON products(category);")

conn.commit()
print("✅ Tables and indexes created successfully")

# Show table info
cur.execute("""
    SELECT table_name, column_name, data_type 
    FROM information_schema.columns 
    WHERE table_name IN ('documents', 'products')
    ORDER BY table_name, ordinal_position;
""")

print("\nTable structure:")
for row in cur.fetchall():
    print(f"  {row[0]}.{row[1]}: {row[2]}")

cur.close()
conn.close()

4. Generate and Store Embeddings

Using Sentence Transformers (Local)

With the tables and indexes in place, we can now generate vector embeddings for our documents and insert them into Aurora. The SentenceTransformer model (all-MiniLM-L6-v2) encodes each text string into a 384-dimensional vector. We use psycopg2's execute_values for efficient batch insertion, casting each Python list to the vector type. Once stored, these embeddings are automatically indexed by the HNSW index we created earlier, making them immediately searchable.

# Initialize embedding model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

print(f"Model loaded: {model.get_sentence_embedding_dimension()} dimensions")

# Sample documents
documents = [
    {
        "content": "Amazon Aurora is a MySQL and PostgreSQL-compatible relational database.",
        "metadata": {"source": "aws_docs", "topic": "database"}
    },
    {
        "content": "Machine learning enables computers to learn from data without explicit programming.",
        "metadata": {"source": "ml_guide", "topic": "ai"}
    },
    {
        "content": "Vector embeddings represent text as dense numerical vectors.",
        "metadata": {"source": "embeddings_guide", "topic": "nlp"}
    },
    {
        "content": "pgvector is a PostgreSQL extension for storing and querying vector embeddings.",
        "metadata": {"source": "pgvector_docs", "topic": "database"}
    },
    {
        "content": "Semantic search finds documents based on meaning rather than keywords.",
        "metadata": {"source": "search_guide", "topic": "search"}
    },
    {
        "content": "AWS Lambda is a serverless compute service for running code.",
        "metadata": {"source": "aws_docs", "topic": "compute"}
    },
    {
        "content": "Natural language processing helps computers understand human language.",
        "metadata": {"source": "nlp_guide", "topic": "ai"}
    },
    {
        "content": "RAG combines retrieval and generation for better AI responses.",
        "metadata": {"source": "rag_guide", "topic": "ai"}
    }
]

# Generate embeddings
contents = [doc["content"] for doc in documents]
embeddings = model.encode(contents)

print(f"\nGenerated {len(embeddings)} embeddings")
print(f"Embedding shape: {embeddings[0].shape}")

# Insert documents with embeddings
conn = get_connection()
cur = conn.cursor()

# Clear existing data (optional)
cur.execute("TRUNCATE TABLE documents RESTART IDENTITY;")

# Prepare data for insertion
data = [
    (doc["content"], json.dumps(doc["metadata"]), embedding.tolist())
    for doc, embedding in zip(documents, embeddings)
]

# Batch insert
execute_values(
    cur,
    "INSERT INTO documents (content, metadata, embedding) VALUES %s",
    data,
    template="(%s, %s::jsonb, %s::vector)"
)

conn.commit()

# Verify insertion
cur.execute("SELECT COUNT(*) FROM documents;")
count = cur.fetchone()[0]
print(f"\n✅ Inserted {count} documents")

# Show sample
cur.execute("""
    SELECT id, content, metadata->>'topic' as topic 
    FROM documents 
    LIMIT 3;
""")

print("\nSample documents:")
for row in cur.fetchall():
    print(f"  ID {row[0]}: {row[1][:60]}... (topic: {row[2]})")

cur.close()
conn.close()

7. Hybrid Search (Vector + Full-Text)

Blending Semantic and Keyword Signals

Pure vector search excels at understanding meaning but can miss exact keyword matches that matter (e.g., product names, error codes). PostgreSQL's built-in tsvector full-text search handles keyword and phrase matching, with relevance scoring via ts_rank (a term-frequency-based ranking; PostgreSQL does not implement BM25 natively). By combining both signals, a weighted sum of cosine similarity and text rank, we get hybrid search that captures both semantic relevance and lexical precision. The vector_weight parameter controls the balance: higher values favor semantic similarity, lower values favor keyword matching.

# Add full-text search column
conn = get_connection()
cur = conn.cursor()

# Add tsvector column for full-text search
cur.execute("""
    ALTER TABLE documents 
    ADD COLUMN IF NOT EXISTS content_tsv tsvector 
    GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;
""")

# Create GIN index for full-text search
cur.execute("""
    CREATE INDEX IF NOT EXISTS documents_content_tsv_idx 
    ON documents USING gin(content_tsv);
""")

conn.commit()
print("✅ Full-text search enabled")

cur.close()
conn.close()

def hybrid_search(query_text, keyword, top_k=3, vector_weight=0.7):
    """
    Combine vector similarity with keyword matching
    """
    query_embedding = model.encode(query_text)
    
    conn = get_connection()
    cur = conn.cursor()
    
    # Hybrid scoring: weighted combination of vector similarity and text match
    cur.execute("""
        SELECT 
            id,
            content,
            metadata,
            (
                %s * (1 - (embedding <=> %s::vector)) + 
                (1 - %s) * ts_rank(content_tsv, plainto_tsquery('english', %s))
            ) as hybrid_score,
            1 - (embedding <=> %s::vector) as vector_similarity,
            ts_rank(content_tsv, plainto_tsquery('english', %s)) as text_score
        FROM documents
        WHERE content_tsv @@ plainto_tsquery('english', %s)
           OR embedding <=> %s::vector < 0.5
        ORDER BY hybrid_score DESC
        LIMIT %s;
    """, (
        vector_weight, query_embedding.tolist(), vector_weight, keyword,
        query_embedding.tolist(), keyword, keyword, 
        query_embedding.tolist(), top_k
    ))
    
    results = cur.fetchall()
    cur.close()
    conn.close()
    
    return results

# Example hybrid search
query = "database for storing vectors"
keyword = "database"

print(f"Semantic query: '{query}'")
print(f"Keyword filter: '{keyword}'\n")

results = hybrid_search(query, keyword, top_k=3)

print("Hybrid search results:")
for i, (doc_id, content, metadata, hybrid, vec_sim, text) in enumerate(results, 1):
    print(f"\n{i}. Hybrid Score: {hybrid:.4f}")
    print(f"   Vector: {vec_sim:.4f} | Text: {text:.4f}")
    print(f"   {content}")
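The weighted blend inside hybrid_search can be sanity-checked without a database. This small sketch mirrors the SQL scoring expression (the similarity and rank values fed in here are made up for illustration):

```python
def hybrid_score(vector_similarity, text_rank, vector_weight=0.7):
    """Mirror of the SQL expression:
    w * (1 - cosine_distance) + (1 - w) * ts_rank(...)."""
    return vector_weight * vector_similarity + (1 - vector_weight) * text_rank

# With the default weight the semantic signal dominates...
semantic_heavy = hybrid_score(0.80, 0.10)                    # ≈ 0.59
# ...while a low weight lets the keyword rank drive the ordering.
keyword_heavy = hybrid_score(0.80, 0.10, vector_weight=0.3)  # ≈ 0.31
```

Because ts_rank scores and cosine similarities live on different scales, it is worth inspecting both raw components (as the example output above does) before committing to a weight.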

8. Product Search Example

Real-World E-Commerce Use Case

Product search is one of the most impactful applications of vector databases. Customers rarely use the exact words from product descriptions; they search for "something for exercise" when they mean "running shoes" or "yoga mat." By embedding product descriptions and searching with cosine similarity, we enable intent-based discovery that goes far beyond keyword matching. Combined with SQL filters on category and price, this gives us a production-ready semantic product search engine built entirely within PostgreSQL.

# Sample products
products = [
    {"name": "Wireless Headphones", "description": "Noise-cancelling Bluetooth headphones with 30-hour battery", "category": "Electronics", "price": 149.99},
    {"name": "Running Shoes", "description": "Lightweight athletic shoes for marathon training", "category": "Sports", "price": 89.99},
    {"name": "Coffee Maker", "description": "Programmable drip coffee maker with thermal carafe", "category": "Kitchen", "price": 79.99},
    {"name": "Yoga Mat", "description": "Non-slip exercise mat for yoga and pilates", "category": "Sports", "price": 29.99},
    {"name": "Smart Watch", "description": "Fitness tracker with heart rate monitor and GPS", "category": "Electronics", "price": 299.99},
    {"name": "Blender", "description": "High-power blender for smoothies and food processing", "category": "Kitchen", "price": 99.99},
    {"name": "Laptop Bag", "description": "Water-resistant backpack with laptop compartment", "category": "Accessories", "price": 49.99},
    {"name": "Dumbbells Set", "description": "Adjustable weight dumbbells for home workouts", "category": "Sports", "price": 159.99},
]

# Generate embeddings for product descriptions
descriptions = [p["description"] for p in products]
product_embeddings = model.encode(descriptions)

# Insert products
conn = get_connection()
cur = conn.cursor()

cur.execute("TRUNCATE TABLE products RESTART IDENTITY;")

data = [
    (p["name"], p["description"], p["category"], p["price"], emb.tolist())
    for p, emb in zip(products, product_embeddings)
]

execute_values(
    cur,
    "INSERT INTO products (name, description, category, price, embedding) VALUES %s",
    data,
    template="(%s, %s, %s, %s, %s::vector)"
)

conn.commit()
print(f"✅ Inserted {len(products)} products\n")

cur.close()
conn.close()

def product_search(query, category=None, max_price=None, top_k=3):
    """
    Search products with filters
    """
    query_embedding = model.encode(query)
    
    conn = get_connection()
    cur = conn.cursor()
    
    # Build WHERE clause
    where_clauses = []
    params = [query_embedding.tolist()]
    
    if category:
        where_clauses.append("category = %s")
        params.append(category)
    
    if max_price:
        where_clauses.append("price <= %s")
        params.append(max_price)
    
    where_sql = " AND ".join(where_clauses) if where_clauses else "1=1"
    params.extend([query_embedding.tolist(), top_k])
    
    cur.execute(f"""
        SELECT 
            name,
            description,
            category,
            price,
            1 - (embedding <=> %s::vector) as similarity
        FROM products
        WHERE {where_sql}
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
    """, params)
    
    results = cur.fetchall()
    cur.close()
    conn.close()
    
    return results

# Example searches
print("Search 1: 'something for exercise'")
results = product_search("something for exercise", top_k=3)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

print("\nSearch 2: 'fitness gear under $100' (Sports category, max $100)")
results = product_search("fitness gear", category="Sports", max_price=100, top_k=3)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

print("\nSearch 3: 'gadgets for tracking health' (Electronics)")
results = product_search("gadgets for tracking health", category="Electronics", top_k=2)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

9. Performance Optimization

Index Types and Performance

As your vector collection grows, index configuration becomes critical to maintaining fast query times. pgvector offers two index types: HNSW (Hierarchical Navigable Small World) for the best query latency and recall, and IVFFlat (Inverted File with Flat quantization) for faster index builds and lower memory usage. The right choice depends on your read/write ratio and dataset size. The cells below inspect your current indexes and discuss tuning strategies.

conn = get_connection()
cur = conn.cursor()

# Check current indexes
cur.execute("""
    SELECT 
        tablename,
        indexname,
        indexdef
    FROM pg_indexes
    WHERE tablename IN ('documents', 'products')
    ORDER BY tablename, indexname;
""")

print("Current indexes:\n")
for table, index, definition in cur.fetchall():
    print(f"{table}.{index}")
    print(f"  {definition}\n")

cur.close()
conn.close()

Index Options for pgvector

1. HNSW (Hierarchical Navigable Small World) - Recommended

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
  • ✅ Fast queries

  • ✅ Good recall

  • ⚠️ Slower builds and inserts

  • Best for: Production workloads

2. IVFFlat (Inverted File with Flat compression)

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
  • ✅ Faster builds and inserts than HNSW

  • ⚠️ Cluster centroids are computed at index creation, so build it after loading data

  • Best for: Frequently updated data

3. No Index

  • ✅ Exact search (100% recall)

  • ⚠️ Slow for large datasets

  • Best for: Small datasets (<10k vectors)
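Both index types also expose build-time and query-time knobs. The values below are pgvector's documented defaults and common starting points, not tuned recommendations; benchmark against your own data before settling on them.

```sql
-- HNSW build parameters: m = connections per node, ef_construction = size of the
-- build-time candidate list (larger = better recall, slower build, more memory)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- HNSW query-time recall/speed trade-off (per session)
SET hnsw.ef_search = 40;

-- IVFFlat: lists ≈ rows / 1000 is a common starting point below ~1M rows
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- IVFFlat query-time: number of lists probed (larger = better recall, slower)
SET ivfflat.probes = 10;
```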

Distance Operators

  • <-> L2 distance (Euclidean)

  • <#> Inner product (negative)

  • <=> Cosine distance (recommended for normalized embeddings)
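To build intuition for what each operator returns, here is a dependency-free Python sketch of the same three distance formulas (plain math, no database needed):

```python
import math

def l2_distance(a, b):
    """What <-> returns: Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_inner_product(a, b):
    """What <#> returns: the inner product, negated so that an
    ascending ORDER BY puts the best match first."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """What <=> returns: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

# Orthogonal vectors: L2 distance sqrt(2), cosine distance exactly 1
a, b = [1.0, 0.0], [0.0, 1.0]
```

This is also why the search code computes similarity as 1 - (embedding <=> query): cosine distance 0 means identical direction, 1 means orthogonal.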

10. Monitoring and Statistics

Understanding Table and Index Size

Regular monitoring helps you spot performance regressions, plan capacity, and decide when to rebuild indexes. The query below uses PostgreSQL system catalog functions to report table sizes (data + indexes) and row counts. For production deployments, you would complement this with Aurora Performance Insights, CloudWatch metrics, and periodic EXPLAIN ANALYZE on your most common search queries to verify that the HNSW index is being used.

conn = get_connection()
cur = conn.cursor()

# Table statistics
cur.execute("""
    SELECT 
        schemaname,
        tablename,
        pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as total_size,
        pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) as table_size,
        pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) as indexes_size
    FROM pg_tables
    WHERE tablename IN ('documents', 'products');
""")

print("Table sizes:\n")
for schema, table, total, table_size, index_size in cur.fetchall():
    print(f"{table}:")
    print(f"  Total: {total}")
    print(f"  Table: {table_size}")
    print(f"  Indexes: {index_size}\n")

# Row counts
cur.execute("SELECT 'documents' AS table_name, COUNT(*) FROM documents UNION ALL SELECT 'products', COUNT(*) FROM products;")
print("Row counts:")
for table, count in cur.fetchall():
    print(f"  {table}: {count:,} rows")

cur.close()
conn.close()
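To verify that the HNSW index is actually used for your common queries, as suggested above, run EXPLAIN ANALYZE on a representative search; the plan should show an index scan rather than a sequential scan. A sketch, with the 384-dimensional query vector abbreviated:

```sql
EXPLAIN ANALYZE
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector  -- substitute a real query embedding
LIMIT 5;
-- Expect a line like: Index Scan using documents_embedding_idx on documents
```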

11. Best Practices

🔒 Security

  1. Never hardcode credentials

    # โŒ Bad
    password = "my-password"
    
    # โœ… Good
    password = os.getenv('AURORA_PASSWORD')
    
  2. Use AWS Secrets Manager (production)

    import boto3
    secrets = boto3.client('secretsmanager')
    creds = secrets.get_secret_value(SecretId='db-creds')
    
  3. Enable SSL connections

    conn = psycopg2.connect(
        **DB_CONFIG,
        sslmode='require'
    )
    

⚡ Performance

  1. Use connection pooling (production)

    from psycopg2 import pool
    connection_pool = pool.SimpleConnectionPool(1, 20, **DB_CONFIG)
    conn = connection_pool.getconn()   # borrow a connection
    # ... run queries with conn ...
    connection_pool.putconn(conn)      # return it to the pool
    
  2. Batch operations

    # ✅ Good - batch insert
    execute_values(cur, sql, data)
    
    # ❌ Bad - individual inserts
    for item in data:
        cur.execute(sql, item)
    
  3. Choose right index type

    • HNSW for read-heavy workloads

    • IVFFlat for write-heavy workloads

  4. Regular maintenance

    VACUUM ANALYZE documents;
    REINDEX INDEX documents_embedding_idx;
    

💰 Cost Optimization

  1. Use Aurora Serverless v2 for variable workloads

  2. Monitor Aurora Performance Insights

  3. Set up read replicas for read-heavy apps

  4. Consider Aurora I/O-Optimized for high I/O

🎯 Data Management

  1. Normalize embeddings before storage

    from sklearn.preprocessing import normalize
    normalized = normalize(embeddings)
    
  2. Store metadata efficiently

    • Use JSONB for flexible metadata

    • Index frequently queried fields

  3. Implement soft deletes for audit trails

    ALTER TABLE documents ADD COLUMN deleted_at TIMESTAMP;
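A minimal sketch of the soft-delete pattern with that column (the id value here is hypothetical):

```sql
-- Mark a row as deleted instead of removing it
UPDATE documents SET deleted_at = CURRENT_TIMESTAMP WHERE id = 42;

-- Exclude soft-deleted rows from every search
SELECT id, content FROM documents WHERE deleted_at IS NULL;
```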
    

12. Integration with AWS Services

Lambda Integration

# AWS Lambda function example
import json
import psycopg2
import boto3

def lambda_handler(event, context):
    # Get credentials from Secrets Manager
    secrets = boto3.client('secretsmanager')
    db_creds = json.loads(
        secrets.get_secret_value(SecretId='aurora-db')['SecretString']
    )
    
    # Connect to Aurora
    conn = psycopg2.connect(**db_creds)
    cur = conn.cursor()
    
    # Your logic here
    query = event.get('query')
    results = []  # ... perform semantic search and collect rows ...
    
    return {
        'statusCode': 200,
        'body': json.dumps(results)
    }

S3 Integration

# Store original documents in S3, embeddings in Aurora
import boto3

s3 = boto3.client('s3')

# Upload document to S3
s3.put_object(
    Bucket='my-documents',
    Key=f'docs/{doc_id}.txt',
    Body=document_content
)

# Store embedding in Aurora with S3 reference
cur.execute("""
    INSERT INTO documents (content, embedding, metadata)
    VALUES (%s, %s::vector, %s::jsonb)
""", (
    document_summary,
    embedding.tolist(),
    json.dumps({'s3_key': f'docs/{doc_id}.txt'})
))

13. Cleanup

Removing Demo Tables

The cell below drops the documents and products tables created during this tutorial. Uncomment and run it when you are finished experimenting. In a production environment, you would use migrations or versioned DDL scripts rather than manual DROP TABLE commands.

# Optional: Clean up demo data
# Uncomment to run

# conn = get_connection()
# cur = conn.cursor()

# cur.execute("DROP TABLE IF EXISTS documents CASCADE;")
# cur.execute("DROP TABLE IF EXISTS products CASCADE;")

# conn.commit()
# print("✅ Cleanup complete")

# cur.close()
# conn.close()

Summary

What You Learned

✅ Setup: Connected to Aurora PostgreSQL and enabled pgvector
✅ Storage: Created tables with vector columns and indexes
✅ Embeddings: Generated and stored embeddings
✅ Search: Implemented semantic, filtered, and hybrid search
✅ Real-world: Built product search with filters
✅ Optimization: Learned indexing strategies and best practices
✅ AWS Integration: Connected with Lambda, S3, and Secrets Manager

Key Advantages of Aurora + pgvector

  1. Unified Data Model - Vectors + relational data in one place

  2. SQL Power - Complex queries, joins, filters, aggregations

  3. AWS Ecosystem - Native integration with all AWS services

  4. Cost Effective - No separate vector DB infrastructure

  5. Enterprise Features - Backups, replication, monitoring built-in

  6. Hybrid Search - Combine semantic + full-text + structured queries
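As an illustration of the "SQL Power" point, a single statement can mix vector similarity with ordinary filters and aggregation over the products table from earlier (the query vector is abbreviated):

```sql
-- Average semantic relevance per category, restricted by price
SELECT category,
       COUNT(*) AS matches,
       ROUND(AVG(1 - (embedding <=> '[0.1, 0.2, ...]'::vector))::numeric, 3)
           AS avg_similarity
FROM products
WHERE price < 200
GROUP BY category
ORDER BY avg_similarity DESC;
```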

When to Use Aurora + pgvector

✅ Perfect for:

  • Existing PostgreSQL/Aurora users

  • AWS-native architectures

  • Hybrid data (vectors + structured)

  • Medium-scale vector search (<1M vectors)

  • Cost-conscious projects

⚠️ Consider alternatives for:

  • Massive scale (>10M vectors) → Pinecone, Qdrant, Milvus

  • Ultra-low latency (<10ms) → FAISS, Qdrant

  • Vector-only workloads → Specialized vector DBs

Next Steps

  1. Scale up - Test with larger datasets

  2. Tune indexes - Optimize HNSW parameters

  3. Build RAG - Integrate with LLM applications

  4. Monitor - Use Aurora Performance Insights

  5. Compare - Try other vector databases (Chroma, Qdrant, Pinecone)

Resources

Happy vector searching with Aurora! 🚀