1. Connection Setup

2. Enable pgvector Extension

The pgvector extension adds a native vector data type and similarity-search operators to PostgreSQL. It must be enabled once per database with CREATE EXTENSION IF NOT EXISTS vector. On Aurora PostgreSQL, pgvector is pre-installed but not activated by default; running the command below activates it. This is the same pattern used for other PostgreSQL extensions like PostGIS or hstore.

def get_connection():
    """Create database connection"""
    return psycopg2.connect(**DB_CONFIG)

# Connect and enable pgvector
conn = get_connection()
cur = conn.cursor()

# Enable pgvector extension
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.commit()

# Verify extension
cur.execute("""
    SELECT extname, extversion 
    FROM pg_extension 
    WHERE extname = 'vector';
""")
result = cur.fetchone()

if result:
    print(f"✅ pgvector extension enabled: version {result[1]}")
else:
    print("❌ pgvector not found. Install it on your Aurora instance.")

cur.close()
conn.close()

3. Create Tables for Embeddings

We'll create two example tables:

  1. documents - Store text documents with embeddings

  2. products - Store product data with embeddings

conn = get_connection()
cur = conn.cursor()

# Create documents table
cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id SERIAL PRIMARY KEY,
        content TEXT NOT NULL,
        metadata JSONB,
        embedding vector(384),  -- all-MiniLM-L6-v2 dimension
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
""")

# Create products table
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id SERIAL PRIMARY KEY,
        name VARCHAR(255) NOT NULL,
        description TEXT,
        category VARCHAR(100),
        price DECIMAL(10, 2),
        embedding vector(384),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
""")

# Create indexes for better performance
# HNSW index (recommended for most use cases)
cur.execute("""
    CREATE INDEX IF NOT EXISTS documents_embedding_idx 
    ON documents 
    USING hnsw (embedding vector_cosine_ops);
""")

cur.execute("""
    CREATE INDEX IF NOT EXISTS products_embedding_idx 
    ON products 
    USING hnsw (embedding vector_cosine_ops);
""")

# Create regular indexes for filtering
cur.execute("CREATE INDEX IF NOT EXISTS products_category_idx ON products(category);")

conn.commit()
print("✅ Tables and indexes created successfully")

# Show table info
cur.execute("""
    SELECT table_name, column_name, data_type 
    FROM information_schema.columns 
    WHERE table_name IN ('documents', 'products')
    ORDER BY table_name, ordinal_position;
""")

print("\nTable structure:")
for row in cur.fetchall():
    print(f"  {row[0]}.{row[1]}: {row[2]}")

cur.close()
conn.close()

4. Generate and Store Embeddings

Using Sentence Transformers (Local)

With the tables and indexes in place, we can now generate vector embeddings for our documents and insert them into Aurora. The SentenceTransformer model (all-MiniLM-L6-v2) encodes each text string into a 384-dimensional vector. We use psycopg2's execute_values for efficient batch insertion, casting each Python list to the vector type. Once stored, these embeddings are automatically indexed by the HNSW index we created earlier, making them immediately searchable.

# Initialize embedding model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

print(f"Model loaded: {model.get_sentence_embedding_dimension()} dimensions")

# Sample documents
documents = [
    {
        "content": "Amazon Aurora is a MySQL and PostgreSQL-compatible relational database.",
        "metadata": {"source": "aws_docs", "topic": "database"}
    },
    {
        "content": "Machine learning enables computers to learn from data without explicit programming.",
        "metadata": {"source": "ml_guide", "topic": "ai"}
    },
    {
        "content": "Vector embeddings represent text as dense numerical vectors.",
        "metadata": {"source": "embeddings_guide", "topic": "nlp"}
    },
    {
        "content": "pgvector is a PostgreSQL extension for storing and querying vector embeddings.",
        "metadata": {"source": "pgvector_docs", "topic": "database"}
    },
    {
        "content": "Semantic search finds documents based on meaning rather than keywords.",
        "metadata": {"source": "search_guide", "topic": "search"}
    },
    {
        "content": "AWS Lambda is a serverless compute service for running code.",
        "metadata": {"source": "aws_docs", "topic": "compute"}
    },
    {
        "content": "Natural language processing helps computers understand human language.",
        "metadata": {"source": "nlp_guide", "topic": "ai"}
    },
    {
        "content": "RAG combines retrieval and generation for better AI responses.",
        "metadata": {"source": "rag_guide", "topic": "ai"}
    }
]

# Generate embeddings
contents = [doc["content"] for doc in documents]
embeddings = model.encode(contents)

print(f"\nGenerated {len(embeddings)} embeddings")
print(f"Embedding shape: {embeddings[0].shape}")

# Insert documents with embeddings
conn = get_connection()
cur = conn.cursor()

# Clear existing data (optional)
cur.execute("TRUNCATE TABLE documents RESTART IDENTITY;")

# Prepare data for insertion
data = [
    (doc["content"], json.dumps(doc["metadata"]), embedding.tolist())
    for doc, embedding in zip(documents, embeddings)
]

# Batch insert
execute_values(
    cur,
    "INSERT INTO documents (content, metadata, embedding) VALUES %s",
    data,
    template="(%s, %s::jsonb, %s::vector)"
)

conn.commit()

# Verify insertion
cur.execute("SELECT COUNT(*) FROM documents;")
count = cur.fetchone()[0]
print(f"\n✅ Inserted {count} documents")

# Show sample
cur.execute("""
    SELECT id, content, metadata->>'topic' as topic 
    FROM documents 
    LIMIT 3;
""")

print("\nSample documents:")
for row in cur.fetchall():
    print(f"  ID {row[0]}: {row[1][:60]}... (topic: {row[2]})")

cur.close()
conn.close()

7. Hybrid Search (Vector + Full-Text)

Blending Semantic and Keyword Signals

Pure vector search excels at understanding meaning but can miss exact keyword matches that matter (e.g., product names, error codes). PostgreSQL's built-in tsvector full-text search handles keyword and phrase matching, with relevance scoring via ts_rank (a term-frequency-based ranking; PostgreSQL does not implement BM25 natively). By combining both signals, a weighted sum of cosine similarity and text rank, we get hybrid search that captures both semantic relevance and lexical precision. The vector_weight parameter controls the balance: higher values favor semantic similarity, lower values favor keyword matching.

# Add full-text search column
conn = get_connection()
cur = conn.cursor()

# Add tsvector column for full-text search
cur.execute("""
    ALTER TABLE documents 
    ADD COLUMN IF NOT EXISTS content_tsv tsvector 
    GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;
""")

# Create GIN index for full-text search
cur.execute("""
    CREATE INDEX IF NOT EXISTS documents_content_tsv_idx 
    ON documents USING gin(content_tsv);
""")

conn.commit()
print("✅ Full-text search enabled")

cur.close()
conn.close()

def hybrid_search(query_text, keyword, top_k=3, vector_weight=0.7):
    """
    Combine vector similarity with keyword matching
    """
    query_embedding = model.encode(query_text)
    
    conn = get_connection()
    cur = conn.cursor()
    
    # Hybrid scoring: weighted combination of vector similarity and text match
    cur.execute("""
        SELECT 
            id,
            content,
            metadata,
            (
                %s * (1 - (embedding <=> %s::vector)) + 
                (1 - %s) * ts_rank(content_tsv, plainto_tsquery('english', %s))
            ) as hybrid_score,
            1 - (embedding <=> %s::vector) as vector_similarity,
            ts_rank(content_tsv, plainto_tsquery('english', %s)) as text_score
        FROM documents
        WHERE content_tsv @@ plainto_tsquery('english', %s)
           OR embedding <=> %s::vector < 0.5
        ORDER BY hybrid_score DESC
        LIMIT %s;
    """, (
        vector_weight, query_embedding.tolist(), vector_weight, keyword,
        query_embedding.tolist(), keyword, keyword, 
        query_embedding.tolist(), top_k
    ))
    
    results = cur.fetchall()
    cur.close()
    conn.close()
    
    return results

# Example hybrid search
query = "database for storing vectors"
keyword = "database"

print(f"Semantic query: '{query}'")
print(f"Keyword filter: '{keyword}'\n")

results = hybrid_search(query, keyword, top_k=3)

print("Hybrid search results:")
for i, (doc_id, content, metadata, hybrid, vec_sim, text) in enumerate(results, 1):
    print(f"\n{i}. Hybrid Score: {hybrid:.4f}")
    print(f"   Vector: {vec_sim:.4f} | Text: {text:.4f}")
    print(f"   {content}")
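The weighted blend inside hybrid_search can be sanity-checked without a database. This small sketch mirrors the SQL scoring expression (the similarity and rank values fed in here are made up for illustration):

```python
def hybrid_score(vector_similarity, text_rank, vector_weight=0.7):
    """Mirror of the SQL expression:
    w * (1 - cosine_distance) + (1 - w) * ts_rank(...)."""
    return vector_weight * vector_similarity + (1 - vector_weight) * text_rank

# With the default weight the semantic signal dominates...
semantic_heavy = hybrid_score(0.80, 0.10)                    # ≈ 0.59
# ...while a low weight lets the keyword rank drive the ordering.
keyword_heavy = hybrid_score(0.80, 0.10, vector_weight=0.3)  # ≈ 0.31
```

Because ts_rank scores and cosine similarities live on different scales, it is worth inspecting both raw components (as the example output above does) before committing to a weight.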

8. Product Search Example

Real-World E-Commerce Use Case

Product search is one of the most impactful applications of vector databases. Customers rarely use the exact words from product descriptions; they search for "something for exercise" when they mean "running shoes" or "yoga mat." By embedding product descriptions and searching with cosine similarity, we enable intent-based discovery that goes far beyond keyword matching. Combined with SQL filters on category and price, this gives us a production-ready semantic product search engine built entirely within PostgreSQL.

# Sample products
products = [
    {"name": "Wireless Headphones", "description": "Noise-cancelling Bluetooth headphones with 30-hour battery", "category": "Electronics", "price": 149.99},
    {"name": "Running Shoes", "description": "Lightweight athletic shoes for marathon training", "category": "Sports", "price": 89.99},
    {"name": "Coffee Maker", "description": "Programmable drip coffee maker with thermal carafe", "category": "Kitchen", "price": 79.99},
    {"name": "Yoga Mat", "description": "Non-slip exercise mat for yoga and pilates", "category": "Sports", "price": 29.99},
    {"name": "Smart Watch", "description": "Fitness tracker with heart rate monitor and GPS", "category": "Electronics", "price": 299.99},
    {"name": "Blender", "description": "High-power blender for smoothies and food processing", "category": "Kitchen", "price": 99.99},
    {"name": "Laptop Bag", "description": "Water-resistant backpack with laptop compartment", "category": "Accessories", "price": 49.99},
    {"name": "Dumbbells Set", "description": "Adjustable weight dumbbells for home workouts", "category": "Sports", "price": 159.99},
]

# Generate embeddings for product descriptions
descriptions = [p["description"] for p in products]
product_embeddings = model.encode(descriptions)

# Insert products
conn = get_connection()
cur = conn.cursor()

cur.execute("TRUNCATE TABLE products RESTART IDENTITY;")

data = [
    (p["name"], p["description"], p["category"], p["price"], emb.tolist())
    for p, emb in zip(products, product_embeddings)
]

execute_values(
    cur,
    "INSERT INTO products (name, description, category, price, embedding) VALUES %s",
    data,
    template="(%s, %s, %s, %s, %s::vector)"
)

conn.commit()
print(f"✅ Inserted {len(products)} products\n")

cur.close()
conn.close()

def product_search(query, category=None, max_price=None, top_k=3):
    """
    Search products with filters
    """
    query_embedding = model.encode(query)
    
    conn = get_connection()
    cur = conn.cursor()
    
    # Build WHERE clause
    where_clauses = []
    params = [query_embedding.tolist()]
    
    if category:
        where_clauses.append("category = %s")
        params.append(category)
    
    if max_price:
        where_clauses.append("price <= %s")
        params.append(max_price)
    
    where_sql = " AND ".join(where_clauses) if where_clauses else "1=1"
    params.extend([query_embedding.tolist(), top_k])
    
    cur.execute(f"""
        SELECT 
            name,
            description,
            category,
            price,
            1 - (embedding <=> %s::vector) as similarity
        FROM products
        WHERE {where_sql}
        ORDER BY embedding <=> %s::vector
        LIMIT %s;
    """, params)
    
    results = cur.fetchall()
    cur.close()
    conn.close()
    
    return results

# Example searches
print("Search 1: 'something for exercise'")
results = product_search("something for exercise", top_k=3)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

print("\nSearch 2: 'fitness gear under $100' (Sports category, max $100)")
results = product_search("fitness gear", category="Sports", max_price=100, top_k=3)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

print("\nSearch 3: 'gadgets for tracking health' (Electronics)")
results = product_search("gadgets for tracking health", category="Electronics", top_k=2)
for name, desc, cat, price, sim in results:
    print(f"  {name} (${price}) - Score: {sim:.3f}")

9. Performance Optimization

Index Types and Performance

As your vector collection grows, index configuration becomes critical to maintaining fast query times. pgvector offers two index types: HNSW (Hierarchical Navigable Small World) for the best query latency and recall, and IVFFlat (Inverted File with Flat quantization) for faster index builds and lower memory usage. The right choice depends on your read/write ratio and dataset size. The cells below inspect your current indexes and discuss tuning strategies.

conn = get_connection()
cur = conn.cursor()

# Check current indexes
cur.execute("""
    SELECT 
        tablename,
        indexname,
        indexdef
    FROM pg_indexes
    WHERE tablename IN ('documents', 'products')
    ORDER BY tablename, indexname;
""")

print("Current indexes:\n")
for table, index, definition in cur.fetchall():
    print(f"{table}.{index}")
    print(f"  {definition}\n")

cur.close()
conn.close()

Index Options for pgvector

1. HNSW (Hierarchical Navigable Small World) - Recommended

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
  • ✅ Fast queries

  • ✅ Good recall

  • ⚠️ Slower builds and inserts

  • Best for: Production workloads

2. IVFFlat (Inverted File with Flat compression)

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
  • ✅ Faster builds and inserts than HNSW

  • ⚠️ Cluster centroids are computed at index creation, so build it after loading data

  • Best for: Frequently updated data

3. No Index

  • ✅ Exact search (100% recall)

  • ⚠️ Slow for large datasets

  • Best for: Small datasets (<10k vectors)
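Both index types also expose build-time and query-time knobs. The values below are pgvector's documented defaults and common starting points, not tuned recommendations; benchmark against your own data before settling on them.

```sql
-- HNSW build parameters: m = connections per node, ef_construction = size of the
-- build-time candidate list (larger = better recall, slower build, more memory)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

-- HNSW query-time recall/speed trade-off (per session)
SET hnsw.ef_search = 40;

-- IVFFlat: lists ≈ rows / 1000 is a common starting point below ~1M rows
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- IVFFlat query-time: number of lists probed (larger = better recall, slower)
SET ivfflat.probes = 10;
```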

Distance Operators

  • <-> L2 distance (Euclidean)

  • <#> Inner product (negative)

  • <=> Cosine distance (recommended for normalized embeddings)
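To build intuition for what each operator returns, here is a dependency-free Python sketch of the same three distance formulas (plain math, no database needed):

```python
import math

def l2_distance(a, b):
    """What <-> returns: Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neg_inner_product(a, b):
    """What <#> returns: the inner product, negated so that an
    ascending ORDER BY puts the best match first."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """What <=> returns: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

# Orthogonal vectors: L2 distance sqrt(2), cosine distance exactly 1
a, b = [1.0, 0.0], [0.0, 1.0]
```

This is also why the search code computes similarity as 1 - (embedding <=> query): cosine distance 0 means identical direction, 1 means orthogonal.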

10. Monitoring and Statistics

Understanding Table and Index Size

Regular monitoring helps you spot performance regressions, plan capacity, and decide when to rebuild indexes. The query below uses PostgreSQL system catalog functions to report table sizes (data + indexes) and row counts. For production deployments, you would complement this with Aurora Performance Insights, CloudWatch metrics, and periodic EXPLAIN ANALYZE on your most common search queries to verify that the HNSW index is being used.

conn = get_connection()
cur = conn.cursor()

# Table statistics
cur.execute("""
    SELECT 
        schemaname,
        tablename,
        pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as total_size,
        pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) as table_size,
        pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) as indexes_size
    FROM pg_tables
    WHERE tablename IN ('documents', 'products');
""")

print("Table sizes:\n")
for schema, table, total, table_size, index_size in cur.fetchall():
    print(f"{table}:")
    print(f"  Total: {total}")
    print(f"  Table: {table_size}")
    print(f"  Indexes: {index_size}\n")

# Row counts
cur.execute("SELECT 'documents' AS table_name, COUNT(*) FROM documents UNION ALL SELECT 'products', COUNT(*) FROM products;")
print("Row counts:")
for table, count in cur.fetchall():
    print(f"  {table}: {count:,} rows")

cur.close()
conn.close()
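To verify that the HNSW index is actually used for your common queries, as suggested above, run EXPLAIN ANALYZE on a representative search; the plan should show an index scan rather than a sequential scan. A sketch, with the 384-dimensional query vector abbreviated:

```sql
EXPLAIN ANALYZE
SELECT id, content
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector  -- substitute a real query embedding
LIMIT 5;
-- Expect a line like: Index Scan using documents_embedding_idx on documents
```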

11. Best Practices

🔒 Security

  1. Never hardcode credentials

    # โŒ Bad
    password = "my-password"
    
    # โœ… Good
    password = os.getenv('AURORA_PASSWORD')
    
  2. Use AWS Secrets Manager (production)

    import boto3
    secrets = boto3.client('secretsmanager')
    creds = secrets.get_secret_value(SecretId='db-creds')
    
  3. Enable SSL connections

    conn = psycopg2.connect(
        **DB_CONFIG,
        sslmode='require'
    )
    

⚡ Performance

  1. Use connection pooling (production)

    from psycopg2 import pool
    connection_pool = pool.SimpleConnectionPool(1, 20, **DB_CONFIG)
    conn = connection_pool.getconn()   # borrow a connection
    # ... run queries with conn ...
    connection_pool.putconn(conn)      # return it to the pool
    
  2. Batch operations

    # ✅ Good - batch insert
    execute_values(cur, sql, data)
    
    # ❌ Bad - individual inserts
    for item in data:
        cur.execute(sql, item)
    
  3. Choose right index type

    • HNSW for read-heavy workloads

    • IVFFlat for write-heavy workloads

  4. Regular maintenance

    VACUUM ANALYZE documents;
    REINDEX INDEX documents_embedding_idx;
    

💰 Cost Optimization

  1. Use Aurora Serverless v2 for variable workloads

  2. Monitor Aurora Performance Insights

  3. Set up read replicas for read-heavy apps

  4. Consider Aurora I/O-Optimized for high I/O

🎯 Data Management

  1. Normalize embeddings before storage

    from sklearn.preprocessing import normalize
    normalized = normalize(embeddings)
    
  2. Store metadata efficiently

    • Use JSONB for flexible metadata

    • Index frequently queried fields

  3. Implement soft deletes for audit trails

    ALTER TABLE documents ADD COLUMN deleted_at TIMESTAMP;
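A minimal sketch of the soft-delete pattern with that column (the id value here is hypothetical):

```sql
-- Mark a row as deleted instead of removing it
UPDATE documents SET deleted_at = CURRENT_TIMESTAMP WHERE id = 42;

-- Exclude soft-deleted rows from every search
SELECT id, content FROM documents WHERE deleted_at IS NULL;
```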
    

12. Integration with AWS Services

Lambda Integration

# AWS Lambda function example
import json
import psycopg2
import boto3

def lambda_handler(event, context):
    # Get credentials from Secrets Manager
    secrets = boto3.client('secretsmanager')
    db_creds = json.loads(
        secrets.get_secret_value(SecretId='aurora-db')['SecretString']
    )
    
    # Connect to Aurora
    conn = psycopg2.connect(**db_creds)
    cur = conn.cursor()
    
    # Your logic here
    query = event.get('query')
    results = []  # ... perform semantic search and collect rows ...
    
    return {
        'statusCode': 200,
        'body': json.dumps(results)
    }

S3 Integration

# Store original documents in S3, embeddings in Aurora
import boto3

s3 = boto3.client('s3')

# Upload document to S3
s3.put_object(
    Bucket='my-documents',
    Key=f'docs/{doc_id}.txt',
    Body=document_content
)

# Store embedding in Aurora with S3 reference
cur.execute("""
    INSERT INTO documents (content, embedding, metadata)
    VALUES (%s, %s::vector, %s::jsonb)
""", (
    document_summary,
    embedding.tolist(),
    json.dumps({'s3_key': f'docs/{doc_id}.txt'})
))

13. Cleanup

Removing Demo Tables

The cell below drops the documents and products tables created during this tutorial. Uncomment and run it when you are finished experimenting. In a production environment, you would use migrations or versioned DDL scripts rather than manual DROP TABLE commands.

# Optional: Clean up demo data
# Uncomment to run

# conn = get_connection()
# cur = conn.cursor()

# cur.execute("DROP TABLE IF EXISTS documents CASCADE;")
# cur.execute("DROP TABLE IF EXISTS products CASCADE;")

# conn.commit()
# print("✅ Cleanup complete")

# cur.close()
# conn.close()

Summary

What You Learned

✅ Setup: Connected to Aurora PostgreSQL and enabled pgvector
✅ Storage: Created tables with vector columns and indexes
✅ Embeddings: Generated and stored embeddings
✅ Search: Implemented semantic, filtered, and hybrid search
✅ Real-world: Built product search with filters
✅ Optimization: Learned indexing strategies and best practices
✅ AWS Integration: Connected with Lambda, S3, and Secrets Manager

Key Advantages of Aurora + pgvector

  1. Unified Data Model - Vectors + relational data in one place

  2. SQL Power - Complex queries, joins, filters, aggregations

  3. AWS Ecosystem - Native integration with all AWS services

  4. Cost Effective - No separate vector DB infrastructure

  5. Enterprise Features - Backups, replication, monitoring built-in

  6. Hybrid Search - Combine semantic + full-text + structured queries
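As an illustration of the "SQL Power" point, a single statement can mix vector similarity with ordinary filters and aggregation over the products table from earlier (the query vector is abbreviated):

```sql
-- Average semantic relevance per category, restricted by price
SELECT category,
       COUNT(*) AS matches,
       ROUND(AVG(1 - (embedding <=> '[0.1, 0.2, ...]'::vector))::numeric, 3)
           AS avg_similarity
FROM products
WHERE price < 200
GROUP BY category
ORDER BY avg_similarity DESC;
```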

When to Use Aurora + pgvector

✅ Perfect for:

  • Existing PostgreSQL/Aurora users

  • AWS-native architectures

  • Hybrid data (vectors + structured)

  • Medium-scale vector search (<1M vectors)

  • Cost-conscious projects

⚠️ Consider alternatives for:

  • Massive scale (>10M vectors) → Pinecone, Qdrant, Milvus

  • Ultra-low latency (<10ms) → FAISS, Qdrant

  • Vector-only workloads → Specialized vector DBs

Next Steps

  1. Scale up - Test with larger datasets

  2. Tune indexes - Optimize HNSW parameters

  3. Build RAG - Integrate with LLM applications

  4. Monitor - Use Aurora Performance Insights

  5. Compare - Try other vector databases (Chroma, Qdrant, Pinecone)

Resources

Happy vector searching with Aurora! 🚀