Isaac.

SQL Query Optimization

Write efficient SQL queries for maximum database performance.

By EMEPublished: February 20, 2025
sqlquery optimizationperformancedatabase tuning

A Simple Analogy

SQL optimization is like route planning. The best route reaches the destination fastest by avoiding unnecessary detours.


Why Optimize?

  • Speed: Faster response times
  • Throughput: Handle more queries
  • Resources: Less CPU and memory
  • Cost: Fewer database resources needed
  • Scalability: Better under load

Execution Plans

-- Analyze query plan
EXPLAIN ANALYZE
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 100
ORDER BY o.created_at DESC;

-- Result shows:
-- - Seq Scan (slow) vs Index Scan (fast)
-- - Actual rows vs estimated
-- - Filter conditions applied where

Index Strategies

-- Single column index (frequently filtered)
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_status ON orders(status);

-- Composite index (multiple conditions)
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);

-- Covering index (includes all needed columns)
CREATE INDEX idx_orders_covering ON orders(customer_id) 
INCLUDE (id, total, status);

-- Partial index (only active records)
CREATE INDEX idx_active_orders ON orders(customer_id) 
WHERE status = 'active';

Query Patterns

-- Bad: Cartesian product
SELECT * FROM orders o, customers c
WHERE o.customer_id = c.id;

-- Good: Explicit join with needed columns
SELECT o.id, o.total, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;

-- Bad: Subquery in WHERE
SELECT * FROM orders
WHERE customer_id IN (
    SELECT id FROM customers WHERE status = 'active'
);

-- Good: JOIN
SELECT DISTINCT o.*
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE c.status = 'active';

Aggregation Optimization

-- Bad: Process all rows then filter
SELECT COUNT(*) FROM orders WHERE status = 'completed';

-- Good: Filter first, then aggregate
SELECT COUNT(*) 
FROM orders 
WHERE status = 'completed'
AND order_date > '2025-01-01';

-- Bad: Multiple aggregates in separate queries
SELECT SUM(total) FROM orders;
SELECT COUNT(*) FROM orders;

-- Good: Single query with multiple aggregates
SELECT 
    SUM(total) as total,
    COUNT(*) as count,
    AVG(total) as average
FROM orders;

Window Functions

-- Calculate running total efficiently
SELECT 
    order_id,
    amount,
    SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders
ORDER BY order_date;

-- Rank with dense rank
SELECT 
    customer_id,
    order_id,
    amount,
    DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) as rank
FROM orders;

-- Pagination with window functions
SELECT 
    order_id,
    total,
    ROW_NUMBER() OVER (ORDER BY created_at DESC) as row_num
FROM orders
WHERE ROW_NUMBER() BETWEEN 1 AND 10;

Connection Pooling

// Configure connection pool
builder.Services.AddNpgsql<AppContext>(options =>
{
    var connString = builder.Configuration.GetConnectionString("DefaultConnection");
    
    // Connection pooling
    var pgOptions = new NpgsqlDataSourceBuilder(connString)
        .DefaultCommandTimeout(TimeSpan.FromSeconds(30))
        .EnlistTransactionScope(true)
        .Build();
    
    options.UseNpgsql(pgOptions);
});

// Connection pool settings
var connectionString = "Host=localhost;Database=mydb;Username=user;Password=pass;" +
    "Pooling=true;MaxPoolSize=100;MinPoolSize=5;Connection Lifetime=300;";

Best Practices

  1. Index wisely: Only index frequently filtered columns
  2. *Avoid SELECT : Load only needed columns
  3. Use EXPLAIN: Understand query plans
  4. Batch operations: Combine multiple queries
  5. Monitor slow queries: Track performance over time

Related Concepts

  • Query hints
  • Statistics
  • Partitioning
  • Replication

Summary

SQL optimization requires understanding execution plans, strategic indexing, and writing efficient queries. Profile slow queries and apply targeted improvements.