SQL Query Optimization
Write efficient SQL queries for maximum database performance.
By EMEPublished: February 20, 2025
sqlquery optimizationperformancedatabase tuning
A Simple Analogy
SQL optimization is like route planning. The best route reaches the destination fastest by avoiding unnecessary detours.
Why Optimize?
- Speed: Faster response times
- Throughput: Handle more queries
- Resources: Less CPU and memory
- Cost: Fewer database resources needed
- Scalability: Better under load
Execution Plans
-- Analyze query plan
EXPLAIN ANALYZE
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.total > 100
ORDER BY o.created_at DESC;
-- Result shows:
-- - Seq Scan (slow) vs Index Scan (fast)
-- - Actual rows vs estimated
-- - Filter conditions applied where
Index Strategies
-- Single column index (frequently filtered)
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_orders_status ON orders(status);
-- Composite index (multiple conditions)
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);
-- Covering index (includes all needed columns)
CREATE INDEX idx_orders_covering ON orders(customer_id)
INCLUDE (id, total, status);
-- Partial index (only active records)
CREATE INDEX idx_active_orders ON orders(customer_id)
WHERE status = 'active';
Query Patterns
-- Bad: Cartesian product
SELECT * FROM orders o, customers c
WHERE o.customer_id = c.id;
-- Good: Explicit join with needed columns
SELECT o.id, o.total, c.name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id;
-- Bad: Subquery in WHERE
SELECT * FROM orders
WHERE customer_id IN (
SELECT id FROM customers WHERE status = 'active'
);
-- Good: JOIN
SELECT DISTINCT o.*
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
WHERE c.status = 'active';
Aggregation Optimization
-- Bad: Process all rows then filter
SELECT COUNT(*) FROM orders WHERE status = 'completed';
-- Good: Filter first, then aggregate
SELECT COUNT(*)
FROM orders
WHERE status = 'completed'
AND order_date > '2025-01-01';
-- Bad: Multiple aggregates in separate queries
SELECT SUM(total) FROM orders;
SELECT COUNT(*) FROM orders;
-- Good: Single query with multiple aggregates
SELECT
SUM(total) as total,
COUNT(*) as count,
AVG(total) as average
FROM orders;
Window Functions
-- Calculate running total efficiently
SELECT
order_id,
amount,
SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders
ORDER BY order_date;
-- Rank with dense rank
SELECT
customer_id,
order_id,
amount,
DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) as rank
FROM orders;
-- Pagination with window functions
SELECT
order_id,
total,
ROW_NUMBER() OVER (ORDER BY created_at DESC) as row_num
FROM orders
WHERE ROW_NUMBER() BETWEEN 1 AND 10;
Connection Pooling
// Configure connection pool
builder.Services.AddNpgsql<AppContext>(options =>
{
var connString = builder.Configuration.GetConnectionString("DefaultConnection");
// Connection pooling
var pgOptions = new NpgsqlDataSourceBuilder(connString)
.DefaultCommandTimeout(TimeSpan.FromSeconds(30))
.EnlistTransactionScope(true)
.Build();
options.UseNpgsql(pgOptions);
});
// Connection pool settings
var connectionString = "Host=localhost;Database=mydb;Username=user;Password=pass;" +
"Pooling=true;MaxPoolSize=100;MinPoolSize=5;Connection Lifetime=300;";
Best Practices
- Index wisely: Only index frequently filtered columns
- *Avoid SELECT : Load only needed columns
- Use EXPLAIN: Understand query plans
- Batch operations: Combine multiple queries
- Monitor slow queries: Track performance over time
Related Concepts
- Query hints
- Statistics
- Partitioning
- Replication
Summary
SQL optimization requires understanding execution plans, strategic indexing, and writing efficient queries. Profile slow queries and apply targeted improvements.