Health Checks Advanced
Implement sophisticated health checks for production monitoring.
By EME
Published: February 20, 2025
Tags: health checks, monitoring, diagnostics, aspnet core, production
A Simple Analogy
Health checks are like doctor visits for your application. Regular checks catch problems early, prevent bigger issues, and keep the system running smoothly.
Why Health Checks?
- Early detection: Catch issues before users see them
- Load balancing: Exclude unhealthy instances
- Orchestration: Kubernetes uses probe results to restart containers or withhold traffic
- Monitoring: Dashboard alerts on failures
- Debugging: Understand system state
Basic Implementation
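// Program.cs
var builder = WebApplication.CreateBuilder(args);

// Register the custom checks (defined under Custom Health Checks)
builder.Services.AddHealthChecks()
    .AddCheck<DatabaseHealthCheck>("database")
    .AddCheck<CacheHealthCheck>("cache");

var app = builder.Build();
app.MapHealthChecks("/health"); // aggregated status of all registered checks
app.Run();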
Custom Health Checks
// DatabaseHealthCheck.cs
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class DatabaseHealthCheck : IHealthCheck
{
    private readonly IDbContextFactory<AppDbContext> _contextFactory;

    // The factory is supplied by dependency injection
    public DatabaseHealthCheck(IDbContextFactory<AppDbContext> contextFactory)
    {
        _contextFactory = contextFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            await using var dbContext = await _contextFactory.CreateDbContextAsync(cancellationToken);

            // Verify that the database is reachable
            var canConnect = await dbContext.Database.CanConnectAsync(cancellationToken);

            if (canConnect)
                return HealthCheckResult.Healthy("Database connection successful");
            else
                return HealthCheckResult.Unhealthy("Cannot connect to database");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy(
                "Database health check failed",
                exception: ex);
        }
    }
}
// CacheHealthCheck.cs
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class CacheHealthCheck : IHealthCheck
{
    private readonly IDistributedCache _cache;

    // The cache is supplied by dependency injection
    public CacheHealthCheck(IDistributedCache cache)
    {
        _cache = cache;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Round-trip a known value: write it, then read it back
            await _cache.SetStringAsync("health-check", "ok", cancellationToken);
            var value = await _cache.GetStringAsync("health-check", cancellationToken);

            return value == "ok"
                ? HealthCheckResult.Healthy("Cache is operational")
                : HealthCheckResult.Unhealthy("Cache check failed");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Cache not available", exception: ex);
        }
    }
}
Detailed Health Response
// Program.cs with detailed output
using System.Text.Json;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var options = new HealthCheckOptions
{
    ResponseWriter = WriteResponse
};
app.MapHealthChecks("/health", options);

// Writes the overall status plus one entry per registered check as JSON
static async Task WriteResponse(HttpContext context, HealthReport report)
{
    context.Response.ContentType = "application/json";

    var response = new
    {
        status = report.Status.ToString(),
        checks = report.Entries.Select(entry => new
        {
            name = entry.Key,
            status = entry.Value.Status.ToString(),
            description = entry.Value.Description,
            duration = entry.Value.Duration.TotalMilliseconds // milliseconds
        })
    };

    var json = JsonSerializer.Serialize(response);
    await context.Response.WriteAsync(json);
}
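To get a feel for the payload this writer produces, the sketch below applies the same projection to a hand-built HealthReport and prints the JSON. The entry names, descriptions, and durations are made-up sample values, and the snippet assumes the Microsoft.Extensions.Diagnostics.HealthChecks package is referenced; it does not start a web server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hand-built report standing in for the one the middleware passes to the writer.
// All values here are illustrative samples, not real measurements.
var entries = new Dictionary<string, HealthReportEntry>
{
    ["database"] = new HealthReportEntry(
        HealthStatus.Healthy, "Database connection successful",
        TimeSpan.FromMilliseconds(12), null, null),
    ["cache"] = new HealthReportEntry(
        HealthStatus.Unhealthy, "Cache not available",
        TimeSpan.FromMilliseconds(3), null, null),
};
var report = new HealthReport(entries, TimeSpan.FromMilliseconds(15));

// Same shape as the WriteResponse projection in this section
var payload = new
{
    status = report.Status.ToString(),
    checks = report.Entries.Select(entry => new
    {
        name = entry.Key,
        status = entry.Value.Status.ToString(),
        description = entry.Value.Description,
        duration = entry.Value.Duration.TotalMilliseconds
    })
};

var json = JsonSerializer.Serialize(
    payload, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

The top-level status is the worst status among the entries, so one failing check is enough to make the whole report Unhealthy.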
Liveness vs Readiness
builder.Services.AddHealthChecks()
    // Liveness: Is app still running?
    .AddCheck("liveness", () => HealthCheckResult.Healthy(), tags: new[] { "liveness" })
    // Readiness: Can app handle requests?
    .AddCheck<DatabaseHealthCheck>("database", tags: new[] { "readiness" })
    .AddCheck<CacheHealthCheck>("cache", tags: new[] { "readiness" });

// Separate endpoints for Kubernetes
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("liveness")
});
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("readiness")
});
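To see why the split matters, the following sketch evaluates both tag predicates against a simulated database outage, without ASP.NET Core hosting. It assumes the Microsoft.Extensions.Diagnostics.HealthChecks, dependency injection, and logging packages are referenced. The failing "database" delegate is a stand-in for illustration, not the DatabaseHealthCheck class from this article.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var services = new ServiceCollection();
services.AddLogging(); // the default HealthCheckService needs an ILogger

services.AddHealthChecks()
    // Liveness: trivially healthy while the process can answer at all
    .AddCheck("liveness", () => HealthCheckResult.Healthy(),
        tags: new[] { "liveness" })
    // Readiness: simulated dependency outage (illustrative stand-in)
    .AddCheck("database", () => HealthCheckResult.Unhealthy("simulated outage"),
        tags: new[] { "readiness" });

using var provider = services.BuildServiceProvider();
var healthChecks = provider.GetRequiredService<HealthCheckService>();

// The same predicates the /health/live and /health/ready endpoints use
var live = await healthChecks.CheckHealthAsync(r => r.Tags.Contains("liveness"));
var ready = await healthChecks.CheckHealthAsync(r => r.Tags.Contains("readiness"));

Console.WriteLine($"live={live.Status}, ready={ready.Status}");
// prints "live=Healthy, ready=Unhealthy"
```

With this split, a database outage fails only the readiness endpoint: the pod is taken out of rotation but not restarted, which is usually what you want when the fault lies in a dependency rather than in the process itself.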
Kubernetes Integration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapi:latest
          # Startup check: allows up to 30 x 5 = 150 s after the initial delay;
          # the other probes do not run until it succeeds
          startupProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 30
          # Liveness check (restart if fails)
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          # Readiness check (remove from service if fails)
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
Best Practices
- Include critical checks: Database, cache, external APIs
- Set appropriate timeouts: Bound each check so a slow dependency cannot stall the probe
- Separate concerns: Use distinct liveness and readiness endpoints
- Exclude transient failures: Retry before reporting unhealthy
- Monitor check performance: Health checks shouldn't slow the app
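The retry advice above can be sketched as a small helper that re-runs a probe a few times before giving up. CheckWithRetryAsync and FlakyProbe are hypothetical names introduced for this illustration, not part of the health checks library, and the attempt count and delay are arbitrary sample values. The snippet assumes the Microsoft.Extensions.Diagnostics.HealthChecks package is referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical helper: runs the probe up to maxAttempts times and returns
// as soon as it reports Healthy; otherwise returns the last failing result.
static async Task<HealthCheckResult> CheckWithRetryAsync(
    Func<CancellationToken, Task<HealthCheckResult>> probe,
    int maxAttempts,
    TimeSpan delayBetweenAttempts,
    CancellationToken cancellationToken)
{
    var result = HealthCheckResult.Unhealthy("Probe was never run");
    for (var attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try
        {
            result = await probe(cancellationToken);
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Treat an exception like any other failed attempt
            result = HealthCheckResult.Unhealthy("Probe threw an exception", ex);
        }

        if (result.Status == HealthStatus.Healthy)
            return result;

        if (attempt < maxAttempts)
            await Task.Delay(delayBetweenAttempts, cancellationToken);
    }
    return result;
}

// Simulated dependency that fails twice before recovering
var calls = 0;
Task<HealthCheckResult> FlakyProbe(CancellationToken _)
{
    calls++;
    return Task.FromResult(calls < 3
        ? HealthCheckResult.Unhealthy("transient failure")
        : HealthCheckResult.Healthy("recovered"));
}

var retried = await CheckWithRetryAsync(
    FlakyProbe, 3, TimeSpan.FromMilliseconds(50), CancellationToken.None);

Console.WriteLine($"{retried.Status} after {calls} attempts");
// prints "Healthy after 3 attempts"
```

A helper like this could be wired in through AddAsyncCheck, which accepts a delegate of the same shape. Keep the total retry time below the check's timeout (the registration methods take an optional timeout parameter) and below the probe timeout configured in Kubernetes, otherwise the retries themselves cause failures.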
Related Concepts
- Metrics and monitoring
- Alerting strategies
- Graceful degradation
- Circuit breakers
Summary
Health checks enable proactive monitoring and automatic remediation. Implement comprehensive checks that give your orchestration platform visibility into application state.