
Health Checks Advanced

Implement sophisticated health checks for production monitoring.

By EME. Published: February 20, 2025
Tags: health checks, monitoring, diagnostics, aspnet core, production

A Simple Analogy

Health checks are like doctor visits for your application. Regular checks catch problems early, prevent bigger issues, and keep the system running smoothly.


Why Health Checks?

  • Early detection: Catch issues before users see them
  • Load balancing: Exclude unhealthy instances
  • Orchestration: Kubernetes uses probe results to restart containers or stop routing traffic to them
  • Monitoring: Dashboards and alerts surface failures
  • Debugging: Understand system state

Basic Implementation

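// Program.cs
var builder = WebApplication.CreateBuilder(args);

// Register the checks; each name appears in the health report
builder.Services.AddHealthChecks()
    .AddCheck<DatabaseHealthCheck>("database")
    .AddCheck<CacheHealthCheck>("cache");

var app = builder.Build();

// Expose the aggregated result at /health
app.MapHealthChecks("/health");

app.Run();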
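
By default the endpoint answers with a plain-text status and maps Healthy and Degraded to HTTP 200 and Unhealthy to HTTP 503, which is what load balancers and probes act on. As a minimal sketch, the mapping can be stated explicitly through `HealthCheckOptions.ResultStatusCodes`. The values below only restate the defaults, and no custom checks are registered here, so treat both as placeholders to adapt.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();

var app = builder.Build();

app.MapHealthChecks("/health", new HealthCheckOptions
{
    // Restates the default mapping; change an entry to alter the HTTP status
    ResultStatusCodes =
    {
        [HealthStatus.Healthy] = StatusCodes.Status200OK,
        [HealthStatus.Degraded] = StatusCodes.Status200OK,
        [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
    }
});

app.Run();
```

Returning 503 for Degraded as well is one option if degraded instances should also be taken out of rotation.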

Custom Health Checks

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class DatabaseHealthCheck : IHealthCheck
{
    private readonly IDbContextFactory<AppDbContext> _contextFactory;

    // The factory is supplied by dependency injection
    public DatabaseHealthCheck(IDbContextFactory<AppDbContext> contextFactory)
    {
        _contextFactory = contextFactory;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Dispose the context asynchronously once the check completes
            await using var dbContext = await _contextFactory.CreateDbContextAsync(cancellationToken);

            // Verify that the database is reachable
            var canConnect = await dbContext.Database.CanConnectAsync(cancellationToken);

            return canConnect
                ? HealthCheckResult.Healthy("Database connection successful")
                : HealthCheckResult.Unhealthy("Cannot connect to database");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy(
                "Database health check failed",
                exception: ex);
        }
    }
}

public class CacheHealthCheck : IHealthCheck
{
    private readonly IDistributedCache _cache;

    // The cache is supplied by dependency injection
    public CacheHealthCheck(IDistributedCache cache)
    {
        _cache = cache;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Round-trip a value; the token is passed positionally because
            // the extension method's parameter is named "token"
            await _cache.SetStringAsync("health-check", "ok", cancellationToken);
            var value = await _cache.GetStringAsync("health-check", cancellationToken);

            return value == "ok"
                ? HealthCheckResult.Healthy("Cache is operational")
                : HealthCheckResult.Unhealthy("Cache check failed");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Cache not available", exception: ex);
        }
    }
}
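
A check does not have to be binary. As a hedged sketch, the class below times a dependency call and reports Degraded when the call succeeds but is slow, attaching the measurement as data. `LatencyHealthCheck`, the `probe` delegate, and the `degradedAfter` threshold are hypothetical names introduced for illustration, not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: times a caller-supplied probe and grades the result.
public class LatencyHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task> _probe;
    private readonly TimeSpan _degradedAfter;

    public LatencyHealthCheck(Func<CancellationToken, Task> probe, TimeSpan degradedAfter)
    {
        _probe = probe;
        _degradedAfter = degradedAfter;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            await _probe(cancellationToken);
        }
        catch (Exception ex)
        {
            // The dependency call itself failed
            return HealthCheckResult.Unhealthy("Probe failed", exception: ex);
        }
        stopwatch.Stop();

        // Extra data travels with the result and can be written to the response
        var data = new Dictionary<string, object>
        {
            ["elapsedMs"] = stopwatch.Elapsed.TotalMilliseconds
        };

        return stopwatch.Elapsed > _degradedAfter
            ? HealthCheckResult.Degraded("Probe succeeded but was slow", data: data)
            : HealthCheckResult.Healthy("Probe succeeded", data);
    }
}
```

An instance can be registered with the `AddCheck(name, instance)` overload, for example wrapping a lightweight call to an external API.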

Detailed Health Response

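// Program.cs with detailed output
// (place these usings at the top of the file)
using System.Text.Json;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Replace the default plain-text body with a JSON report
var options = new HealthCheckOptions
{
    ResponseWriter = WriteResponse
};

app.MapHealthChecks("/health", options);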

static async Task WriteResponse(HttpContext context, HealthReport report)
{
    context.Response.ContentType = "application/json";
    
    var response = new
    {
        status = report.Status.ToString(),
        checks = report.Entries.Select(entry => new
        {
            name = entry.Key,
            status = entry.Value.Status.ToString(),
            description = entry.Value.Description,
            duration = entry.Value.Duration.TotalMilliseconds
        })
    };
    
    var json = JsonSerializer.Serialize(response);
    await context.Response.WriteAsync(json);
}

Liveness vs Readiness

builder.Services.AddHealthChecks()
    // Liveness: Is app still running?
    .AddCheck("liveness", () => HealthCheckResult.Healthy(), tags: new[] { "liveness" })
    
    // Readiness: Can app handle requests?
    .AddCheck<DatabaseHealthCheck>("database", tags: new[] { "readiness" })
    .AddCheck<CacheHealthCheck>("cache", tags: new[] { "readiness" });

// Separate endpoints for Kubernetes
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("liveness")
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = registration => registration.Tags.Contains("readiness")
});
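
A liveness endpoint should not depend on external services, since restarting the app rarely fixes a database outage. One alternative to a dedicated tagged check, sketched here under the assumption that no dependency should influence liveness, is a predicate that excludes every registered check. With no checks to run, the endpoint reports Healthy whenever the process can serve the request. Use it in place of the tag-based `/health/live` mapping, not alongside it.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();

var app = builder.Build();

// Liveness: run no checks at all; a response proves the process is alive
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
});

app.Run();
```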

Kubernetes Integration

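apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  # A Deployment needs a selector that matches the pod labels
  # (the "app: api" label is illustrative)
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapi:latest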
        
        # Startup check (liveness and readiness probes wait until this succeeds)
        startupProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          failureThreshold: 30
        
        # Liveness check (restart if fails)
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        
        # Readiness check (remove from service if fails)
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5

Best Practices

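  1. Include critical checks: Cover the dependencies the app cannot serve requests without, such as the database, cache, and external APIs
  2. Set appropriate timeouts: Give each check a timeout shorter than the probe's own timeout so a hung dependency fails fast instead of stalling the endpoint
  3. Separate concerns: Keep liveness (restart the app) distinct from readiness (stop sending traffic)
  4. Tolerate transient failures: Retry briefly before reporting unhealthy
  5. Monitor check performance: Health checks run often, so keep them cheap enough that they don't slow the app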
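
The retry advice above can be sketched as a small decorator around any `IHealthCheck`. This is one possible approach under stated assumptions, not a framework feature: `RetryingHealthCheck`, `maxAttempts`, and `delay` are hypothetical names, and the retry policy is a plain fixed delay.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical decorator: retries an inner check before reporting a failure.
public class RetryingHealthCheck : IHealthCheck
{
    private readonly IHealthCheck _inner;
    private readonly int _maxAttempts;
    private readonly TimeSpan _delay;

    public RetryingHealthCheck(IHealthCheck inner, int maxAttempts, TimeSpan delay)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        _inner = inner;
        _maxAttempts = maxAttempts;
        _delay = delay;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        HealthCheckResult result = default;

        for (var attempt = 1; attempt <= _maxAttempts; attempt++)
        {
            result = await _inner.CheckHealthAsync(context, cancellationToken);

            // Stop as soon as one attempt succeeds
            if (result.Status == HealthStatus.Healthy)
                return result;

            // Wait between attempts, but not after the last one
            if (attempt < _maxAttempts)
                await Task.Delay(_delay, cancellationToken);
        }

        // Every attempt failed: report the last result
        return result;
    }
}
```

Keep the worst-case total (attempts plus delays) below the probe's `timeoutSeconds`, which is 1 second by default in Kubernetes, or the probe will fail before the retries finish.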

Related Concepts

  • Metrics and monitoring
  • Alerting strategies
  • Graceful degradation
  • Circuit breakers

Summary

Health checks enable proactive monitoring and automatic remediation. Implement comprehensive checks that give your orchestration platform visibility into application state.