Kubernetes Resource Management
Configure CPU and memory limits for Kubernetes workloads.
By EME · Published: February 20, 2025
Tags: kubernetes, resources, limits, requests, scaling
A Simple Analogy
Kubernetes resource limits are like room reservations. You reserve what you need (request) and set a maximum you can use (limit).
Why Resource Management?
- Efficiency: Use resources effectively
- Stability: Prevent node overload
- Scaling: Enable auto-scaling
- Cost: Right-size containers
- QoS: Guarantee quality of service
Requests and Limits
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    resources:
      # What the pod needs to run (used by the scheduler)
      requests:
        memory: "128Mi"   # 128 mebibytes
        cpu: "250m"       # 250 millicores (0.25 CPU)
      # Maximum it can use
      limits:
        memory: "512Mi"   # 512 mebibytes
        cpu: "1000m"      # 1 full CPU
Resource Units
CPU:
1 = 1 core
500m = 0.5 cores
100m = 0.1 cores
Memory:
1Mi = 1 mebibyte (2^20 bytes)
1Gi = 1 gibibyte (2^30 bytes)
1M = 1 megabyte (10^6 bytes); note the Mi vs M distinction
100m = 0.1 bytes (valid syntax, but almost always a typo for 100Mi)
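The same quantity can be written several ways; this hypothetical fragment shows equivalent notations side by side:

```yaml
# Illustrative fragment only (not a complete manifest)
resources:
  requests:
    cpu: "0.5"        # identical to "500m"
    memory: "512Mi"   # 512 * 2^20 bytes; "512M" would be 512 * 10^6 bytes
```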
Quality of Service Classes
# Guaranteed: requests == limits (for every container, both CPU and memory)
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
---
# Burstable: requests < limits (or only some values set)
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "1000m"
---
# BestEffort: no requests or limits (first evicted under memory pressure)
apiVersion: v1
kind: Pod
metadata:
  name: best-effort-pod
spec:
  containers:
  - name: app
    image: myapp:latest
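When containers omit requests and limits, a namespace-level LimitRange can inject defaults so pods don't land in BestEffort unintentionally. A minimal sketch, with hypothetical names and values:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits   # hypothetical name
spec:
  limits:
  - type: Container
    default:             # applied as limits when none are set
      memory: "256Mi"
      cpu: "500m"
    defaultRequest:      # applied as requests when none are set
      memory: "128Mi"
      cpu: "250m"
```

A pod that receives these defaults is classified as Burstable rather than BestEffort, since it ends up with requests below its limits.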
Deployment Resources
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: nginx:latest
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /
            port: 80   # nginx listens on port 80 by default
          initialDelaySeconds: 10
          periodSeconds: 10
Horizontal Pod Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 30
Best Practices
- Set both requests and limits: requests drive scheduling, limits cap usage
- Monitor actual usage: right-size containers from real metrics, not guesses
- Use HPA: auto-scale based on CPU and memory utilization
- Consider QoS: match the class to the workload's criticality
- Test under load: verify resources are adequate before production
Related Concepts
- Vertical Pod Autoscaling
- Resource quotas
- Network policies
- Storage management
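For the resource-quota concept above, a minimal sketch of a namespace-wide cap (hypothetical name and values):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota   # hypothetical name
spec:
  hard:
    requests.cpu: "4"        # total CPU requests allowed in the namespace
    requests.memory: "8Gi"
    limits.cpu: "8"          # total CPU limits allowed in the namespace
    limits.memory: "16Gi"
```

Once a quota sets requests or limits for a resource, new pods in that namespace must declare them, or admission is rejected.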
Summary
Configure requests and limits for predictable resource allocation. Use HPA for automatic scaling based on metrics like CPU and memory utilization.