Kubernetes Resource Management¶
Resource management controls how CPU and memory are allocated, consumed, and bounded across the cluster. Proper configuration prevents noisy neighbors and ensures fair resource distribution.
Requests and Limits¶
Defined per container, not per pod:
- Requests - guaranteed minimum. Scheduler uses this for placement decisions
- Limits - maximum allowed consumption
- CPU in millicores: 1000m = 1 vCPU
- Memory in Mi/Gi
Exceeding Limits¶
| Resource | Behavior | Type |
|---|---|---|
| CPU | Throttled (not killed) | Compressible - can be redistributed |
| Memory | OOMKilled | Incompressible - reclaimed only by killing |
QoS Classes¶
Assigned automatically based on requests/limits configuration. Determines eviction priority (BestEffort first, Guaranteed last):
| Class | Condition | Use Case |
|---|---|---|
| Guaranteed | All containers: limits == requests for both CPU and memory | Databases, critical services |
| Burstable | At least one container: requests < limits | Web services, variable-load apps |
| BestEffort | No requests or limits set | Test environments only |
Namespaces¶
Virtual clusters within a physical cluster. Provide isolation boundary and per-namespace resource control.
kubectl create namespace dev
kubectl apply -f manifest.yaml -n dev
kubectl get pods -n dev
kubectl delete ns dev # deletes namespace + ALL contents
Default namespaces: default, kube-system (CoreDNS, kube-proxy, metrics-server), kube-public, kube-node-lease.
Same manifest deploys to multiple namespaces without changing names - just use -n <namespace>.
LimitRange¶
Per-container policy within a namespace. Assigns defaults and enforces min/max:
apiVersion: v1
kind: LimitRange
metadata:
name: cpu-mem-limit-range
namespace: dev
spec:
limits:
- type: Container
default: # default limits
cpu: "500m"
memory: "512Mi"
defaultRequest: # default requests
cpu: "300m"
memory: "256Mi"
min:
cpu: "100m"
memory: "64Mi"
max:
cpu: "2"
memory: "2Gi"
Pods violating LimitRange are rejected at creation.
ResourceQuota¶
Namespace-wide budget for aggregate resource consumption:
apiVersion: v1
kind: ResourceQuota
metadata:
name: ns-resource-quota
namespace: dev
spec:
hard:
requests.cpu: "4"
requests.memory: "4Gi"
limits.cpu: "8"
limits.memory: "8Gi"
pods: "20"
configmaps: "10"
persistentvolumeclaims: "10"
secrets: "10"
services: "10"
New pods fail with "exceeded quota" when namespace totals are reached.
Key distinction: LimitRange = per-container policy. ResourceQuota = namespace-wide budget. They work together.
Horizontal Pod Autoscaler (HPA)¶
Scales pod replicas based on CPU/memory utilization. Requires Metrics Server.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: hpa-demo
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
minReplicas: 1
maxReplicas: 10
targetCPUUtilizationPercentage: 50
- Scale-up: immediate when target exceeded
- Scale-down: 5-minute cooldown
- Requires resource requests set on containers
Cluster Autoscaler¶
Scales worker nodes based on pending pod demands:
- Provisions new nodes when pods can't be scheduled (1-3 minutes)
- Removes underutilized nodes after cool-down period
- Resource requests/limits drive the decision - large requests force new nodes faster
HPA + Cluster Autoscaler Together¶
HPA increases pods -> insufficient node resources -> cluster autoscaler adds nodes -> pods scheduled. Reverse on scale-down.
Node Selectors¶
Schedule pods to specific node pools:
spec:
nodeSelector:
nodepoolos: linux # Linux user pool
# OR
kubernetes.io/os: windows # Windows pool
# OR
type: virtual-kubelet # Virtual nodes (ACI/Fargate)
Gotchas¶
- Without requests, scheduler has no information for placement - pods land randomly
- Without limits on memory, a single container can OOMKill the entire node
- If CPU
defaultnot specified in LimitRange, system assigns 1 vCPU (1000m) per container - ResourceQuota blocks scaling if quota reached - new pods stay Pending
- Metrics Server must be running for HPA to work (default in managed K8s)
- Never delete
defaultnamespace
See Also¶
- kubernetes architecture - cluster components
- kubernetes workloads - pods, deployments
- kubernetes on aks - AKS-specific resource management
- kubernetes on eks - EKS-specific autoscaling