HPA & VPA Autoscaling
HPA (Horizontal Pod Autoscaler)
Automatically adjusts the number of Pod replicas based on observed metrics.
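As a rough illustration of how a metric reading becomes a replica count, the sketch below applies the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), and then clamps the result to the replica bounds. The function names `desired_replicas` and `clamp_replicas` and the sample readings are illustrative assumptions, not part of any Kubernetes tooling. The real controller also ignores small deviations (a tolerance of 10% by default) and, when several metrics are configured, uses the largest replica count any of them proposes.

```shell
#!/bin/sh
# Sketch of the HPA replica formula (illustrative helper names):
#   desired = ceil(currentReplicas * currentMetric / targetMetric)
# Uses integer ceiling division, so both metric values must share a unit
# (here: CPU utilization in percent).
desired_replicas() {
  # $1 = current replicas, $2 = current metric value, $3 = target metric value
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Clamp a replica count to the [minReplicas, maxReplicas] range.
clamp_replicas() {
  # $1 = desired replicas, $2 = minReplicas, $3 = maxReplicas
  if [ "$1" -lt "$2" ]; then echo "$2"
  elif [ "$1" -gt "$3" ]; then echo "$3"
  else echo "$1"
  fi
}

# Assumed reading: 4 replicas averaging 90% CPU against a 70% target,
# with minReplicas=2 and maxReplicas=50.
d=$(desired_replicas 4 90 70)
clamp_replicas "$d" 2 50   # prints 6
```

Under the same assumptions, halving the load (`desired_replicas 10 35 70`) yields 5, which shows that the same formula drives scale-down as well as scale-up.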
yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 50
  metrics:
    # CPU utilization
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Memory usage
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: 400Mi
    # Custom metric (requires a custom metrics API server)
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
    # External metric (e.g. message queue length)
    - type: External
      external:
        metric:
          name: queue_messages_ready
          selector:
            matchLabels:
              queue: my-queue
        target:
          type: AverageValue
          averageValue: "100"
  # Scaling behavior
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60   # scale-up stabilization window
      policies:
        - type: Pods
          value: 4                     # add at most 4 Pods per period
          periodSeconds: 60
        - type: Percent
          value: 100                   # or grow by at most 100% per period
          periodSeconds: 60
      selectPolicy: Max                # use the policy that allows the larger change
    scaleDown:
      stabilizationWindowSeconds: 300  # scale-down stabilization window (prevents flapping)
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
VPA (Vertical Pod Autoscaler)
Automatically adjusts the CPU and memory requests and limits of a Pod's containers.
yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"  # Off | Initial | Recreate | Auto
  resourcePolicy:
    containerPolicies:
      - containerName: app
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "4"
          memory: "8Gi"
        controlledResources: ["cpu", "memory"]
VPA update modes:
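Off: produces recommendations only; nothing is changed automatically.
Initial: applies the recommended values only when a Pod is created.
Recreate: evicts and recreates Pods when needed to apply new values.
Auto: VPA chooses the update strategy itself (historically equivalent to Recreate).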
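The `minAllowed` and `maxAllowed` fields in the manifest above bound whatever the recommender proposes. The sketch below models only that bounding step, under the assumption that values are CPU in millicores (so `100` and `4000` stand for the manifest's "100m" and "4"). The function name `bound_recommendation` and the raw recommendation values are made up for illustration; this is not how the VPA recommender computes its targets.

```shell
#!/bin/sh
# Sketch: bounding a raw VPA recommendation by minAllowed / maxAllowed.
# All values are CPU in millicores; the helper name is hypothetical.
bound_recommendation() {
  # $1 = raw recommendation, $2 = minAllowed, $3 = maxAllowed
  if [ "$1" -lt "$2" ]; then echo "$2"
  elif [ "$1" -gt "$3" ]; then echo "$3"
  else echo "$1"
  fi
}

bound_recommendation 50 100 4000     # below minAllowed, prints 100
bound_recommendation 750 100 4000    # within bounds, prints 750
bound_recommendation 6000 100 4000   # above maxAllowed, prints 4000
```

When a workload appears stuck at one of the bounds, comparing the capped value with the uncapped recommendation in the `kubectl describe vpa` output is a reasonable first check.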
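KEDA (event-driven autoscaling)
KEDA extends the HPA so that workloads can scale on external event sources such as message queues, databases, and HTTP requests.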
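For triggers configured with a per-replica threshold, such as the Kafka `lagThreshold` in this section's manifest, the replica count can be approximated as the backlog divided by the threshold, rounded up and bounded by `minReplicaCount` and `maxReplicaCount`. The sketch below is a simplified model under that assumption. The function name `replicas_for_backlog` and the backlog figures are hypothetical, and real behavior also depends on polling and cooldown settings, activation logic, and scaler-specific caps.

```shell
#!/bin/sh
# Sketch: replica count implied by a per-replica threshold such as lagThreshold.
#   desired = ceil(backlog / threshold), bounded by the min/max replica counts.
# Simplified model with a hypothetical helper name, not KEDA's actual code.
replicas_for_backlog() {
  # $1 = total backlog (e.g. consumer lag), $2 = per-replica threshold,
  # $3 = minReplicaCount, $4 = maxReplicaCount
  n=$(( ($1 + $2 - 1) / $2 ))
  if [ "$n" -lt "$3" ]; then n=$3; fi
  if [ "$n" -gt "$4" ]; then n=$4; fi
  echo "$n"
}

replicas_for_backlog 0 100 0 100       # empty queue, prints 0
replicas_for_backlog 950 100 0 100     # 950 messages of lag, prints 10
replicas_for_backlog 50000 100 0 100   # capped at maxReplicaCount, prints 100
```

The first call shows why `minReplicaCount: 0` matters: with no backlog the model returns zero replicas, which a plain HPA with `minReplicas` of at least 1 would not do.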
yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0    # can scale down to zero
  maxReplicaCount: 100
  triggers:
    # Kafka consumer lag
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092
        consumerGroup: my-group
        topic: my-topic
        lagThreshold: "100"   # target lag per replica: 100 messages
    # Redis list length
    - type: redis
      metadata:
        address: redis:6379
        listName: my-queue
        listLength: "10"
    # Prometheus metric
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        metricName: http_requests_total
        query: sum(rate(http_requests_total[2m]))
        threshold: "100"
Common operations
bash
# Check HPA status
kubectl get hpa
kubectl describe hpa my-app-hpa
# Show VPA recommendations
kubectl describe vpa my-app-vpa
# Generate load manually and watch the HPA react
kubectl run load-test --image=busybox --rm -it -- \
  sh -c "while true; do wget -q -O- http://my-app; done"