HPA & VPA Autoscaling

HPA (Horizontal Pod Autoscaler)

Automatically adjusts the number of Pod replicas based on observed metrics.
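
As a rough illustration of how a replica count is derived from a metric, the sketch below follows the formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). It skips the change when the ratio is within the default 10% tolerance and clamps the result to minReplicas/maxReplicas. The function name and sample numbers are illustrative assumptions, and the real controller additionally accounts for unready Pods and missing metrics.

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=2, max_replicas=50, tolerance=0.1):
    """Simplified HPA calculation for a single metric (illustrative only)."""
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minReplicas/maxReplicas bounds.
    return max(min_replicas, min(max_replicas, desired))

# 4 replicas averaging 90% CPU against a 70% target
print(desired_replicas(4, 90, 70))   # 6
```

When several metrics are configured, the HPA evaluates each one this way and uses the largest resulting replica count.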

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 50

  metrics:
  # CPU utilization (percentage of each Pod's CPU request)
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

  # Memory usage (average per Pod)
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 400Mi

  # Custom metric (requires custom-metrics-apiserver)
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"

  # External metric (e.g. message queue length)
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: my-queue
      target:
        type: AverageValue
        averageValue: "100"

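  # Scaling behavior
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60    # scale-up stabilization window
      policies:
      - type: Pods
        value: 4                         # add at most 4 Pods per period
        periodSeconds: 60
      - type: Percent
        value: 100                       # or grow by at most 100% per period
        periodSeconds: 60
      selectPolicy: Max                  # apply whichever policy allows the larger change
    scaleDown:
      stabilizationWindowSeconds: 300   # scale-down stabilization window (prevents flapping)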
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
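
To make the behavior section more concrete, here is a minimal sketch of how the two scaleUp policies above combine under selectPolicy: Max, and how the single scaleDown policy limits removals. It is a simplification: it assumes the replica count at the start of the 60-second period equals the current count and ignores the stabilization windows. The function names are hypothetical.

```python
import math

def scale_up_limit(current, pods=4, percent=100, max_replicas=50):
    """Upper bound on replicas after one period with selectPolicy: Max."""
    by_pods = current + pods
    by_percent = math.ceil(current * (1 + percent / 100))
    # Max picks the policy that permits the larger change.
    return min(max_replicas, max(by_pods, by_percent))

def scale_down_limit(current, pods=2, min_replicas=2):
    """Lower bound on replicas after one period with a single Pods policy."""
    return max(min_replicas, current - pods)

for n in (2, 4, 10):
    print(n, "->", scale_up_limit(n))   # 2 -> 6, 4 -> 8, 10 -> 20
```

Under these assumptions, small Deployments grow by the fixed 4-Pod step, while larger ones can double each period until they reach maxReplicas.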

VPA (Vertical Pod Autoscaler)

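Automatically adjusts the CPU and memory requests of a Pod's containers. By default, limits are scaled along with them so that the original request-to-limit ratio is preserved.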

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"  # Off | Initial | Recreate | Auto
  resourcePolicy:
    containerPolicies:
    - containerName: app
      minAllowed:
        cpu: "100m"
        memory: "128Mi"
      maxAllowed:
        cpu: "4"
        memory: "8Gi"
      controlledResources: ["cpu", "memory"]
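
The minAllowed and maxAllowed bounds in the resourcePolicy above cap whatever the recommender proposes. A minimal sketch of that capping, using plain integers (millicores and MiB) instead of Kubernetes quantity strings; the helper name and the sample recommendations are assumptions for illustration:

```python
def clamp(recommended, min_allowed, max_allowed):
    """Bound a raw recommendation by minAllowed/maxAllowed."""
    return max(min_allowed, min(max_allowed, recommended))

# Bounds from the manifest above, in millicores and MiB.
CPU_MIN, CPU_MAX = 100, 4000          # 100m .. 4 cores
MEM_MIN, MEM_MAX = 128, 8 * 1024      # 128Mi .. 8Gi

print(clamp(40, CPU_MIN, CPU_MAX))          # 100  (raised to minAllowed)
print(clamp(12 * 1024, MEM_MIN, MEM_MAX))   # 8192 (capped at maxAllowed)
```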

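VPA update modes:

  • Off: only produces recommendations; nothing is changed automatically
  • Initial: applies the recommendation only when a Pod is created
  • Recreate: applies the recommendation at creation and evicts running Pods when their requests need to change
  • Auto: applies the recommendation at creation and updates running Pods; currently this is done by evicting and recreating them, the same as Recreate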

KEDA (Event-Driven Autoscaling)

KEDA builds on HPA and supports scaling on external event sources (message queues, databases, HTTP requests, and so on):

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 0    # can scale down to zero
  maxReplicaCount: 100
  triggers:
  # Kafka consumer lag
  - type: kafka
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: my-group
      topic: my-topic
      lagThreshold: "100"  # target lag of 100 messages per replica

  # Redis list length
  - type: redis
    metadata:
      address: redis:6379
      listName: my-queue
      listLength: "10"

  # Prometheus metric
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: http_requests_total  # deprecated in recent KEDA releases; can be omitted
      query: sum(rate(http_requests_total[2m]))
      threshold: "100"
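
As a rough guide to what lagThreshold means in practice: KEDA feeds the total consumer lag to the HPA as an external metric with an average-value target, so the replica count works out to roughly ceil(total lag / lagThreshold). The sketch below illustrates this under simplifying assumptions. The function name and the optional partitions argument are hypothetical, and activation thresholds are ignored.

```python
import math

def kafka_desired_replicas(total_lag, lag_threshold=100,
                           min_replicas=0, max_replicas=100, partitions=None):
    """Approximate replica count for a lag-based trigger (illustrative only)."""
    if total_lag <= 0:
        return min_replicas            # nothing to consume: fall back to the minimum
    desired = math.ceil(total_lag / lag_threshold)
    if partitions is not None:
        # By default the Kafka scaler does not scale past the partition count,
        # because extra consumers in the group would sit idle.
        desired = min(desired, partitions)
    return max(min_replicas, min(max_replicas, desired))

print(kafka_desired_replicas(950))                 # 10
print(kafka_desired_replicas(950, partitions=6))   # 6
```

When several triggers are defined, as in the ScaledObject above, the underlying HPA uses the largest replica count that any of them asks for.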

Common operations

bash
# Check HPA status
kubectl get hpa
kubectl describe hpa my-app-hpa

# View VPA recommendations
kubectl describe vpa my-app-vpa

# Generate load manually and watch the HPA react
kubectl run load-test --image=busybox --rm -it -- \
  sh -c "while true; do wget -q -O- http://my-app; done"

Content on this site was compiled and written by 褚成志 and is for learning reference only.