Karmada 多集群调度
什么是 Karmada
Karmada(Kubernetes Armada)是华为开源的多集群管理系统,提供跨集群的应用部署、调度和治理能力。
架构
Karmada 控制平面
├── Karmada API Server(兼容 K8s API)
├── Karmada Controller Manager
├── Karmada Scheduler(多集群调度)
└── etcd
成员集群
├── Cluster 1(北京)
├── Cluster 2(上海)
└── Cluster 3(广州)安装
bash
# 使用 karmadactl 安装
curl -s https://raw.githubusercontent.com/karmada-io/karmada/master/hack/install-cli.sh | sudo bash
karmadactl init \
--kube-image-mirror-country=cn \
--etcd-storage-mode=hostPath
# 注册成员集群
karmadactl join cluster1 \
--kubeconfig=/root/.kube/karmada.config \
--cluster-kubeconfig=/root/.kube/cluster1.config
# 查看集群
kubectl get clusters --kubeconfig=/root/.kube/karmada.config传播策略
yaml
# 将 Deployment 部署到多个集群
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
name: my-app-policy
spec:
resourceSelectors:
- apiVersion: apps/v1
kind: Deployment
name: my-app
- apiVersion: v1
kind: Service
name: my-app
placement:
# 集群亲和性
clusterAffinity:
clusterNames:
- cluster1
- cluster2
- cluster3
# 副本调度
replicaScheduling:
replicaSchedulingType: Divided
replicaDivisionPreference: Weighted
weightPreference:
staticClusterWeight:
- targetCluster:
clusterNames: [cluster1]
weight: 3
- targetCluster:
clusterNames: [cluster2]
weight: 2
- targetCluster:
clusterNames: [cluster3]
weight: 1覆盖策略
yaml
# 不同集群使用不同配置
apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
name: my-app-override
spec:
resourceSelectors:
- apiVersion: apps/v1
kind: Deployment
name: my-app
overrideRules:
# 北京集群使用北京镜像仓库
- targetCluster:
clusterNames: [cluster1]
overriders:
imageOverrider:
- component: Registry
operator: replace
value: registry.beijing.example.com
# 上海集群使用上海镜像仓库
- targetCluster:
clusterNames: [cluster2]
overriders:
imageOverrider:
- component: Registry
operator: replace
value: registry.shanghai.example.com故障转移
yaml
# 集群故障时自动迁移
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
name: failover-policy
spec:
resourceSelectors:
- apiVersion: apps/v1
kind: Deployment
name: my-app
placement:
clusterAffinity:
clusterNames: [cluster1, cluster2]
replicaScheduling:
replicaSchedulingType: Duplicated # 每个集群都有完整副本
failover:
application:
decisionConditions:
tolerationSeconds: 60 # 集群不可用 60s 后触发迁移
purgeMode: Graciously
gracePeriodSeconds: 600