Highly Available Kubernetes Clusters

Today’s post shows how to set-up a reliable, highly available distributed Kubernetes cluster. The support for running such clusters on Google Compute Engine (GCE) was added as an alpha feature in Kubernetes 1.5 release.

Motivation

We will create a Highly Available Kubernetes cluster, with master replicas and worker nodes distributed among three zones of a region. Such setup will ensure that the cluster will continue operating during a zone failure.

Setting Up HA cluster

The following instructions apply to GCE. First, we will setup a cluster that will span over one zone (europe-west1-b), will contain one master and three worker nodes and will be HA-compatible (will allow adding more master replicas and more worker nodes in multiple zones in future). To implement this, we’ll export the following environment variables:

$ export KUBERNETES_PROVIDER=gce

$ export NUM_NODES=3

$ export MULTIZONE=true

$ export ENABLE_ETCD_QUORUM_READ=true

and run kube-up script (note that the entire cluster will be initially placed in zone europe-west1-b):

$ KUBE_GCE_ZONE=europe-west1-b ./cluster/kube-up.sh

Now, we will add two additional pools of worker nodes, each of three nodes, in zones europe-west1-c and europe-west1-d (more details on adding pools of worker nodes can be find here):

$ KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=europe-west1-c ./cluster/kube-up.sh

$ KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=europe-west1-d ./cluster/kube-up.sh

To complete setup of HA cluster, we will add two master replicase, one in zone europe-west1-c, the other in europe-west1-d:

$ KUBE_GCE_ZONE=europe-west1-c KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh

$ KUBE_GCE_ZONE=europe-west1-d KUBE_REPLICATE_EXISTING_MASTER=true ./cluster/kube-up.sh

Note that adding the first replica will take longer (~15 minutes), as we need to reassign the IP of the master to the load balancer in front of replicas and wait for it to propagate (see design doc for more details).

Verifying in HA cluster works as intended

We may now list all nodes present in the cluster:

$ kubectl get nodes

NAME STATUS AGE

kubernetes-master Ready,SchedulingDisabled 48m

kubernetes-master-2d4 Ready,SchedulingDisabled 5m

kubernetes-master-85f Ready,SchedulingDisabled 32s

kubernetes-minion-group-6s52 Ready 39m

kubernetes-minion-group-cw8e Ready 48m

kubernetes-minion-group-fw91 Ready 48m

kubernetes-minion-group-h2kn Ready 31m

kubernetes-minion-group-ietm Ready 39m

kubernetes-minion-group-j6lf Ready 31m

kubernetes-minion-group-soj7 Ready 31m

kubernetes-minion-group-tj82 Ready 39m

kubernetes-minion-group-vd96 Ready 48m

As we can see, we have 3 master replicas (with disabled scheduling) and 9 worker nodes.

We will deploy a sample application (nginx server) to verify that our cluster is working correctly:

$ kubectl run nginx --image=nginx --expose --port=80

After waiting for a while, we can verify that both the deployment and the service were correctly created and are running:

$ kubectl get pods

NAME READY STATUS RESTARTS AGE

...

nginx-3449338310-m7fjm 1/1 Running 0 4s

...

$ kubectl run -i --tty test-a --image=busybox /bin/sh

If you don't see a command prompt, try pressing enter.

# wget -q -O- http://nginx.default.svc.cluster.local

...

<title>Welcome to nginx!</title>

...

Now, let’s simulate failure of one of master’s replicas by executing halt command on it (kubernetes-master-137, zone europe-west1-c):

$ gcloud compute ssh kubernetes-master-2d4 --zone=europe-west1-c

...

$ sudo halt

After a while the master replica will be marked as NotReady:

$ kubectl get nodes

NAME STATUS AGE

kubernetes-master Ready,SchedulingDisabled 51m

kubernetes-master-2d4 NotReady,SchedulingDisabled 8m

kubernetes-master-85f Ready,SchedulingDisabled 4m

...

However, the cluster is still operational. We may verify it by checking if our nginx server works correctly:

$ kubectl run -i --tty test-b --image=busybox /bin/sh

If you don't see a command prompt, try pressing enter.

# wget -q -O- http://nginx.default.svc.cluster.local

...

<title>Welcome to nginx!</title>

...

We may also run another nginx server:

$ kubectl run nginx-next --image=nginx --expose --port=80

The new server should be also working correctly:

$ kubectl run -i --tty test-c --image=busybox /bin/sh

If you don't see a command prompt, try pressing enter.

# wget -q -O- http://nginx-next.default.svc.cluster.local

...

<title>Welcome to nginx!</title>

...

Let’s now reset the broken replica:

$ gcloud compute instances start kubernetes-master-2d4 --zone=europe-west1-c

After a while, the replica should be re-attached to the cluster:

$ kubectl get nodes

NAME STATUS AGE

kubernetes-master Ready,SchedulingDisabled 57m

kubernetes-master-2d4 Ready,SchedulingDisabled 13m

kubernetes-master-85f Ready,SchedulingDisabled 9m

...

Shutting down HA cluster

To shutdown the cluster, we will first shut down master replicas in zones D and E:

$ KUBE_DELETE_NODES=false KUBE_GCE_ZONE=europe-west1-c ./cluster/kube-down.sh

$ KUBE_DELETE_NODES=false KUBE_GCE_ZONE=europe-west1-d ./cluster/kube-down.sh

Note that the second removal of replica will take longer (~15 minutes), as we need to reassign the IP of the load balancer in front of replicas to the remaining master and wait for it to propagate (see design doc for more details).

Then, we will remove the additional worker nodes from zones europe-west1-c and europe-west1-d:

$ KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=europe-west1-c ./cluster/kube-down.sh

$ KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=europe-west1-d ./cluster/kube-down.sh

And finally, we will shutdown the remaining master with the last group of nodes (zone europe-west1-b):

$ KUBE_GCE_ZONE=europe-west1-b ./cluster/kube-down.sh

Conclusions

We have shown how, by adding worker node pools and master replicas, a Highly Available Kubernetes cluster can be created. As of Kubernetes version 1.5.2, it is supported in kube-up/kube-down scripts for GCE (as alpha). Additionally, there is a support for HA cluster on AWS in kops scripts (see this article for more details).

Download Kubernetes
Get involved with the Kubernetes project on GitHub
Post questions (or answer questions) on Stack Overflow
Connect with the community on Slack
Follow us on Twitter @Kubernetesio for latest updates

--Jerzy Szczepkowski, Software Engineer, Google

Highly Available Kubernetes Clusters

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112