Install, upgrade, back up and restore a Kubernetes Cluster using kubeadm

Vipul Munot · Oct 3, 2019

Prerequisites

  • Three Ubuntu servers running Ubuntu 16.04 Xenial LTS

Steps

Run the following commands on all nodes

Get the Docker gpg key

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

Add the Docker repository

sudo add-apt-repository    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"

Get the Kubernetes gpg key

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

Add the Kubernetes repository

cat << EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF

Update your packages

sudo apt-get update

Install Docker, kubelet, kubeadm, and kubectl:

sudo apt-get install -y docker-ce=18.06.1~ce~3-0~ubuntu kubelet=1.13.5-00 kubeadm=1.13.5-00 kubectl=1.13.5-00

Hold the installed packages at their current versions:

sudo apt-mark hold docker-ce kubelet kubeadm kubectl
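
To confirm the hold took effect, you can list the held packages (a quick optional check, not part of the original steps):

sudo apt-mark showhold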

Add the iptables rule to sysctl.conf:

echo "net.bridge.bridge-nf-call-iptables=1" | sudo tee -a /etc/sysctl.conf

Apply the setting immediately, without a reboot:

sudo sysctl -p
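
If sysctl complains that net.bridge.bridge-nf-call-iptables is an unknown key, the br_netfilter module is probably not loaded yet; loading it first and re-running sysctl usually resolves this (an extra step, not in the original article):

sudo modprobe br_netfilter
sudo sysctl -p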

Disable swap:

sudo swapoff -a
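
Note that swapoff -a only disables swap until the next reboot. To keep it disabled, you would also comment out the swap entry in /etc/fstab, for example with a one-liner like this (a sketch; check your fstab before running it):

sudo sed -i '/ swap / s/^/#/' /etc/fstab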

Run the following commands only on the master node

Initializing the Kubernetes Cluster

sudo kubeadm init --pod-network-cidr=10.244.0.0/16

Once you run the above command, you will see output similar to the image below. Copy the kubeadm join command that it prints and save it; you will need it to join the worker nodes.

Kubernetes init completed

Set up the local kubeconfig

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config
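
At this point kubectl should be able to reach the cluster. The master will show as NotReady until a network overlay is applied in the next step (an optional check):

kubectl get nodes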

You can use either one of the following options. If you are planning to install Istio after the Kubernetes installation, please use Option 2. Note that the Flannel manifest below assumes the 10.244.0.0/16 pod network CIDR passed to kubeadm init above.

Apply Flannel CNI Network overlay (Option 1)

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Apply Calico Network Overlay (Option 2)

Download the manifest, replace Calico's default pod CIDR (192.168.0.0/16) with the CIDR you passed to kubeadm init, and apply it:

curl https://docs.projectcalico.org/v3.9/manifests/calico.yaml -O
POD_CIDR="<your-pod-cidr>" \
sed -i -e "s?192.168.0.0/16?$POD_CIDR?g" calico.yaml
kubectl apply -f calico.yaml
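
Whichever overlay you choose, you can watch the CNI and DNS pods come up before joining the workers (an optional check):

kubectl get pods -n kube-system -w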

Run the following commands only on the worker nodes

sudo kubeadm join [your unique string from the kubeadm init command]

This is the command that kubeadm init printed earlier and that you saved. Yes, that one!
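
For reference, the saved command generally has the following shape; the IP, token, and hash below are placeholders, so use the exact string printed by your own kubeadm init:

sudo kubeadm join <master_ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>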

Verify from the master node that the workers have joined the cluster successfully:

kubectl get nodes

Upgrading the Kubernetes Cluster

kubeadm allows us to upgrade our cluster components in the proper order, making sure to include important feature upgrades we might want to take advantage of in the latest stable version of Kubernetes.

In this article, I will go through upgrading our cluster from version 1.13.5 to 1.14.1.

Get the version of the API server:

kubectl version --short

Release the hold on versions of kubeadm and kubelet:

sudo apt-mark unhold kubeadm kubelet

Install version 1.14.1 of kubeadm:

sudo apt install -y kubeadm=1.14.1-00

Hold the version of kubeadm at 1.14.1:

sudo apt-mark hold kubeadm

Verify the kubeadm version

kubeadm version

Plan the upgrade of all the controller components:

sudo kubeadm upgrade plan

When you run the above command, you should get output similar to the image below.

upgrade plan

Upgrade the controller components:

sudo kubeadm upgrade apply v1.14.1
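
Once the apply finishes, you can confirm that the API server reports the new version (an optional check, using the same command as before):

kubectl version --short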

Release the hold on the version of kubectl:

sudo apt-mark unhold kubectl

Upgrade kubectl:

sudo apt install -y kubectl=1.14.1-00

Hold the version of kubectl at 1.14.1:

sudo apt-mark hold kubectl

Upgrade the version of kubelet:

sudo apt install -y kubelet=1.14.1-00

Hold the version of kubelet at 1.14.1:

sudo apt-mark hold kubelet
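
After the kubelet package upgrade, it is usually worth restarting the service so the node actually runs the new version (a step not listed in the original article):

sudo systemctl daemon-reload
sudo systemctl restart kubelet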

Now repeat the above steps on all the other nodes so that the whole cluster is upgraded.

Operating System Upgrades within a Kubernetes Cluster

When we need to take a node down for maintenance, Kubernetes makes it easy to evict the pods on that node, take it down, and then continue scheduling pods after the maintenance is complete.

Furthermore, if the node needs to be decommissioned, you can just as easily remove the node and replace it with a new one, joining it to the cluster.

See which pods are running on which nodes:

kubectl get pods -o wide

Evict the pods on a node:

kubectl drain [node_name] --ignore-daemonsets

Watch as the node changes status:

kubectl get nodes -w

Schedule pods to the node after maintenance is complete:

kubectl uncordon [node_name]

Remove a node from the cluster:

kubectl delete node [node_name]
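
If the removed node will be rejoined or reused later, it is worth wiping the old kubeadm state on that node first; run this on the node itself (an extra step, not in the original article):

sudo kubeadm reset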

Adding a New Node to an Existing Kubernetes Cluster

Generate a new token:

sudo kubeadm token generate

List the tokens:

sudo kubeadm token list

Print the kubeadm join command to join a node to the cluster:

sudo kubeadm token create [token_name] --ttl 2h --print-join-command
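
The [token_name] here is the token you generated in the previous step. As a shorthand (my own combination, not from the original article), the two commands can be chained:

sudo kubeadm token create $(sudo kubeadm token generate) --ttl 2h --print-join-command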

Backing up the Kubernetes Cluster

Backing up your cluster can be useful, especially if you have a single etcd cluster, as all the cluster state is stored there. The etcdctl utility allows us to easily create a snapshot of the cluster state (etcd) and save it to an external location. I'll go through creating the snapshot and talk about restoring it in the event of a failure.

Get the etcd binaries:

wget https://github.com/etcd-io/etcd/releases/download/v3.3.12/etcd-v3.3.12-linux-amd64.tar.gz

Extract the compressed binaries:

tar xvf etcd-v3.3.12-linux-amd64.tar.gz

Move the files into /usr/local/bin:

sudo mv etcd-v3.3.12-linux-amd64/etcd* /usr/local/bin

Take a snapshot of the etcd datastore using etcdctl:

sudo ETCDCTL_API=3 etcdctl snapshot save snapshot.db --endpoints https://127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key

View the help page for etcdctl:

ETCDCTL_API=3 etcdctl --help

Browse to the folder that contains the certificate files:

cd /etc/kubernetes/pki/etcd/

Verify that the snapshot was successful:

ETCDCTL_API=3 etcdctl --write-out=table snapshot status snapshot.db

Zip up the contents of the etcd directory:

sudo tar -zcvf etcd.tar.gz /etc/kubernetes/pki/etcd

Copy the etcd directory to another server:

scp etcd.tar.gz ubuntu@<backup_server_ip>:~/

Restoring Backups

Just follow the steps described in the following link:

https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/recovery.md
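
As a rough sketch of what the restore involves (assuming the snapshot file is snapshot.db and an arbitrary target directory of /var/lib/etcd-restore), etcdctl rebuilds a data directory from the snapshot, which you then point etcd at; the linked guide covers the full single- and multi-node procedure:

sudo ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --data-dir /var/lib/etcd-restore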
