ETCD Part 1: backup & restore(En)
ETCD is a crucial component in a Kubernetes cluster. It’s essential to understand how to back up and restore ETCD to ensure there’s a chance to recover in case of a cluster crash (so you don’t get in trouble with your boss).
This article is divided into three major sections:
- What is ETCD?
- ETCD backup
- ETCD restore
Let’s get started!
1. What is ETCD?
Kubernetes uses ETCD, a distributed key-value store, to hold all cluster data, including configuration data, state, and metadata. Only the kube-apiserver talks to ETCD directly; every other component and node reads and writes cluster state through the API server.
In simple terms, ETCD is responsible for storing both the “current” state and the “desired” state of the system. When you run a command like “kubectl get XXX”, the API server reads the data from ETCD; when you create objects with “kubectl create XXX”, the change is written to ETCD.
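ETCD members reach consensus using the Raft algorithm. A single member is enough to run, but a highly available cluster needs at least 3 members, and an odd number is recommended so that a majority (quorum) still exists after a member is lost.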
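To see why an odd number of members is preferred, the quorum arithmetic can be sketched in a few lines of shell. This is only an illustration: the function names `quorum` and `fault_tolerance` are made up for this example, and the only rule assumed is the usual Raft majority, floor(n/2) + 1.

```shell
#!/bin/sh
# Quorum is a majority of the members: floor(n/2) + 1
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a quorum still exists
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Print the numbers for a few cluster sizes
for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

With 3 members the cluster tolerates one failure, and with 4 members it still tolerates only one, so an even member count adds a machine without adding resilience.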
You can visit the following website to get a clearer explanation of how the Leader is elected (Leader Election), how data is replicated to other nodes while maintaining consistency (Log Replication), and what problems the RAFT algorithm primarily aims to solve:
2. ETCD backup
S2–1. Obtain etcdctl utility
[master]# ETCD_RELEASE=$(curl -s https://api.github.com/repos/etcd-io/etcd/releases/latest|grep tag_name | cut -d '"' -f 4)
[master]# echo $ETCD_RELEASE
v3.5.9
[master]# wget https://github.com/etcd-io/etcd/releases/download/${ETCD_RELEASE}/etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
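[master]# tar zxvf etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
[master]# cd etcd-${ETCD_RELEASE}-linux-amd64
[master]# ls -al
# Copy the binary into the PATH so that "etcdctl" can be called directly
[master]# cp etcdctl /usr/local/bin/
[master]# etcdctl version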
S2–2. Acquiring Essential Information
In this step, you will collect the following values for etcdctl, using any one of the three methods below:
- etcd endpoint : --endpoints
- CA certificate : --cacert
- server certificate : --cert
- server key : --key
[Method 1]
[master]# vim /etc/kubernetes/manifests/etcd.yaml
[Method 2]
[master]# kubectl get po -n kube-system
[master]# kubectl describe pod etcd-master-node -n kube-system
[Method 3]
[master]# cat /etc/kubernetes/manifests/etcd.yaml |grep listen
[master]# cat /etc/kubernetes/manifests/etcd.yaml |grep file
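If you prefer to pull out exactly the four values instead of reading the grep output by eye, something like the following could work. It is a minimal sketch: `etcd_flag` is a hypothetical helper name, and the sample fragment only imitates the layout of a kubeadm-generated manifest, so check it against your own file. The mapping assumed here is `listen-client-urls` for `--endpoints`, `trusted-ca-file` for `--cacert`, `cert-file` for `--cert`, and `key-file` for `--key`.

```shell
#!/bin/sh
# Print the value of one etcd command-line flag found in a manifest file.
# $1 = manifest path, $2 = flag name without the leading dashes
# The pattern is anchored on "- --<flag>=" so that "cert-file" does not
# also match "peer-cert-file".
etcd_flag() {
  sed -n "s/^[[:space:]]*-[[:space:]]*--${2}=//p" "$1" | head -n 1
}

# Demo on a small sample fragment so the sketch runs on any machine
sample=$(mktemp)
cat > "$sample" <<'EOF'
spec:
  containers:
  - command:
    - etcd
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://10.107.88.12:2379
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
EOF

for flag in listen-client-urls trusted-ca-file cert-file key-file; do
  echo "$flag = $(etcd_flag "$sample" "$flag")"
done
rm -f "$sample"
```

On a real control-plane node you would pass /etc/kubernetes/manifests/etcd.yaml as the first argument instead of the sample file.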
S2–3. Performing Backup
# Make sure the target directory exists before saving the snapshot
[master]# mkdir -p /root/etcd
[master]# ETCDCTL_API=3 etcdctl \
--endpoints=https://10.107.88.12:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /root/etcd/etcd.db
# Verify
[master]# ETCDCTL_API=3 etcdctl --write-out=table snapshot status /root/etcd/etcd.db
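Typing the same four flags every time gets tedious. etcdctl also reads its flags from environment variables with the `ETCDCTL_` prefix, so a backup can be wrapped as sketched below. Treat it as a rough example: `backup_etcd` is a hypothetical helper name, the timestamped file name is just one possible convention, and the endpoint and certificate paths are copied from the commands above.

```shell
#!/bin/sh
# Connection settings as environment variables instead of flags
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://10.107.88.12:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# Save a snapshot to <dir>/etcd-<timestamp>.db and print its path on success.
# $1 = backup directory (created if it does not exist)
backup_etcd() {
  dir=$1
  file="$dir/etcd-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$dir" || return 1
  # etcdctl output goes to stderr so that stdout carries only the file path
  etcdctl snapshot save "$file" >&2 || return 1
  echo "$file"
}
```

Calling `backup_etcd /root/etcd` on the master node would then write a new snapshot file on each run instead of overwriting the previous one.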
3. ETCD restore
To verify that the restore works, we will run the following test:
- Before : The default namespace should be empty.
- Perform a backup
- Create an nginx pod in the default namespace
- Create a new directory and restore the data to the new location
- Modify the manifest to make ETCD use the new location
- After : The default namespace has returned to a state with no objects.
# The default namespace should be empty
[master]# kubectl get pod -n default
No resources found in default namespace.
# Perform a backup
[master]# ETCDCTL_API=3 etcdctl \
--endpoints=https://10.107.88.12:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /root/etcd/etcd-01.db
# Create an nginx pod in the default namespace
[master]# kubectl run testpod --image=nginx -n default
# Create a new directory and restore the data to the new location
[master]# mkdir /root/etcd-backup
# snapshot restore only reads the local snapshot file, so the endpoint and certificate flags are not needed
[master]# ETCDCTL_API=3 etcdctl snapshot restore /root/etcd/etcd-01.db \
--data-dir="/root/etcd-backup"
# At this point, the state has not been restored yet
[master]# kubectl get pod -n default
NAME READY STATUS RESTARTS AGE
testpod 1/1 Running 0 6m2s
# Edit /etc/kubernetes/manifests/etcd.yaml to point to the new directory (in a kubeadm-style manifest, this is typically the hostPath of the etcd-data volume)
[master]# tree /root/etcd-backup
[master]# vim /etc/kubernetes/manifests/etcd.yaml
# After saving the file, wait for a few minutes to allow ETCD to update its state (during this time, the API may not respond)
[master]# kubectl get pod -n default
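The manifest edit above is done by hand in vim. For reference, here is a rough sketch of the same change with sed. It assumes a kubeadm-style manifest in which the etcd-data hostPath volume points at /var/lib/etcd (the kubeadm default, so check your own file first), and `repoint_etcd_data` is a hypothetical helper name. The demo runs against a sample fragment, not the real manifest.

```shell
#!/bin/sh
# Rewrite the hostPath of the etcd data volume in a manifest file.
# $1 = manifest path, $2 = old data directory, $3 = new data directory
# Only a "path:" line whose value is exactly the old directory is changed;
# the container's --data-dir and mountPath are left as they are.
repoint_etcd_data() {
  tmp=$(mktemp) || return 1
  sed "s#^\([[:space:]]*path:[[:space:]]*\)${2}[[:space:]]*\$#\1${3}#" "$1" > "$tmp" &&
    cat "$tmp" > "$1"
  rc=$?
  rm -f "$tmp"
  return $rc
}

# Demo on a sample volumes section
sample=$(mktemp)
cat > "$sample" <<'EOF'
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
EOF

repoint_etcd_data "$sample" /var/lib/etcd /root/etcd-backup
grep 'path:' "$sample"
rm -f "$sample"
```

The temporary file is created outside the manifest directory on purpose, so that the kubelet does not see a half-written extra file next to etcd.yaml.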
The basic ETCD restoration process is now complete. In the upcoming articles, we will conduct tests for various other scenarios.
In addition to backing up ETCD, it’s advisable to use third-party software like Velero or similar tools to give your applications additional protection and make your Kubernetes cluster environment more robust.