
Running etcd on a Single Node

1. Overview

This is a summary document in a series of notes.

1.1 VirtualBox Virtual Machine Information

The following virtual machines are used while learning etcd:

| No. | VM             | Hostname   | IP             | CPU     | Memory | Description           |
|-----|----------------|------------|----------------|---------|--------|-----------------------|
| 1   | ansible-master | ansible    | 192.168.56.120 | 2 cores | 4 GB   | Ansible control node  |
| 2   | ansible-node1  | etcd-node1 | 192.168.56.121 | 2 cores | 2 GB   | Ansible worker node 1 |
| 3   | ansible-node2  | etcd-node2 | 192.168.56.122 | 2 cores | 2 GB   | Ansible worker node 2 |
| 4   | ansible-node3  | etcd-node3 | 192.168.56.123 | 2 cores | 2 GB   | Ansible worker node 3 |

Later, an Ansible playbook will be written to deploy the etcd cluster.

Operating system details:

sh
[root@etcd-node1 ~]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
[root@etcd-node1 ~]# hostname -I
192.168.56.121 10.0.3.15
[root@etcd-node1 ~]#

2. Deploying an etcd Cluster on a Single Server

See the official documentation How to Set Up a Demo etcd Cluster.

  • Of etcd's ports, 2379 is used for client connections and serves the HTTP API that clients interact with, while 2380 is used for peer communication between members.
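The rest of this section assumes the etcd and etcdctl binaries are already installed on the node (version 3.5.18, matching the etcdctl output shown later). A minimal installation sketch, assuming the official linux-amd64 release tarball:

sh
# Download the official release tarball and put etcd/etcdctl on the PATH (v3.5.18 assumed)
ETCD_VER=v3.5.18
curl -L -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz \
    https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar -xzf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp
cp /tmp/etcd-${ETCD_VER}-linux-amd64/etcd /tmp/etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin/
etcd --version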

To deploy an etcd cluster on a single server without the ports and run paths conflicting with each other, plan the layout first:

| No. | Node name | Run directory   | IP             | Peer port | Client port |
|-----|-----------|-----------------|----------------|-----------|-------------|
| 1   | node1     | /srv/etcd/node1 | 192.168.56.121 | 23801     | 23791       |
| 2   | node2     | /srv/etcd/node2 | 192.168.56.121 | 23802     | 23792       |
| 3   | node3     | /srv/etcd/node3 | 192.168.56.121 | 23803     | 23793       |

Then create the run directories and write three start scripts:

sh
# Create the three run directories
[root@etcd-node1 ~]# cd /srv/etcd
[root@etcd-node1 etcd]# mkdir -p node1 node2 node3

View the first start script:

sh
[root@etcd-node1 etcd]# cat node1/start.sh
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=192.168.56.121
HOST_2=192.168.56.121
HOST_3=192.168.56.121
PEER_PORT_1=23801
PEER_PORT_2=23802
PEER_PORT_3=23803
API_PORT_1=23791
API_PORT_2=23792
API_PORT_3=23793
START_PATH_1=/srv/etcd/node1
START_PATH_2=/srv/etcd/node2
START_PATH_3=/srv/etcd/node3
CLUSTER=${NAME_1}=http://${HOST_1}:${PEER_PORT_1},${NAME_2}=http://${HOST_2}:${PEER_PORT_2},${NAME_3}=http://${HOST_3}:${PEER_PORT_3}
echo "CLUSTER:${CLUSTER}"

# node 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
cd ${START_PATH_1}
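# Start this member in the background from its own run directory.
# --listen-peer-urls / --listen-client-urls are the addresses the process binds to;
# --initial-advertise-peer-urls / --advertise-client-urls are the addresses it
# announces to the other members and to clients.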
nohup etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:${PEER_PORT_1} --listen-peer-urls http://${THIS_IP}:${PEER_PORT_1} \
    --advertise-client-urls http://${THIS_IP}:${API_PORT_1} --listen-client-urls http://${THIS_IP}:${API_PORT_1} \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &

[root@etcd-node1 etcd]#

View the second start script:

sh
[root@etcd-node1 etcd]# cat node2/start.sh
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=192.168.56.121
HOST_2=192.168.56.121
HOST_3=192.168.56.121
PEER_PORT_1=23801
PEER_PORT_2=23802
PEER_PORT_3=23803
API_PORT_1=23791
API_PORT_2=23792
API_PORT_3=23793
START_PATH_1=/srv/etcd/node1
START_PATH_2=/srv/etcd/node2
START_PATH_3=/srv/etcd/node3
CLUSTER=${NAME_1}=http://${HOST_1}:${PEER_PORT_1},${NAME_2}=http://${HOST_2}:${PEER_PORT_2},${NAME_3}=http://${HOST_3}:${PEER_PORT_3}
echo "CLUSTER:${CLUSTER}"

# node 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
cd ${START_PATH_2}
nohup etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:${PEER_PORT_2} --listen-peer-urls http://${THIS_IP}:${PEER_PORT_2} \
    --advertise-client-urls http://${THIS_IP}:${API_PORT_2} --listen-client-urls http://${THIS_IP}:${API_PORT_2} \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &

[root@etcd-node1 etcd]#

View the third start script:

sh
[root@etcd-node1 etcd]# cat node3/start.sh
TOKEN=token-01
CLUSTER_STATE=new
NAME_1=node1
NAME_2=node2
NAME_3=node3
HOST_1=192.168.56.121
HOST_2=192.168.56.121
HOST_3=192.168.56.121
PEER_PORT_1=23801
PEER_PORT_2=23802
PEER_PORT_3=23803
API_PORT_1=23791
API_PORT_2=23792
API_PORT_3=23793
START_PATH_1=/srv/etcd/node1
START_PATH_2=/srv/etcd/node2
START_PATH_3=/srv/etcd/node3
CLUSTER=${NAME_1}=http://${HOST_1}:${PEER_PORT_1},${NAME_2}=http://${HOST_2}:${PEER_PORT_2},${NAME_3}=http://${HOST_3}:${PEER_PORT_3}
echo "CLUSTER:${CLUSTER}"

# node 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
cd ${START_PATH_3}
nohup etcd --data-dir=data.etcd --name ${THIS_NAME} \
    --initial-advertise-peer-urls http://${THIS_IP}:${PEER_PORT_3} --listen-peer-urls http://${THIS_IP}:${PEER_PORT_3} \
    --advertise-client-urls http://${THIS_IP}:${API_PORT_3} --listen-client-urls http://${THIS_IP}:${API_PORT_3} \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &

[root@etcd-node1 etcd]#
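The three scripts are identical except for which node's variables (THIS_NAME, THIS_IP, ports and run path) they select, and each one needs execute permission (chmod +x node*/start.sh) before it can be invoked as ./start.sh. As an alternative, a single parameterized script could start any of the three members. The following is only a sketch; the name start-node.sh and its argument handling are not part of the original setup:

sh
#!/bin/bash
# Hypothetical helper /srv/etcd/start-node.sh <1|2|3>; derives ports and paths
# from the node index using the same layout as the table in section 2.
set -eu
IDX=$1
TOKEN=token-01
CLUSTER_STATE=new
HOST=192.168.56.121
PEER_PORT=$((23800 + IDX))
API_PORT=$((23790 + IDX))
CLUSTER=node1=http://${HOST}:23801,node2=http://${HOST}:23802,node3=http://${HOST}:23803
cd /srv/etcd/node${IDX}
nohup etcd --data-dir=data.etcd --name node${IDX} \
    --initial-advertise-peer-urls http://${HOST}:${PEER_PORT} --listen-peer-urls http://${HOST}:${PEER_PORT} \
    --advertise-client-urls http://${HOST}:${API_PORT} --listen-client-urls http://${HOST}:${API_PORT} \
    --initial-cluster ${CLUSTER} \
    --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &

With that helper, /srv/etcd/start-node.sh 1, 2 and 3 would start node1, node2 and node3 respectively.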

Run the three scripts one after another:

sh
[root@etcd-node1 ~]# cd /srv/etcd/node1 && ./start.sh && cd -
CLUSTER:node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803
/root
[root@etcd-node1 ~]# nohup: appending output to ‘nohup.out’

[root@etcd-node1 ~]# cd /srv/etcd/node2  && ./start.sh && cd -
CLUSTER:node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803
/root
[root@etcd-node1 ~]# nohup: appending output to ‘nohup.out’

[root@etcd-node1 ~]# cd /srv/etcd/node3  && ./start.sh && cd -
CLUSTER:node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803
/root
[root@etcd-node1 ~]# nohup: appending output to ‘nohup.out’
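If a member fails to start, its log output lands in nohup.out inside that node's run directory; a quick way to inspect it (a sketch, output not shown):

sh
tail -n 20 /srv/etcd/node1/nohup.out /srv/etcd/node2/nohup.out /srv/etcd/node3/nohup.out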

Check the etcd processes and listening ports:

sh
[root@etcd-node1 ~]# ps -ef|grep etcd
root      1584     1  0 23:24 pts/0    00:00:00 etcd --data-dir=data.etcd --name node1 --initial-advertise-peer-urls http://192.168.56.121:23801 --listen-peer-urls http://192.168.56.121:23801 --advertise-client-urls http://192.168.56.121:23791 --listen-client-urls http://192.168.56.121:23791 --initial-cluster node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803 --initial-cluster-state new --initial-cluster-token token-01
root      1590     1  0 23:25 pts/0    00:00:00 etcd --data-dir=data.etcd --name node2 --initial-advertise-peer-urls http://192.168.56.121:23802 --listen-peer-urls http://192.168.56.121:23802 --advertise-client-urls http://192.168.56.121:23792 --listen-client-urls http://192.168.56.121:23792 --initial-cluster node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803 --initial-cluster-state new --initial-cluster-token token-01
root      1602     1  0 23:25 pts/0    00:00:00 etcd --data-dir=data.etcd --name node3 --initial-advertise-peer-urls http://192.168.56.121:23803 --listen-peer-urls http://192.168.56.121:23803 --advertise-client-urls http://192.168.56.121:23793 --listen-client-urls http://192.168.56.121:23793 --initial-cluster node1=http://192.168.56.121:23801,node2=http://192.168.56.121:23802,node3=http://192.168.56.121:23803 --initial-cluster-state new --initial-cluster-token token-01
root      1610  1395  0 23:25 pts/0    00:00:00 grep --color=always etcd
[root@etcd-node1 ~]# netstat -tunlp|grep etcd
tcp        0      0 192.168.56.121:23791    0.0.0.0:*               LISTEN      1584/etcd
tcp        0      0 192.168.56.121:23792    0.0.0.0:*               LISTEN      1590/etcd
tcp        0      0 192.168.56.121:23793    0.0.0.0:*               LISTEN      1602/etcd
tcp        0      0 192.168.56.121:23801    0.0.0.0:*               LISTEN      1584/etcd
tcp        0      0 192.168.56.121:23802    0.0.0.0:*               LISTEN      1590/etcd
tcp        0      0 192.168.56.121:23803    0.0.0.0:*               LISTEN      1602/etcd
[root@etcd-node1 ~]#
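As noted at the start of section 2, the 2379x client ports serve the HTTP API while the 2380x peer ports carry traffic between members. A quick sanity check against one of the client ports (a sketch; the exact JSON returned depends on the etcd build):

sh
# The client port answers plain HTTP; /version and /health are served there.
curl -s http://192.168.56.121:23791/version
curl -s http://192.168.56.121:23791/health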

You can also see that data files have been generated in each node's run directory:

sh
[root@etcd-node1 ~]# find /srv/etcd/node1
/srv/etcd/node1
/srv/etcd/node1/start.sh
/srv/etcd/node1/nohup.out
/srv/etcd/node1/data.etcd
/srv/etcd/node1/data.etcd/member
/srv/etcd/node1/data.etcd/member/snap
/srv/etcd/node1/data.etcd/member/snap/db
/srv/etcd/node1/data.etcd/member/wal
/srv/etcd/node1/data.etcd/member/wal/0000000000000000-0000000000000000.wal
/srv/etcd/node1/data.etcd/member/wal/0.tmp
[root@etcd-node1 ~]# find /srv/etcd/node2
/srv/etcd/node2
/srv/etcd/node2/start.sh
/srv/etcd/node2/nohup.out
/srv/etcd/node2/data.etcd
/srv/etcd/node2/data.etcd/member
/srv/etcd/node2/data.etcd/member/snap
/srv/etcd/node2/data.etcd/member/snap/db
/srv/etcd/node2/data.etcd/member/wal
/srv/etcd/node2/data.etcd/member/wal/0000000000000000-0000000000000000.wal
/srv/etcd/node2/data.etcd/member/wal/0.tmp
[root@etcd-node1 ~]# find /srv/etcd/node3
/srv/etcd/node3
/srv/etcd/node3/start.sh
/srv/etcd/node3/nohup.out
/srv/etcd/node3/data.etcd
/srv/etcd/node3/data.etcd/member
/srv/etcd/node3/data.etcd/member/snap
/srv/etcd/node3/data.etcd/member/snap/db
/srv/etcd/node3/data.etcd/member/wal
/srv/etcd/node3/data.etcd/member/wal/0000000000000000-0000000000000000.wal
/srv/etcd/node3/data.etcd/member/wal/0.tmp
[root@etcd-node1 ~]#

3. Viewing Cluster Information

Put the following exports into the ~/.bashrc configuration file so they are available for later use:

sh
export ETCDCTL_API=3
export HOST_1=192.168.56.121
export HOST_2=192.168.56.121
export HOST_3=192.168.56.121
export API_PORT_1=23791
export API_PORT_2=23792
export API_PORT_3=23793
export ENDPOINTS=${HOST_1}:${API_PORT_1},${HOST_2}:${API_PORT_2},${HOST_3}:${API_PORT_3}

Then run source ~/.bashrc to make the settings take effect.
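A quick check that the variable is set (its value follows directly from the exports above):

sh
[root@etcd-node1 ~]# echo $ENDPOINTS
192.168.56.121:23791,192.168.56.121:23792,192.168.56.121:23793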

Once they take effect, list the cluster members:

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS member list
77d0ca9a6be501b9, started, node3, http://192.168.56.121:23803, http://192.168.56.121:23793, false
cfb64cbf48fd223d, started, node1, http://192.168.56.121:23801, http://192.168.56.121:23791, false
ea8ade60580b015e, started, node2, http://192.168.56.121:23802, http://192.168.56.121:23792, false
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS --write-out=table member list
+------------------+---------+-------+-----------------------------+-----------------------------+------------+
|        ID        | STATUS  | NAME  |         PEER ADDRS          |        CLIENT ADDRS         | IS LEARNER |
+------------------+---------+-------+-----------------------------+-----------------------------+------------+
| 77d0ca9a6be501b9 | started | node3 | http://192.168.56.121:23803 | http://192.168.56.121:23793 |      false |
| cfb64cbf48fd223d | started | node1 | http://192.168.56.121:23801 | http://192.168.56.121:23791 |      false |
| ea8ade60580b015e | started | node2 | http://192.168.56.121:23802 | http://192.168.56.121:23792 |      false |
+------------------+---------+-------+-----------------------------+-----------------------------+------------+
[root@etcd-node1 ~]#


Here you can see the three etcd members we deployed on the single server.

Note

The last column is IS LEARNER, not IS LEADER; do not confuse the two.

Check the status of the cluster endpoints:

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS endpoint status
192.168.56.121:23791, cfb64cbf48fd223d, 3.5.18, 20 kB, false, false, 3, 15, 15, 
192.168.56.121:23792, ea8ade60580b015e, 3.5.18, 20 kB, true, false, 3, 15, 15, 
192.168.56.121:23793, 77d0ca9a6be501b9, 3.5.18, 20 kB, false, false, 3, 15, 15, 
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint status
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.56.121:23791 | cfb64cbf48fd223d |  3.5.18 |   20 kB |     false |      false |         3 |         15 |                 15 |        |
| 192.168.56.121:23792 | ea8ade60580b015e |  3.5.18 |   20 kB |      true |      false |         3 |         15 |                 15 |        |
| 192.168.56.121:23793 | 77d0ca9a6be501b9 |  3.5.18 |   20 kB |     false |      false |         3 |         15 |                 15 |        |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@etcd-node1 ~]#

Check the health of the cluster:

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint health
+----------------------+--------+------------+-------+
|       ENDPOINT       | HEALTH |    TOOK    | ERROR |
+----------------------+--------+------------+-------+
| 192.168.56.121:23792 |   true | 1.564431ms |       |
| 192.168.56.121:23793 |   true | 1.461979ms |       |
| 192.168.56.121:23791 |   true | 1.350891ms |       |
+----------------------+--------+------------+-------+
[root@etcd-node1 ~]#

Set and get some key-value pairs:

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS put greeting "Hello, etcd"
OK
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS get greeting
greeting
Hello, etcd
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS put name "etcd"
OK
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS get name
name
etcd

As shown, the values can be read back normally.
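The same $ENDPOINTS setting works for the other basic key-value operations; a short sketch with standard etcdctl v3 subcommands (output omitted):

sh
etcdctl --endpoints=$ENDPOINTS get --prefix ""     # list all keys under a prefix ("" = everything)
etcdctl --endpoints=$ENDPOINTS del name            # delete a key
etcdctl --endpoints=$ENDPOINTS watch greeting      # watch a key for changes (blocks until interrupted)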

Now kill the node3 process, then check the endpoint health and status again.
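One way to kill just that member is to match its --name flag on the command line (a sketch; the original does not show the exact command used):

sh
# Kill the etcd member named node3 by matching its command line
pkill -f 'etcd.*--name node3'

With node3 down, the health and status checks report the failure: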

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint health
{"level":"warn","ts":"2025-03-03T23:15:44.739068+0800","logger":"client","caller":"v3@v3.5.18/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000363c0/192.168.56.121:23793","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.168.56.121:23793: connect: connection refused\""}
+----------------------+--------+-------------+---------------------------+
|       ENDPOINT       | HEALTH |    TOOK     |           ERROR           |
+----------------------+--------+-------------+---------------------------+
| 192.168.56.121:23792 |   true |  5.354309ms |                           |
| 192.168.56.121:23791 |   true |  1.362836ms |                           |
| 192.168.56.121:23793 |  false | 5.00115408s | context deadline exceeded |
+----------------------+--------+-------------+---------------------------+
Error: unhealthy cluster
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint status
{"level":"warn","ts":"2025-03-03T23:15:57.936395+0800","logger":"etcd-client","caller":"v3@v3.5.18/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000363c0/192.168.56.121:23791","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing: dial tcp 192.168.56.121:23793: connect: connection refused\""}
Failed to get the status of endpoint 192.168.56.121:23793 (context deadline exceeded)
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.56.121:23791 | cfb64cbf48fd223d |  3.5.18 |   20 kB |     false |      false |         3 |         20 |                 20 |        |
| 192.168.56.121:23792 | ea8ade60580b015e |  3.5.18 |   20 kB |      true |      false |         3 |         20 |                 20 |        |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@etcd-node1 ~]#


This time the checks detect that the third node is unhealthy.

Start node3 again and the cluster returns to a healthy state.
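Restarting simply reuses the node's original start script; etcd ignores the --initial-cluster* flags on restart because the member's data directory (data.etcd) already exists:

sh
[root@etcd-node1 ~]# cd /srv/etcd/node3 && ./start.sh && cd -

Once node3 has rejoined, the checks look normal again: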

sh
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint health
+----------------------+--------+------------+-------+
|       ENDPOINT       | HEALTH |    TOOK    | ERROR |
+----------------------+--------+------------+-------+
| 192.168.56.121:23792 |   true | 1.413995ms |       |
| 192.168.56.121:23793 |   true | 1.560991ms |       |
| 192.168.56.121:23791 |   true | 3.256187ms |       |
+----------------------+--------+------------+-------+
[root@etcd-node1 ~]# etcdctl --endpoints=$ENDPOINTS -w=table endpoint status
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.56.121:23791 | cfb64cbf48fd223d |  3.5.18 |   20 kB |     false |      false |         3 |         27 |                 27 |        |
| 192.168.56.121:23792 | ea8ade60580b015e |  3.5.18 |   20 kB |      true |      false |         3 |         27 |                 27 |        |
| 192.168.56.121:23793 | 77d0ca9a6be501b9 |  3.5.18 |   20 kB |     false |      false |         3 |         27 |                 27 |        |
+----------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
[root@etcd-node1 ~]#
