跳到主要内容

部署Etcd集群

部署Etcd集群 Etcd 是一个分布式键值存储系统,Kubernetes使用Etcd进行数据存储,所以先准备一个Etcd数据库,为解决Etcd单点故障,应采用集群方式部署,这里使用3台组建集群,可容忍1台机器故障,当然,你也可以使用5台组建集群,可容忍2台机器故障。 节点名称 IP

etcd-1	192.168.31.71
etcd-2 192.168.31.72
etcd-3 192.168.31.73
注:为了节省机器,这里与K8s节点机器复用。也可以独立于k8s集群之外部署,只要apiserver能连接到就行。
2.1 准备cfssl证书生成工具
cfssl是一个开源的证书管理工具,使用json文件生成证书,相比openssl更方便使用。
找任意一台服务器操作,这里用Master节点。
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
2.2 生成Etcd证书
1. 自签证书颁发机构(CA)
创建工作目录:
mkdir -p ~/TLS/{etcd,k8s}

cd ~/TLS/etcd
自签CA:

cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"www": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF

cat > ca-csr.json << EOF
{
"CN": "etcd CA",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing"
}
]
}
EOF

生成证书:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

会生成ca.pem和ca-key.pem文件。

  1. 使用自签CA签发Etcd HTTPS证书 创建证书申请文件:
cat > etcd-server-csr.json << EOF
{
"CN": "etcd",
"hosts": [
"192.168.1.203",
"192.168.1.204",
"192.168.1.205"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing"
}
]
}
EOF

注:上述文件hosts字段中IP为所有etcd节点的集群内部通信IP,一个都不能少!为了方便后期扩容可以多写几个预留的IP。 生成证书:

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www etcd-server-csr.json | cfssljson -bare etcd-server

会生成etcd-server.pem和etcd-server-key.pem文件。

2.3 从Github下载二进制文件 下载地址:https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz 2.4 部署Etcd集群 以下在节点1上操作,为简化操作,待会将节点1生成的所有文件拷贝到 节点2和节点3


1. 创建工作目录并解压二进制包
mkdir /opt/etcd/{bin,cfg,ssl} -p
tar zxvf etcd-v3.4.9-linux-amd64.tar.gz
mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
2. 创建etcd配置文件

cat > /opt/etcd/cfg/etcd.conf << EOF
# This is the configuration file for the etcd server.

# Human-readable name for this member.
name: 'etcd-1'

# Path to the data directory.
data-dir: '/var/lib/etcd/default.etcd'

# Path to the dedicated wal directory.
wal-dir: '/var/lib/etcd/wal.etcd'

# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 10000

# Time (in milliseconds) of a heartbeat interval.
heartbeat-interval: 100

# Time (in milliseconds) for an election to timeout.
election-timeout: 1000

# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 0

# List of comma separated URLs to listen on for peer traffic.
listen-peer-urls: https://192.168.70.101:2380

# List of comma separated URLs to listen on for client traffic.
listen-client-urls: https://192.168.70.101:2379

# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 5

# Maximum number of wal files to retain (0 is unlimited).
max-wals: 5

# Comma-separated white list of origins for CORS (cross-origin resource sharing).
cors: '*'

# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs needed to be a comma-separated list.
initial-advertise-peer-urls: https://192.168.70.101:2380

# List of this member's client URLs to advertise to the public.
# The URLs needed to be a comma-separated list.
advertise-client-urls: https://192.168.70.101:2379

# Discovery URL used to bootstrap the cluster.
#discovery:

# Valid values include 'exit', 'proxy'
discovery-fallback: 'proxy'

# HTTP proxy to use for traffic to discovery service.
#discovery-proxy: https://192.168.70.101:2379

# DNS domain used to bootstrap initial cluster.
#discovery-srv: https://192.168.70.101:2380

# Comma separated string of initial cluster configuration for bootstrapping.
# Example: initial-cluster: "infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380"
initial-cluster: "etcd-1=https://192.168.70.101:2380,etcd-2=https://192.168.70.102:2380,etcd-3=https://192.168.70.103:2380"

# Initial cluster token for the etcd cluster during bootstrap.
initial-cluster-token: 'etcd-cluster'

# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'new'

# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: false

# Enable runtime profiling data via HTTP server
enable-pprof: true
# Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
metrics: 'basic'
# Set level of detail for exported metrics, specify 'extensive' to include server side grpc histogram metrics.
#listen-metrics-urls: ''
#List of URLs to listen on for the metrics and health endpoints.


# Valid values include 'on', 'readonly', 'off'
proxy: 'off'

# Time (in milliseconds) an endpoint will be held in a failed state.
proxy-failure-wait: 5000

# Time (in milliseconds) of the endpoints refresh interval.
proxy-refresh-interval: 30000

# Time (in milliseconds) for a dial to timeout.
proxy-dial-timeout: 1000

# Time (in milliseconds) for a write to timeout.
proxy-write-timeout: 5000

# Time (in milliseconds) for a read to timeout.
proxy-read-timeout: 0

client-transport-security:
# Path to the client server TLS cert file.
cert-file: /opt/etcd/ssl/etcd-server.pem

# Path to the client server TLS key file.
key-file: /opt/etcd/ssl/etcd-server-key.pem

# Enable client cert authentication.
client-cert-auth: true

# Path to the client server TLS trusted CA cert file.
trusted-ca-file: /opt/etcd/ssl/ca.pem

# Client TLS using generated certificates
auto-tls: true

peer-transport-security:
# Path to the peer server TLS cert file.
cert-file: /opt/etcd/ssl/etcd-server.pem

# Path to the peer server TLS key file.
key-file: /opt/etcd/ssl/etcd-server-key.pem

# Enable peer client cert authentication.
client-cert-auth: true

# Path to the peer server TLS trusted CA cert file.
trusted-ca-file: /opt/etcd/ssl/ca.pem

# Peer TLS using generated certificates.
auto-tls: true

# The validity period of the self-signed certificate, the unit is year.
self-signed-cert-validity: 1

# Enable debug-level logging for etcd.
log-level: info

logger: zap

# Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd.
log-outputs: [stderr]

# Force to create a new one member cluster.
force-new-cluster: false

auto-compaction-mode: periodic
auto-compaction-retention: "1"

# Limit etcd to a specific set of tls cipher suites
cipher-suites: [
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
]
EOF


•ETCD_NAME:节点名称,集群中唯一
•ETCD_DATA_DIR:数据目录
•ETCD_LISTEN_PEER_URLS:集群通信监听地址
•ETCD_LISTEN_CLIENT_URLS:客户端访问监听地址
•ETCD_INITIAL_ADVERTISE_PEERURLS:集群通告地址
•ETCD_ADVERTISE_CLIENT_URLS:客户端通告地址
•ETCD_INITIAL_CLUSTER:集群节点地址
•ETCD_INITIALCLUSTER_TOKEN:集群Token
•ETCD_INITIALCLUSTER_STATE:加入集群的当前状态,new是新集群,existing表示加入已有集群


3. systemd管理etcd
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
#EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd --config-file=/opt/etcd/cfg/etcd.conf
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF


4. 拷贝刚才生成的证书
把刚才生成的证书拷贝到配置文件中的路径:
cp ~/TLS/etcd/ca*pem ~/TLS/etcd/server*pem /opt/etcd/ssl/
5. 启动并设置开机启动
systemctl daemon-reload
systemctl start etcd
systemctl enable etcd


6. 将上面节点1所有生成的文件拷贝到节点2和节点3
scp -r /opt/etcd/ root@192.168.31.72:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.31.72:/usr/lib/systemd/system/
scp -r /opt/etcd/ root@192.168.31.73:/opt/
scp /usr/lib/systemd/system/etcd.service root@192.168.31.73:/usr/lib/systemd/system/
然后在节点2和节点3分别修改etcd.conf配置文件中的节点名称和当前服务器IP:
vi /opt/etcd/cfg/etcd.conf
#[Member]
ETCD_NAME="etcd-1" # 修改此处,节点2改为etcd-2,节点3改为etcd-3
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.31.71:2380" # 修改此处为当前服务器IP
ETCD_LISTEN_CLIENT_URLS="https://192.168.31.71:2379" # 修改此处为当前服务器IP

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.31.71:2380" # 修改此处为当前服务器IP
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.31.71:2379" # 修改此处为当前服务器IP
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.31.71:2380,etcd-2=https://192.168.31.72:2380,etcd-3=https://192.168.31.73:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
最后启动etcd并设置开机启动,同上。

7. 查看集群状态
注意 修改ip 和 配置文件
ETCDCTL_API=3 /opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd-server.pem --key=/opt/etcd/ssl/etcd-server-key.pem --endpoints="https://192.168.1.203:2379,https://192.168.1.204:2379,https://192.168.1.205:2379" endpoint health --write-out=table

+----------------------------+--------+-------------+-------+

ENDPOINTHEALTHTOOKERROR
+----------------------------+--------+-------------+-------+
https://192.168.31.71:2379true10.301506ms
---------------------------------------------
https://192.168.31.73:2379true12.87467ms
--------------------------------------------
https://192.168.31.72:2379true13.225954ms
---------------------------------------------
+----------------------------+--------+-------------+-------+
如果输出上面信息,就说明集群部署成功。
如果有问题第一步先看日志:/var/log/message 或 journalctl -u etcd

方法二

证书生成略

master01 etcd 配置文件

cat > /etc/etcd/etcd.config.yml << EOF 
name: 'master'
data-dir: /var/lib/etcd
wal-dir: /var/lib/etcd/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://192.168.10.60:2380'
listen-client-urls: 'https://192.168.10.60:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://192.168.10.60:2380'
advertise-client-urls: 'https://192.168.10.60:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'master=https://192.168.10.60:2380,master02=https://192.168.10.63:2380,master03=https://192.168.10.64:2380'
initial-cluster-token: 'etcd-k8s-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
cert-file: '/etc/kubernetes/pki/etcd/etcd.pem'
key-file: '/etc/kubernetes/pki/etcd/etcd-key.pem'
client-cert-auth: true
trusted-ca-file: '/etc/kubernetes/pki/etcd/etcd-ca.pem'
auto-tls: true
peer-transport-security:
cert-file: '/etc/kubernetes/pki/etcd/etcd.pem'
key-file: '/etc/kubernetes/pki/etcd/etcd-key.pem'
peer-client-cert-auth: true
trusted-ca-file: '/etc/kubernetes/pki/etcd/etcd-ca.pem'
auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF

systemd 文件

cat > /usr/lib/systemd/system/etcd.service << EOF

[Unit]
Description=Etcd Service
Documentation=https://coreos.com/etcd/docs/latest/
After=network.target

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd --config-file=/etc/etcd/etcd.config.yml
Restart=on-failure
RestartSec=10
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
Alias=etcd3.service

EOF


mkdir -p /etc/kubernetes/pki/etcd
ln -s /etc/etcd/ssl/* /etc/kubernetes/pki/etcd/

systemctl daemon-reload
systemctl enable --now etcd.service
systemctl restart etcd.service


systemctl status etcd.service
export ETCDCTL_API=3
etcdctl --endpoints="192.168.10.60:2379,192.168.10.63:2379,192.168.10.64:2379" --cacert=/etc/kubernetes/pki/etcd/etcd-ca.pem --cert=/etc/kubernetes/pki/etcd/etcd.pem --key=/etc/kubernetes/pki/etcd/etcd-key.pem endpoint status --write-out=table