K8s Cluster v1.30.x Setup

Environment Architecture

The default runtime for a Kubernetes v1.30.x cluster is containerd. The whole setup is done on Linux, using the kubeadm tool.

Server information

Hostname     OS                        IP address         CPU      Memory
k8s-master   Rocky Linux release 9.2   192.168.101.201    2 cores  4 GB
k8s-node-1   Rocky Linux release 9.2   192.168.101.202    2 cores  4 GB
k8s-node-2   Rocky Linux release 9.2   192.168.101.203    2 cores  4 GB

☆ Preparation ☆

Nodes to operate on: every step in this section runs on all nodes; the only per-node difference is the hostname each node sets.

Tip: the steps below use the dnf command to install packages and dependencies. It does the same job as yum but is faster; yum can be substituted throughout.

1. Configure server information

Set the hostname

# set each node's hostname (different on every node)
# run on the master node
hostnamectl set-hostname k8s-master && bash
# run on node-1
hostnamectl set-hostname k8s-node-1 && bash
# run on node-2
hostnamectl set-hostname k8s-node-2 && bash

Configure the hosts file

cat >> /etc/hosts << EOF
192.168.101.201 k8s-master
192.168.101.202 k8s-node-1
192.168.101.203 k8s-node-2
EOF
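
A quick sanity check (assuming all three machines are already online) confirms the names resolve:

# each host should answer one ping
for h in k8s-master k8s-node-1 k8s-node-2; do ping -c 1 $h; done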

2. Disable security mechanisms

Disable the firewall

# stop immediately (temporary)
systemctl stop firewalld
# disable across reboots (permanent)
systemctl disable firewalld

Disable SELinux

# switch to permissive immediately (temporary)
setenforce 0
# disable permanently (takes effect after reboot)
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
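
Verify both changes took effect:

systemctl is-active firewalld   # expect: inactive
getenforce                      # expect: Permissive (Disabled after a reboot)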

3. Disable swap

# turn off immediately (temporary)
swapoff -a
# comment out swap entries permanently
sed -ri 's/.*swap.*/#&/' /etc/fstab
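
free should now report zero swap:

free -h | grep -i swap   # Swap: 0B 0B 0B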

4. Tune the kernel

Step one: load the required kernel modules

modprobe overlay
modprobe br_netfilter
modprobe nf_conntrack   # ip_conntrack is a legacy name that no longer exists on el9 kernels
# verify the modules are loaded
lsmod |egrep 'conntrack|br_netfilter'

Step two: set sysctl parameters

cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.netfilter.nf_conntrack_max=2310720
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
EOF
# apply all sysctl configuration
sysctl --system
# or apply just this file
sysctl -p /etc/sysctl.d/k8s.conf

To avoid idle-timeout problems with long-lived connections in ipvs mode, the following kernel parameters can be added:

net.ipv4.tcp_keepalive_intvl = 30  
net.ipv4.tcp_keepalive_probes = 10  
net.ipv4.tcp_keepalive_time = 600

Note: to enable IPv6, also add the corresponding setting: net.bridge.bridge-nf-call-ip6tables = 1
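
Confirm the key parameters are active:

sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables   # both should print 1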

5. Enable ipvs forwarding (optional)

ipvs is one of the proxy modes Kubernetes supports and performs better on large clusters; for a small test cluster like this one, the default iptables mode is perfectly adequate.
Install ipvsadm, ipset, and the other dependency packages

dnf install -y ipvsadm ipset sysstat conntrack libseccomp

Load the ipvs modules; either of the following two methods works.
Method one:

# load the ipvs modules
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack   # nf_conntrack_ipv4 only exists on kernels older than 4.19

Method two:

# generate the ipvs module-load config file
cat > /etc/modules-load.d/ipvs.conf << EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF
# reload kernel modules via systemd
systemctl restart systemd-modules-load.service

Check the loaded ipvs modules

lsmod |egrep 'ip_vs|nf_conntrack'
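
Once the cluster is up and kube-proxy is running in ipvs mode, the virtual server table can be inspected with:

ipvsadm -Ln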

6. Install tooling

Install the EPEL repository

dnf install -y epel-release

Install commonly used utility packages

dnf install -y wget jq psmisc vim net-tools nfs-utils socat telnet device-mapper-persistent-data lvm2 git tar zip curl

Installing the Kubernetes Cluster v1.30.x

Configure the Docker and Kubernetes repositories

Nodes to operate on: all nodes
Configure the installation repository for docker / containerd

dnf -y install dnf-plugins-core
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Configure the Kubernetes repository as follows:

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.30/rpm/
enabled=1
gpgcheck=0
EOF

To use the Aliyun mirror instead:

# generate the Kubernetes yum repo file
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/rpm/
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.30/rpm/repodata/repomd.xml.key
EOF

Clear the dnf cache and rebuild it

dnf clean all && dnf makecache

Install a runtime

Nodes to operate on: all nodes

Using containerd (option one)

Install containerd

dnf -y install containerd.io

Generate the default configuration file

containerd config default > /etc/containerd/config.toml

For Linux distributions that use systemd as the init system, using systemd as the container cgroup driver keeps nodes more stable under resource pressure, so it is recommended to set containerd's cgroup driver to systemd. In /etc/containerd/config.toml:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
    ....
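
The same edit can be made non-interactively; a one-liner sketch against the default config generated above:

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
grep SystemdCgroup /etc/containerd/config.toml   # should now show true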

Then configure a registry mirror for image pulls: add registry.mirrors entries under the registry block of the cri plugin configuration. For example:

[plugins."io.containerd.grpc.v1.cri"]
  ...
  # sandbox_image = "k8s.gcr.io/pause:3.5"
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
  ...
  [plugins."io.containerd.grpc.v1.cri".registry]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
        endpoint = ["https://bqr1dr1n.mirror.aliyuncs.com"]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]
        endpoint = ["https://registry.aliyuncs.com/k8sxio"]

Start the containerd service

systemctl enable containerd --now
# check service status
systemctl status containerd
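
crictl (installed later alongside kubeadm, as part of cri-tools) can be pointed at containerd's socket for a quick sanity check; a sketch:

cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
EOF
crictl info | head   # should print runtime status JSON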

Using docker (option two)

Kubernetes removed dockershim, its built-in docker support, starting with v1.24. To use docker as the Kubernetes runtime, an intermediate layer, cri-dockerd, is added between kubelet and docker. cri-dockerd is a shim that implements the CRI standard: it speaks CRI to kubelet on one side and talks to the docker API on the other, letting Kubernetes use docker as its container runtime indirectly.
Install the docker service

dnf -y install docker-ce

Configure a registry mirror and set the cgroup driver

cat > /etc/docker/daemon.json << EOF
{
    "registry-mirrors": ["https://swr.cn-north-4.myhuaweicloud.com"],
    "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

Start the docker service

systemctl enable docker --now
# check service status
systemctl status docker
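
Confirm docker picked up the systemd cgroup driver:

docker info | grep -i 'cgroup driver'   # expect: systemd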

Install cri-dockerd
cri-dockerd release packages: https://github.com/Mirantis/cri-dockerd/tags
Download the required version and unpack it

tar -zxf cri-dockerd-0.3.15.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/bin/

Create the systemd service unit for cri-docker (note the quoted 'EOF', so that $MAINPID is written literally instead of being expanded by the shell)

cat > /usr/lib/systemd/system/cri-docker.service << 'EOF'
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
 
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
 
StartLimitBurst=3
 
StartLimitInterval=60s
 
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
 
TasksMax=infinity
Delegate=yes
KillMode=process
 
[Install]
WantedBy=multi-user.target
EOF

Create the systemd socket unit for cri-docker

cat > /usr/lib/systemd/system/cri-docker.socket << EOF
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
 
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
 
[Install]
WantedBy=sockets.target
EOF

Start the cri-docker service

systemctl daemon-reload
systemctl enable cri-docker --now
# check status
systemctl is-active cri-docker

Install kubeadm, kubelet, and kubectl

Nodes to operate on: all nodes
List the available versions; add grep to narrow the search to a specific patch release

[root@master ~]# yum list kubeadm --showduplicates |sort -r |grep 1.30.14
kubeadm.x86_64             1.30.14-150500.1.1             kubernetes
kubeadm.src                1.30.14-150500.1.1             kubernetes
kubeadm.s390x              1.30.14-150500.1.1             kubernetes
kubeadm.ppc64le            1.30.14-150500.1.1             kubernetes
kubeadm.aarch64            1.30.14-150500.1.1             kubernetes

Install the kubeadm, kubelet, and kubectl tools. Only the control-plane node needs all three; worker nodes do not need kubectl.
Control-plane node:

dnf install -y kubeadm kubelet kubectl

Worker nodes:

dnf install -y kubeadm kubelet

Note: to install a specific patch release, append the full version to the package name, e.g. dnf install -y kubeadm-1.30.14-150500.1.1; the complete version string is required.
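
On every node, also enable kubelet so it starts at boot. Until kubeadm init/join writes its configuration, kubelet will restart in a loop, which is expected:

systemctl enable --now kubelet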

Pre-pull images (optional)

Nodes to operate on: all nodes
Some images may fail to pull during deployment, so pre-pull them onto every node.
List the images required by v1.30.14

[root@master ~]# kubeadm config images list --kubernetes-version=v1.30.14
registry.k8s.io/kube-apiserver:v1.30.14
registry.k8s.io/kube-controller-manager:v1.30.14
registry.k8s.io/kube-scheduler:v1.30.14
registry.k8s.io/kube-proxy:v1.30.14
registry.k8s.io/coredns/coredns:v1.11.3
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.15-0

Pulling with kubeadm (option one)

Pull the images with the kubeadm tool, overriding the image repository for faster pulls

kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers

Note: if a kubeadm cluster configuration file exists, kubeadm config images pull --config kubeadm.yaml pulls the required images from the repository specified in that file.

Pulling with a script (option two)

Alternatively, pull with a script; k8s-images-pull.sh is shown below. Note that nerdctl must pull into the k8s.io containerd namespace, which is where kubelet looks for images.

#!/bin/bash
images=(
kube-apiserver:v1.30.14
kube-controller-manager:v1.30.14
kube-scheduler:v1.30.14
kube-proxy:v1.30.14
pause:3.9
etcd:3.5.15-0
coredns:v1.11.3
)
for imagename in "${images[@]}" ; do
  # pull into the k8s.io namespace so the kubelet/CRI can see the images
  nerdctl -n k8s.io pull registry.aliyuncs.com/google_containers/$imagename
done

Make the script executable and run it

chmod +x k8s-images-pull.sh
./k8s-images-pull.sh
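
Verify the images landed in the namespace containerd serves to Kubernetes:

nerdctl -n k8s.io images | grep google_containers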

Initialize the control-plane node (choose one method)

Nodes to operate on: control-plane (master) node

Command-line initialization (option one)

Initialize the cluster with the kubeadm command

kubeadm init \
--apiserver-advertise-address=192.168.101.201 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.30.14 \
--service-cidr=172.16.0.0/12 \
--pod-network-cidr=10.10.0.0/16

Configuration-file initialization (option two)

The default configuration used for cluster initialization can be printed on the master node with:

kubeadm config print init-defaults --component-configs KubeletConfiguration > kubeadm.yaml

Adjust the configuration as needed, for example: imageRepository for the image source, networking.podSubnet for the pod network segment, kube-proxy mode set to ipvs, and so on.
The kubeadm.yaml file contents:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
  - groups:
      - system:bootstrappers:kubeadm:default-node-token
    token: abcdef.0123456789abcdef
    ttl: 24h0m0s
    usages:
      - signing
      - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.101.201 # master node internal IP
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock # containerd's Unix socket address
  imagePullPolicy: IfNotPresent
  name: k8s-master
  taints: # taint the control-plane node so ordinary workloads are not scheduled on it
    - effect: 'NoSchedule'
      key: 'node-role.kubernetes.io/control-plane'

---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy proxy mode

---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers  # image repository
kind: ClusterConfiguration
kubernetesVersion: 1.30.14
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12  # service subnet
  podSubnet: 10.244.0.0/16     # pod subnet
scheduler: {}

---
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
clusterDNS:
  - 10.96.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
cgroupDriver: systemd   # cgroup driver
logging: {}
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
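
With the file adjusted, run the initialization against it:

kubeadm init --config kubeadm.yaml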

After successful initialization

A successful init ends with output like the following:

......
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.101.201:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ca0c87226c69309d7779096c15b6a41e14b077baf4650bfdb6f9d3178d4da645

Configure the kubectl client on the master node, using the commands from the output above.

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

As the output suggests, worker nodes join the cluster with the kubeadm join 192.168.101.201:6443 --token ... command.

Join worker nodes to the cluster

Nodes to operate on: worker nodes
Run the join command that was generated on the master

kubeadm join 192.168.101.201:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:ca0c87226c69309d7779096c15b6a41e14b077baf4650bfdb6f9d3178d4da645
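
Two common follow-ups, sketched below: workers running docker + cri-dockerd (option two above) must tell kubeadm which CRI socket to use, and the bootstrap token expires after 24 hours, after which a fresh join command must be generated on the master.

# on docker/cri-dockerd workers, append to the join command:
#   --cri-socket unix:///run/cri-dockerd.sock
# regenerate the full join command on the master when the token has expired:
kubeadm token create --print-join-command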

Install the network plugin

Nodes to operate on: the server where kubectl is configured (the master node)
Install the flannel network plugin

[root@master ~]# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# if any node has multiple NICs, specify the internal NIC in the manifest
# find the DaemonSet named kube-flannel-ds; the setting goes under the kube-flannel container
[root@master ~]# vi kube-flannel.yml
......
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.15.0
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth0  # on multi-NIC hosts, name the internal NIC

Alternatively, use the following manifest. (The PodSecurityPolicy object and its RBAC rule found in older flannel manifests are omitted here, because the PSP API was removed in Kubernetes v1.25 and would fail to apply on v1.30.) kube-flannel.yml contents:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: rancher/mirrored-flannelcni-flannel:v0.18.1
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate

Finally, deploy the plugin with kubectl apply

kubectl apply -f kube-flannel.yml
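
Watch the flannel pods roll out across the nodes:

kubectl -n kube-system get pods -l app=flannel -w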

Test the cluster

Run the access test from the server where the kubectl client is installed; since kubectl was configured on the master node, test from there.
Check the cluster state:

[root@master ~]# kubectl get nodes
NAME           STATUS       ROLES                  AGE     VERSION
k8s-master     Ready        control-plane,master   2m35s   v1.30.14
k8s-node-1     Ready        <none>                 45s     v1.30.14
k8s-node-2     Ready        <none>                 12s     v1.30.14
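
As a final smoke test (a hypothetical nginx deployment, not part of the cluster setup itself), confirm that pods schedule and a service answers:

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pods,svc -o wide
# then curl any node on the NodePort shown, e.g.:
# curl http://192.168.101.201:<nodeport>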