This Website Runs on a Raspberry Pi Cluster with Fedora + K8s + CRI-O

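I've successfully moved the entire website onto my own Raspberry Pi cluster, and now it feels like the data is (physically) in my own hands.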

A cluster of four Raspberry Pi 4 (4 GB) boards

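I built the rack myself and added a few switches (the four small ones at the bottom) so each board's power can be cut easily. Each switch also has a resettable fuse to guard against accidental short circuits. The switches add a sense of ceremony too: flipping them on one by one when starting the cluster makes me feel like I'm starting up a nuclear reactor 😁. That said, the cluster stays on 24/7, so the switches rarely get used. The resettable fuses did earn their keep once: I accidentally shorted something while plugging components into a breadboard, and the power cut off immediately.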

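The setup mostly follows the official K8s documentation: kubeadm bootstraps the cluster, with Calico as the CNI. HAProxy runs on the router at the top left, reaches the public internet through an autossh tunnel, and reverse-proxies to the Nginx Ingress. Ceph provides the storage (I already had a separate Ceph cluster). (Yes, running K8s on that Ceph cluster would make more sense since it performs better, but I just find it more fun to have Raspberry Pis running next to the desk I work at every day...)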
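To illustrate the tunnel part, a reverse SSH tunnel could be kept alive with a small systemd unit, written here as Ansible tasks in the same style as the playbook further down. This is only a sketch under assumptions: the host group `router`, the account and host `tunnel@public.example.com`, the unit name, the autossh path and both port numbers are made-up placeholders, and the router is assumed to run systemd.

```yaml
# Sketch only: host group, user, hostname, paths and ports are hypothetical.
- hosts: router
  tasks:
  - name: Install a systemd unit that keeps a reverse SSH tunnel open
    copy:
      dest: /etc/systemd/system/autossh-tunnel.service
      content: |
        [Unit]
        Description=Reverse SSH tunnel to the public host (autossh)
        After=network-online.target
        Wants=network-online.target

        [Service]
        # Keep retrying even if the very first connection attempt fails
        Environment=AUTOSSH_GATETIME=0
        # -M 0: no autossh monitor port, rely on SSH keepalives instead
        # -N:   no remote command, port forwarding only
        # -R:   port 8443 on the public host -> local HAProxy on port 443
        ExecStart=/usr/bin/autossh -M 0 -N \
          -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
          -o ExitOnForwardFailure=yes \
          -R 8443:127.0.0.1:443 tunnel@public.example.com
        Restart=always

        [Install]
        WantedBy=multi-user.target

  - name: Enable and start the tunnel
    systemd:
      name: autossh-tunnel
      daemon_reload: yes
      enabled: yes
      state: started
```

Note that sshd binds remote forwards to loopback by default, so the public host would still need `GatewayPorts` enabled or a proxy of its own in front of the forwarded port.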

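The kernel has to be compiled by hand and the boot setup configured manually, following the official Raspberry Pi documentation; booting through U-Boot doesn't work for now. Gentoo has a very good write-up on how to boot a 64-bit kernel. I ran into plenty of other pitfalls as well, one being that the K8s packages in Fedora are far too old. I'm helping the maintainer update the package and cut a new release; the PR has been submitted, so it's coming Soon™.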

Until then, I've set up my own COPR: https://copr.fedorainfracloud.org/coprs/kasong/kubernetes/

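Most of the other pitfalls are covered in this playbook. Use it as a reference, but please don't run it as-is; the IP addresses and domain names are specific to my network: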

- hosts: all
  tasks:
  - name: Enable crio 1.17 dnf module
    command: "dnf module enable cri-o:1.17/default -y"

  - name: Basic K8s package
    yum:
      name: "{{ packages }}"
      state: installed
    vars:
      packages:
      - kubernetes-node
      - kubernetes-client
      - kubernetes-kubeadm
      - kubernetes-master
      - cri-tools
      - crio

  - name: Ensure overlayfs, br_netfilter is loaded
    lineinfile:
      path: "/etc/modules-load.d/kubernetes-crio.conf"
      line: "{{ item }}"
      create: yes
    with_items:
      - br_netfilter
      - overlay

  - name: Enable Kluster firewall zone
    firewalld:
      zone: pi-cluster
      state: present
      permanent: yes

  - name: Enable 192.168.2.0/24 to Kluster zone
    firewalld:
      source: 192.168.2.0/24
      zone: pi-cluster
      state: enabled
      permanent: yes

  - name: Enable Intra firewall zone
    firewalld:
      zone: intra-intra-net
      state: present
      permanent: yes

  - name: Enable 192.168.12.0/24 to Intra zone
    firewalld:
      source: 192.168.12.0/24
      zone: intra-intra-net
      state: enabled
      permanent: yes

  - name: Enable 192.168.0.0/24 to Intra zone
    firewalld:
      source: 192.168.0.0/24
      zone: intra-intra-net
      state: enabled
      permanent: yes

  - name: Enable K8s API port to Intra
    firewalld:
      port: 6443/tcp
      permanent: yes
      zone: intra-intra-net
      state: enabled

  - name: Enable SSH service to Intra
    firewalld:
      service: ssh
      permanent: yes
      zone: intra-intra-net
      state: enabled

  - name: Enable NodePort Services to Intra
    firewalld:
      port: 30000-32767/tcp
      permanent: yes
      zone: intra-intra-net
      state: enabled

  - name: Enable crio
    service:
      name: crio
      enabled: yes
      state: started

  - name: Enable kubelet
    service:
      name: kubelet
      enabled: yes

#  - name: Disable ZRAM
#    service:
#      name: zram-swap
#      enabled: no
#      state: stopped

  - name: Enable K8s API port
    firewalld:
      port: 6443/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable etcd ports
    firewalld:
      port: 2379-2380/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable mdns port
    firewalld:
      service: mdns
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable kubelet port
    firewalld:
      port: 10250/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable kube-scheduler port
    firewalld:
      port: 10251/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable kube-controller-manager port
    firewalld:
      port: 10252/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable NodePort Services port
    firewalld:
      port: 30000-32767/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: Enable BGP ports
    firewalld:
      port: 179/tcp
      permanent: yes
      zone: pi-cluster
      state: enabled

  - name: QUIRK Firewalld disable nft
    lineinfile:
      dest: "/etc/firewalld/firewalld.conf"
      regexp: '^FirewallBackend=.*'
      line: 'FirewallBackend=iptables'

  - name: Start/Restart Firewalld
    service:
      name: firewalld
      state: restarted

  - name: QUIRK Remove invalid Kubelet config option
    replace:
      path: /etc/systemd/system/kubelet.service.d/kubeadm.conf
      regexp: '^(.+)--allow-privileged=([^ "]*)(.*)'
      replace: '\1\3'

  - name: QUIRK Kubelet don't depend on docker
    lineinfile:
      dest: "/usr/lib/systemd/system/kubelet.service"
      regexp: '^(Requires=docker.service)$'
      line: '# \1'
      backrefs: yes

  - name: Tell NetworkManager not to control calico interface
    blockinfile:
      create: yes
      path: /etc/NetworkManager/conf.d/calico.conf
      block: |
        [keyfile]
        unmanaged-devices=interface-name:cali*;interface-name:tunl*

  - name: Start/Restart NetworkManager
    service:
      name: NetworkManager
      state: restarted

  - name: K8s kernel modules
    blockinfile:
      create: yes
      path: /etc/modules-load.d/kubernetes-crio.conf
      block: |
        br_netfilter
        overlay

  - name: Calico kernel sysctl
    blockinfile:
      create: yes
      path: /etc/sysctl.d/99-calico.conf
      block: |
        net.ipv4.conf.all.rp_filter = 1

  - name: K8s kernel sysctl
    blockinfile:
      create: yes
      path: /etc/sysctl.d/99-kubernetes-cri.conf
      block: |
        net.bridge.bridge-nf-call-iptables  = 1
        net.ipv4.ip_forward                 = 1
        net.bridge.bridge-nf-call-ip6tables = 1

  - name: Update Kubelet config
    lineinfile:
      dest: "/etc/systemd/system/kubelet.service.d/kubeadm.conf"
      line: 'Environment="KUBELET_CRIO_ARGS=--container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock"'
      insertbefore: "ExecStart=.*"

  - name: Update Kubelet config
    lineinfile:
      dest: "/etc/systemd/system/kubelet.service.d/kubeadm.conf"
      regexp: '^ExecStart=(((?!\$KUBELET_CRIO_ARGS).)+)$'
      line: 'ExecStart=\1 $KUBELET_CRIO_ARGS'
      backrefs: yes

  - name: Update CRIO config for insecure repo
    ini_file:
      dest: "/etc/crio/crio.conf"
      section: "crio.image"
      option: 'insecure_registries'
      value: '[ "registry.intra.intra-net.com" ]'

  - name: Update common container config for insecure repo
    ini_file:
      dest: "/etc/containers/registries.conf"
      section: "registries.insecure"
      option: 'registries'
      value: '[ "registry.intra.intra-net.com" ]'

  - name: Ensure /opt/cni exists for symlink creation in next step
    file:
      path: "/opt/cni"
      state: directory

  - name: Create symbolic link for CNI Plugin
    file:
      src: "/usr/libexec/cni"
      dest: "/opt/cni/bin"
      state: link

  - name: Print the next manual step
    debug:
      msg: "Now run 'sudo kubeadm init --cri-socket /var/run/crio/crio.sock --pod-network-cidr=10.24.0.0/16'"

# Misc TODO: Apply a Docker mirror
#
# Misc TODO: Apply following config for calico yaml
#  - name: IP
#    value: "autodetect"
#  - name: IP_AUTODETECTION_METHOD
#    value: "interface=eth.*"
#  - name: IP6_AUTODETECTION_METHOD
#    value: "interface=eth.*"
#
# Misc TODO: Black list vc4 if kernel is failing
# echo blacklist vc4 > /etc/modprobe.d/blacklist-vc4.conf
#
# Misc TODO: Still need old fashion cgroup
# cgroup_enable=memory systemd.unified_cgroup_hierarchy=0
#
# Misc TODO: Ensure following kernel configs are enabled
# CONFIG_NETFILTER_XT_MATCH_CGROUP
# CONFIG_F2FS_FS_SECURITY
# CONFIG_CFS_BANDWIDTH
# CONFIG_BLK_DEV_RBD
# CONFIG_BRIDGE_NETFILTER
#
# Misc TODO: Tune etcd
# - name: ETCD_MAX_WALS
#   value: "5"
# - name: ETCD_HEARTBEAT_INTERVAL
#   value: "500"
# - name: ETCD_ELECTION_TIMEOUT
#   value: "10000"
# - name: ETCD_SNAPSHOT_COUNT
#   value: "5000"
# - name: ETCD_LOG_LEVEL
#   value: "error"

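As you can see, there's still a pile of TODOs that haven't been turned into playbook tasks; I'll deal with them the next time I need to rebuild the cluster XD. I'll try to fix many of these pitfalls at the packaging stage, so parts of the playbook will eventually no longer be needed.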
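As a rough idea of what two of those TODOs might look like as tasks, here is a minimal sketch. It assumes Calico was installed from the stock manifest (so a `calico-node` DaemonSet exists in `kube-system`) and that `kubectl` is already configured on the first host of the play; neither is set up by the playbook itself.

```yaml
# Sketch only; see the assumptions stated above.
- hosts: all
  tasks:
  # TODO "Black list vc4 if kernel is failing"
  - name: Blacklist the vc4 kernel module
    copy:
      dest: /etc/modprobe.d/blacklist-vc4.conf
      content: |
        blacklist vc4

  # TODO "Apply following config for calico yaml"
  # Assumes a calico-node DaemonSet in kube-system and a working kubectl
  # on the host this runs on (run_once picks the first host in the play).
  - name: Set Calico IP autodetection
    command:
      argv:
      - kubectl
      - set
      - env
      - daemonset/calico-node
      - --namespace=kube-system
      - IP=autodetect
      - IP_AUTODETECTION_METHOD=interface=eth.*
      - IP6_AUTODETECTION_METHOD=interface=eth.*
    run_once: yes
```

The cgroup kernel arguments and the etcd tuning are left out here, since where they belong depends on how the board boots and on how kubeadm laid out etcd.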