This how-to is based on this guide: smallab-k8s-pve-guide (fork as backup), combined with multiple other online sources. Some deviations are: Debian 12 instead of 11, Proxmox 8 instead of 7, multiple Proxmox nodes with SDN instead of a single node, and the addition of (shared, persistent) NFS storage on an external TrueNAS machine.
For more information on why and how I came to this particular setup, read the article on my blog: https://blog.joeplaa.com/highly-available-kubernetes-cluster-on-proxmox.
This how-to describes the main differences with smallab-k8s-pve-guide (fork as backup).
- Linux Bridge for the VM network.
- VM machine type `q35` and CPU flags `md-clear`, `pcid`, `ssbd`, `pdpe1gb` and `aes` (all supported on my Ivy Bridge and Skylake CPUs).
- In the Debian apt sources, `non-free` should be changed to `non-free non-free-firmware` (see the example line below).
- My servers (masters) are named `k3smaster0x` and my agents (workers) `k3sworker0x`.
- The k3s configuration goes in `/etc/rancher/k3s/config.yaml`.
- k3s version `v1.28.3+k3s2`.
- I installed `kubectl` with Snap on my Ubuntu machine: `sudo snap install kubectl`.
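Since that apt change is specific to Debian 12, here is what such a line in `/etc/apt/sources.list` can look like (the mirror and component list are examples, adjust to your own setup):

```
deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
```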
This how-to starts at step G017 of that guide.
In `Datacenter --> SDN --> Zones` click the `Add` button to create a `Simple` zone:

- ID: `k8s`
- MTU: `1460`
- Nodes: select the nodes that run the VMs
- IPAM: `pve`
- automatic DHCP: check the box
In `Datacenter --> SDN --> VNets` click the `Create` button in the VNets section on the left-hand side:

- Name: `k8s`
- Alias: `k8s_internal`
- Zone: `k8s`
- Tag: leave empty
- VLAN Aware: leave unchecked

Then click the `Create` button in the Subnets section on the right-hand side:

- Subnet: `10.33.70.1/24`
- Gateway: `10.33.70.1`
- SNAT: leave unchecked
- DNS Zone Prefix: leave empty
- On the `DHCP Ranges` tab, add a range of `10.33.70.200` to `10.33.70.249`
In `Datacenter --> SDN` hit the `Apply` button.

Create 6 virtual machines (3 master nodes and 3 worker nodes) based on the template created in step G024:
- Servers (masters): `k3smaster0x`, agents (workers): `k3sworker0x`
- IP addresses: `10.33.50.51-10.33.50.53` (net0) and `10.33.70.51-10.33.70.53` (net1) for the masters, `10.33.50.61-10.33.50.63` (net0) and `10.33.70.61-10.33.70.63` (net1) for the workers
- VM IDs: `505x` and `506x`
Customize the hostname on each machine:

```shell
sudo hostnamectl set-hostname k3smaster0x
sudo nano /etc/hosts
```
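In `/etc/hosts`, update the `127.0.1.1` entry to match the new hostname. On `k3smaster01` the relevant lines would look something like this (the rest of the file stays as the template left it):

```
127.0.0.1       localhost
127.0.1.1       k3smaster01
```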
Assign IP addresses on each machine:

- The `net0` address is assigned through DHCP using a static mapping in pfSense.
- The `net1` address is assigned statically in `/etc/network/interfaces`.
```shell
sudo nano /etc/network/interfaces
```

```
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug enp6s18
iface enp6s18 inet dhcp

# Secondary interface for k8s communication
allow-hotplug enp6s19
iface enp6s19 inet static
    address 10.33.70.51
    netmask 255.255.255.0
```
Apply the network changes:

```shell
sudo ifdown enp6s19
sudo ifup enp6s19
```

If `ifdown` returns `RTNETLINK answers: Cannot assign requested address`, the interface simply had no address configured yet; the message can be ignored.
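To verify the static address was applied (a quick check, not a step from the guide):

```shell
ip -br addr show enp6s19
```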
Install k3s part 1: Create config files
Follow the guide, but edit the `config.yaml` files:
First master:

```yaml
# k3smaster01
cluster-domain: "cluster.joeplaa.local"
tls-san:
  - "k3smaster01.joeplaa.cloud"
  - "k3smaster01"
flannel-backend: host-gw
flannel-iface: "enp6s19"
bind-address: "0.0.0.0"
https-listen-port: 6443
advertise-address: "10.33.70.51"
advertise-port: 6443
node-ip: "10.33.70.51"
node-external-ip: "10.33.50.51"
node-taint:
  - "k3s-controlplane=true:NoExecute"
log: "/var/log/k3s.log"
kubelet-arg: "config=/etc/rancher/k3s/kubelet.config"
disable:
  - metrics-server
  - servicelb
  - traefik
protect-kernel-defaults: true
secrets-encryption: true
agent-token: "SomeReallyLongPassword"
cluster-init: true
```
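The `kubelet-arg` line points at `/etc/rancher/k3s/kubelet.config`, which is not shown here. As an illustration only (the exact contents depend on the guide you follow), a minimal kubelet configuration enabling graceful node shutdown could look like:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: 30s
shutdownGracePeriodCriticalPods: 10s
```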
Additional masters:

```yaml
# k3smaster0x
cluster-domain: "cluster.joeplaa.local"
tls-san:
  - "k3smaster0x.joeplaa.cloud"
  - "k3smaster0x"
flannel-backend: host-gw
flannel-iface: "enp6s19"
bind-address: "0.0.0.0"
https-listen-port: 6443
advertise-address: "10.33.70.5x"
advertise-port: 6443
node-ip: "10.33.70.5x"
node-external-ip: "10.33.50.5x"
node-taint:
  - "k3s-controlplane=true:NoExecute"
log: "/var/log/k3s.log"
kubelet-arg: "config=/etc/rancher/k3s/kubelet.config"
disable:
  - metrics-server
  - servicelb
  - traefik
protect-kernel-defaults: true
secrets-encryption: true
agent-token: "SamePasswordAsInTheFirstServer"
server: "https://10.33.70.51:6443"
token: "K10<sha256 sum of cluster CA certificate>::server:<password>"
```
You can get the token by running `sudo cat /var/lib/rancher/k3s/server/token` on the first master.
Workers:

```yaml
# k3sworker0x
flannel-iface: "enp6s19"
node-ip: "10.33.70.6x"
node-external-ip: "10.33.50.6x"
server: "https://10.33.70.51:6443"
token: "K10<sha256 sum of server node CA certificate>::node:<SomeReallyLongPassword>"
log: "/var/log/k3s.log"
kubelet-arg: "config=/etc/rancher/k3s/kubelet.config"
protect-kernel-defaults: true
```
You can get the sha256 sum by running `sudo sha256sum /var/lib/rancher/k3s/server/tls/server-ca.crt` on the first master. `SomeReallyLongPassword` is the same as used in the masters.
Install k3s part 2. Install on first master
Follow the guide, but edit the command to use the latest k3s version:

```shell
wget -qO - https://get.k3s.io | INSTALL_K3S_VERSION="v1.28.4+k3s2" sh -s - server
```
Create a local config file:

```shell
nano ~/.kube/config
```

Copy and paste the content of `/etc/rancher/k3s/k3s.yaml` on the first master into `~/.kube/config`:

```shell
sudo cat /etc/rancher/k3s/k3s.yaml
```
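Note that `k3s.yaml` points the `server` field at `https://127.0.0.1:6443`. In your local copy, change that to an address of the first master that is reachable from your machine, for example (assuming the net0 address):

```shell
sed -i 's/127.0.0.1/10.33.50.51/' ~/.kube/config
```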
Install k3s part 3. Install on additional masters and workers
On your local machine, watch your cluster:

```shell
watch -n 1 kubectl get nodes
```

Add the additional masters one by one:

```shell
wget -qO - https://get.k3s.io | INSTALL_K3S_VERSION="v1.28.4+k3s2" sh -s - server
```

Add the workers one by one:

```shell
wget -qO - https://get.k3s.io | INSTALL_K3S_VERSION="v1.28.4+k3s2" sh -s - agent
```
After a while, the cluster should look something like this (obviously the age will be minutes/seconds, not days; I just happened to document this 4-5 days later):

```shellsession
NAME          STATUS   ROLES                       AGE     VERSION
k3smaster01   Ready    control-plane,etcd,master   4d17h   v1.28.3+k3s2
k3smaster02   Ready    control-plane,etcd,master   4d16h   v1.28.3+k3s2
k3smaster03   Ready    control-plane,etcd,master   4d17h   v1.28.3+k3s2
k3sworker01   Ready    <none>                      4d15h   v1.28.3+k3s2
k3sworker02   Ready    <none>                      4d16h   v1.28.3+k3s2
k3sworker03   Ready    <none>                      4d16h   v1.28.3+k3s2
```
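To double-check that the internal (10.33.70.x) and external (10.33.50.x) addresses were picked up as intended, the wide output is useful:

```shell
kubectl get nodes -o wide
```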
To make our lives easier, we can use Ansible to run commands on multiple machines automatically. For example, instead of logging in to each of the 6 machines and running `sudo apt update` 6 times, we can use one single `ansible` command.

Create `hosts.ini` containing the hostnames of your cluster machines and the user you use to log into those machines:
```ini
[master]
k3smaster01
k3smaster02
k3smaster03

[master:vars]
ansible_user=ansible
ansible_ssh_private_key_file=~/.ssh/jpl-k3s

[worker]
k3sworker01
k3sworker02
k3sworker03

[worker:vars]
ansible_user=ansible
ansible_ssh_private_key_file=~/.ssh/jpl-k3s

[k3s_cluster:children]
master
worker
```
Create `ansible.cfg`:

```ini
[defaults]
inventory = hosts.ini
```
Create `secrets.txt`:

```
yourSudoPasswordForAnsibleUser
```
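Since this file contains a plain-text password, it is worth restricting its permissions (my own habit, not a step from the guide):

```shell
chmod 600 secrets.txt
```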
Install Ansible (Installing and upgrading Ansible with pip):

```shell
python3 -m pip install --user ansible
python3 -m pip install --user ansible-core
```

Upgrade:

```shell
python3 -m pip install --upgrade --user ansible
```
Test access by pinging the machines:

```shell
ansible all -m ping -v
```

Test sudo access by updating the apt cache on all machines:

```shell
ansible all -m apt -a "update_cache=yes cache_valid_time=86400" --become --become-password-file=secrets.txt
```
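The same pattern works for other maintenance tasks. For example, a sketch of upgrading all packages on every node with Ansible's `apt` module (not a step from the guide):

```shell
ansible all -m apt -a "upgrade=dist" --become --become-password-file=secrets.txt
```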
For MetalLB, create `values.yaml`:

```yaml
controller:
  image:
    repository: sonatype.jodibooks.com/metallb/controller
```
Run the Helm commands:

```shell
helm repo add metallb https://metallb.github.io/metallb
helm repo update
helm install --create-namespace -n metallb-system metallb metallb/metallb -f values.yaml
```
Create `resources/ipaddresspool.yaml`:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.33.50.80-10.33.50.99
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: l2-advert
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```
Check if the `metallb-controller-xxxxxxxx` pod is up and running:

```shell
kubectl get pods -n metallb-system
```

Apply the resources:

```shell
kubectl apply -f resources/ipaddresspool.yaml
```
If you get a `context deadline exceeded` error, run `kubectl delete validatingwebhookconfigurations.admissionregistration.k8s.io metallb-webhook-configuration` and try again. Source: https://github.com/metallb/metallb/issues/1597#issuecomment-1571473129
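To confirm MetalLB hands out addresses from the pool, you can expose a throwaway deployment and check that its service gets an external IP in the 10.33.50.80-99 range (a quick test of my own, not part of the guide; the `lb-test` name is arbitrary):

```shell
kubectl create deployment lb-test --image=nginx
kubectl expose deployment lb-test --type=LoadBalancer --port=80
kubectl get svc lb-test
# clean up afterwards
kubectl delete svc,deployment lb-test
```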
For metrics-server, create `values.yaml`:

```yaml
image:
  repository: sonatype.jodibooks.com/metrics-server/metrics-server
replicas: 2
tolerations:
  - key: "CriticalAddonsOnly"
    operator: "Exists"
  - key: "node-role.kubernetes.io/control-plane"
    operator: "Exists"
    effect: "NoSchedule"
  - key: "node-role.kubernetes.io/master"
    operator: "Exists"
    effect: "NoSchedule"
```
Run the Helm commands:

```shell
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm repo update
helm install -n kube-system metrics-server metrics-server/metrics-server -f values.yaml
```
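Once the metrics-server pods are running, resource metrics should become available after a minute or so:

```shell
kubectl top nodes
kubectl top pods -A
```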
Provision datasets and NFS shares on TrueNAS:

- Change the owner to `nobody:nogroup`.
- Change the permissions to `777` (I don't like this, but I couldn't get it to work otherwise and I'm not the only one: https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/issues?q=permission+denied, https://stackoverflow.com/questions/50156124/kubernetes-nfs-persistent-volumes-permission-denied).
Install the `nfs-common` dependency on the worker nodes:

```shell
ansible worker -m apt -a "update_cache=yes cache_valid_time=86400" --become --become-password-file=secrets.txt
ansible worker -m apt -a "name=nfs-common state=present" --become --become-password-file=secrets.txt
```
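As a quick sanity check you can list the NFS exports from the workers; `showmount` comes with `nfs-common` (this assumes the TrueNAS machine is reachable at 10.33.50.10, as used in the values files below):

```shell
ansible worker -m command -a "showmount -e 10.33.50.10"
```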
Create the `values-ssd.yaml` file:

```yaml
nfs:
  server: 10.33.50.10
storageClass:
  name: nfs-client-ssd
  reclaimPolicy: Retain
  provisionerName: k8s-sigs.io/nfs-subdir-external-provisioner-ssd
  accessModes: ReadWriteMany
```
Create the `values-hdd.yaml` file:

```yaml
nfs:
  server: 10.33.50.10
storageClass:
  name: nfs-client-hdd
  reclaimPolicy: Retain
  provisionerName: k8s-sigs.io/nfs-subdir-external-provisioner-hdd
  accessModes: ReadWriteMany
```
Add the Helm repository:

```shell
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
helm repo update
```
Install the NFS provisioners; this creates the storage classes:

```shell
helm install nfs-subdir-external-provisioner-ssd nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --create-namespace \
  --namespace nfs-provisioner \
  -f values-ssd.yaml
helm install nfs-subdir-external-provisioner-hdd nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --namespace nfs-provisioner \
  -f values-hdd.yaml
```
Check the created storage classes:

```shell
kubectl get sc
```

```shellsession
NAME                   PROVISIONER                                       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path                             Delete          WaitForFirstConsumer   false                  4d18h
nfs-client-hdd         k8s-sigs.io/nfs-subdir-external-provisioner-hdd   Retain          Immediate              true                   2d17h
nfs-client-ssd         k8s-sigs.io/nfs-subdir-external-provisioner-ssd   Retain          Immediate              true                   2d17h
```
Test the NFS volume claims. Create 2 PVCs (PersistentVolumeClaims), `test-claim-ssd.yaml` and `test-claim-hdd.yaml`:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim-ssd
spec:
  storageClassName: nfs-client-ssd
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
```

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim-hdd
spec:
  storageClassName: nfs-client-hdd
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
```
Create 2 pods, `test-pod-ssd.yaml` and `test-pod-hdd.yaml`:

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: test-pod-ssd
spec:
  containers:
    - name: test-pod-ssd
      image: busybox:stable
      command:
        - "/bin/sh"
      args:
        - "-c"
        - "touch /mnt/SUCCESS && exit 0 || exit 1"
      volumeMounts:
        - name: nfs-pvc
          mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: nfs-pvc
      persistentVolumeClaim:
        claimName: test-claim-ssd
```

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: test-pod-hdd
spec:
  containers:
    - name: test-pod-hdd
      image: busybox:stable
      command:
        - "/bin/sh"
      args:
        - "-c"
        - "touch /mnt/SUCCESS && exit 0 || exit 1"
      volumeMounts:
        - name: nfs-pvc
          mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: nfs-pvc
      persistentVolumeClaim:
        claimName: test-claim-hdd
```
Create the claims and pods:

```shell
kubectl create -f test-claim-ssd.yaml -f test-pod-ssd.yaml
kubectl create -f test-claim-hdd.yaml -f test-pod-hdd.yaml
```
Check the claims:

```shell
kubectl get pvc
```

```shellsession
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
test-claim-hdd   Bound    pvc-2acdf737-7384-405e-882d-e6f3d4bea946   1Mi        RWX            nfs-client-hdd   12m
test-claim-ssd   Bound    pvc-5d71d148-5864-49bd-8516-0c3b1177c633   1Mi        RWX            nfs-client-ssd   12m
```
Check your storage; a file `SUCCESS` should have been created:
```shellsession
root@JPL-TRUENAS:/mnt/pool1/k8s/default-test-claim-ssd-pvc-5d71d148-5864-49bd-8516-0c3b1177c633# ls -l
total 1
-rw-r--r-- 1 nobody nogroup 0 Dec 9 10:22 SUCCESS
```
Clean up the test claims and pods:

```shellsession
kubectl delete -f test-claim-ssd.yaml -f test-pod-ssd.yaml
kubectl delete -f test-claim-hdd.yaml -f test-pod-hdd.yaml
```
Because the storage classes use `reclaimPolicy: Retain`, the data shouldn't have been deleted.

k3s data is stored on the main OS drive in `/var/lib/rancher`. This folder can grow quickly with container volumes and images. As this is ephemeral storage, it doesn't need to be backed up together with the OS. Therefore we are going to move this storage to a dedicated virtual disk, which we exclude from the backup in Proxmox.
Format the disks to `ext4`. First get the name with `lsblk`; probably `sdb` and `sdc`. Then format to `ext4`:

```shell
lsblk
sudo mkfs -t ext4 /dev/sdb
```
Find the UUID:

```shell
lsblk -f
```

Create the mount directory:

```shell
sudo mkdir /mnt/k3sdata
```

Add an entry in `/etc/fstab`:

```
UUID=5f6b5c32-7c35-4efa-93c0-8e6b7dec83d9 /mnt/k3sdata ext4 defaults 0 0
```
Mount the disk:

```shell
sudo systemctl daemon-reload
sudo mount -a
```
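A quick check that the new disk is mounted where we expect it:

```shell
df -h /mnt/k3sdata
```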
Stop k3s:

```shell
sudo systemctl stop k3s-agent
/usr/local/bin/k3s-killall.sh
```
Move the data:

```shell
sudo mv /run/k3s/ /mnt/k3sdata/k3s/
sudo mv /var/lib/kubelet/pods/ /mnt/k3sdata/k3s-pods/
sudo mv /var/lib/rancher/ /mnt/k3sdata/k3s-rancher/
```
Create symlinks to the original folders:

```shell
sudo ln -s /mnt/k3sdata/k3s/ /run/k3s
sudo ln -s /mnt/k3sdata/k3s-pods/ /var/lib/kubelet/pods
sudo ln -s /mnt/k3sdata/k3s-rancher/ /var/lib/rancher
```
Restart k3s:

```shell
sudo systemctl start k3s-agent
```
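From your local machine, verify that the node rejoins the cluster and returns to `Ready`:

```shell
kubectl get nodes
```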
I disabled installing the default Traefik ingress controller that comes with k3s. Instead I install it manually, so I have full control over the configuration.
Create `values.yaml`:

```yaml
# custom image repository
image:
  registry: sonatype.jodibooks.com
# enable web dashboard
ports:
  traefik:
    expose: true
# fix the loadbalancer IP address
service:
  externalIPs:
    - 10.33.50.80
```
Add the Helm repo:

```shell
helm repo add traefik https://helm.traefik.io/traefik
helm repo update
```
Install:

```shell
helm install --create-namespace -n traefik-system traefik traefik/traefik -f values.yaml
```
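Check that the Traefik pods are running and that the service is reachable on the expected address:

```shell
kubectl get pods,svc -n traefik-system
```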
Nexus Sonatype for Docker
Nexus Sonatype for registry.k8s.io