Production ready Kubernetes Cluster on Hetzner

2021-02-01 06:00:53

A step-by-step tutorial on how to create a production-ready Kubernetes cluster on the Hetzner infrastructure, based on MicroK8s and Rook

We decided to start using Kubernetes in production and were looking for a solution that is easy to install on Hetzner. Hetzner does not provide a managed Kubernetes cluster, so you need to install it yourself. There is a tool called hetzner-kube that builds a K8s cluster based on kubeadm, but I fell in love with MicroK8s and wanted to create a production-ready Kubernetes cluster based on it.

Hardware overview:

  • A load balancer with a domain and a wildcard domain assigned to it

  • 3 servers (or more)

Software overview:

  • MicroK8s v1.19

  • Nginx ingress controller

  • Cert-manager with Let’s Encrypt Cluster Issuer

  • Rook with ceph v15.2.7

  • Kasten K10

Preconditions

  • a domain

  • installed hcloud with a configured project

  • basic Kubernetes knowledge (not really needed, but welcome)

Create a load balancer

First of all we create a load balancer to have one IP address for the whole cluster.
hcloud load-balancer create --type=lb11 --name=k8s --location=fsn1
hcloud load-balancer add-service k8s --protocol=tcp --listen-port=80 --destination-port=80
hcloud load-balancer add-service k8s --protocol=tcp --listen-port=443 --destination-port=443
hcloud load-balancer add-service k8s --protocol=tcp --listen-port=16443 --destination-port=16443
We also create a label selector so the load balancer will automatically target our servers.
hcloud load-balancer add-target k8s --label-selector=k8s
Now we grab the load balancer IP and create a DNS entry for it (e.g. k8s.example.com) and a wildcard for subdomains (e.g. *.k8s.example.com).
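If you prefer the CLI to the Cloud Console, the IP shows up in the describe output. The Go-template variant below is an assumption about a reasonably recent hcloud version:
hcloud load-balancer describe k8s
# prints just the IPv4 address (assumes hcloud supports Go-template output)
hcloud load-balancer describe k8s -o format='{{.PublicNet.IPv4.IP}}'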

Create a server

We choose the cpx31 server type. It has a good amount of resources at a nice price. We choose Ubuntu as the image so we can use snap packages easily. The k8s=production label is for the load balancer.
hcloud server create \
--image=ubuntu-20.04 \
--type=cpx31 \
--datacenter=fsn1-dc14 \
--ssh-key=<your_ssh_key_name> \
--label=k8s=production \
--name=k8s-1
Grab the server IP from the output:
Waiting for server XXXXX to have started
 ... done
Server XXXXX created
IPv4: XXX.XXX.XXX.XXX

Adjust partition table (optional)

We need to adjust the partition table for Rook: 40 GB for the system and everything else for Rook. This step is optional. You can also attach Volumes to your servers.

Enable rescue

hcloud server enable-rescue --ssh-key=<your_ssh_key_name> k8s-1
hcloud server reboot k8s-1

Ssh into the rescue system

hcloud server ssh k8s-1

Shrink the main partition and create an unformatted one

e2fsck -f /dev/sda1
resize2fs /dev/sda1 40G
# delete partition 1, recreate it at 40G, add partition 2 from the remaining space, set its type, write
printf "d\n1\nn\n1\n\n+40G\nn\n2\n\n\nt\n2\n31\nw\n" | fdisk -B /dev/sda
e2fsck -f /dev/sda1
reboot
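After the reboot, you can SSH back in and double-check the layout. lsblk should show a ~40G root partition and a second, unformatted partition spanning the rest of the disk (assuming the default /dev/sda device name):
hcloud server ssh k8s-1

lsblk /dev/sda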

Update the system (optional)

It’s always a good idea to update the system.
hcloud server ssh k8s-1

apt update && apt upgrade -y
reboot

Install MicroK8s

hcloud server ssh k8s-1

apt install -y snapd
snap install microk8s --classic --channel 1.19

Make MicroK8s aware of the load balancer IP

# make the apiserver certificate include the load balancer IP as an extra SAN
sed -i '/#MOREIPS/a IP.100 = <load_balancer_ip>' /var/snap/microk8s/current/certs/csr.conf.template
microk8s.refresh-certs
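# verify the load balancer IP made it into the apiserver certificate
openssl x509 -noout -text -in /var/snap/microk8s/current/certs/server.crt | grep <load_balancer_ip>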
exit

Install basic addons

microk8s.enable dns:1.1.1.1
microk8s.enable ingress   
microk8s.enable metrics-server
microk8s.enable rbac
microk8s.enable helm3
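A quick sanity check that all addons came up:
microk8s.status --wait-ready
microk8s.kubectl get pods --all-namespaces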

Create two more servers

We need a minimum of 3 servers to have a proper cluster. To avoid repeating the partition adjustments, we will create a snapshot and then create 2 more servers from it.

Create a snapshot as base for other servers

hcloud server create-image --type=snapshot k8s-1
This will take ~10 min, so go have some coffee 😉 Grab the <snapshot_id> from the output.
Image <snapshot_id> created from server YYY

Create more servers

We will use the snapshot and a simple cloud-init script to bootstrap the servers and let them join the cluster. Create a cloud-config.yaml file. Remember to replace <first_node_ip> and <token>.
#cloud-config
runcmd:
  - /snap/bin/microk8s join <first_node_ip>:25000/<token>
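The <token> comes from the first node: running add-node there prints a ready-made join command. This is roughly what to expect (the exact output varies between MicroK8s versions):
# run on k8s-1; prints something like: microk8s join <first_node_ip>:25000/<token>
microk8s.add-node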
You can run these commands in parallel. Note the --user-data-from-file flag, which hands our cloud-config.yaml to cloud-init.
hcloud server create \
--label k8s= \
--image=<snapshot_id> \
--type=cpx31 \
--datacenter=fsn1-dc14 \
--ssh-key=<your_ssh_key_name> \
--user-data-from-file=cloud-config.yaml \
--name=k8s-2

hcloud server create \
--label k8s= \
--image=<snapshot_id> \
--type=cpx31 \
--datacenter=fsn1-dc14 \
--ssh-key=<your_ssh_key_name> \
--user-data-from-file=cloud-config.yaml \
--name=k8s-3
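Once cloud-init has finished on both machines, all three nodes should show up as Ready:
hcloud server ssh k8s-1

microk8s.kubectl get nodes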

Install cert-manager

Deploy cert-manager using Helm.

hcloud server ssh k8s-1

microk8s.kubectl create namespace cert-manager
microk8s.helm3 repo add jetstack https://charts.jetstack.io
microk8s.helm3 repo update
microk8s.helm3 upgrade --install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v1.1.0 \
  --set installCRDs=true \
  --set ingressShim.defaultIssuerName=letsencrypt-prod \
  --set ingressShim.defaultIssuerKind=ClusterIssuer \
  --set ingressShim.defaultIssuerGroup=cert-manager.io

Create a cluster issuer for Let’s Encrypt

Create a cluster-issuer.yaml file.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: info@example.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
microk8s.kubectl create -f cluster-issuer.yaml
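The issuer should report READY=True once it has registered an ACME account with Let’s Encrypt:
microk8s.kubectl get clusterissuer letsencrypt-prod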

Install kubernetes dashboard

hcloud server ssh k8s-1

microk8s.enable dashboard
To be able to access the dashboard, we need to create an ingress resource. Create a dashboard.yaml file.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
  name: dashboard
  namespace: kube-system
spec:
  rules:
  - host: <your_domain_here>
    http:
      paths:
      - backend:
          serviceName: kubernetes-dashboard
          servicePort: 443
        path: /
  tls:
  - hosts:
    - <your_domain_here>
    secretName: dashboard-ingress-cert
microk8s.kubectl create -f dashboard.yaml
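The dashboard asks for a token on login. With RBAC enabled, one common way to get one is from the default service account secret (fine for a quick look; a dedicated service account with proper permissions is preferable in production):
token=$(microk8s.kubectl -n kube-system get secret | grep default-token | cut -d " " -f1)
microk8s.kubectl -n kube-system describe secret $token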

Install Rook

hcloud server ssh k8s-1

Ceph Operator

Install the Ceph Operator using Helm

Because MicroK8s comes as a snap package with a bundled kubelet, we need to tell the Rook operator about it.

We also set enableDiscoveryDaemon=true to enable autodiscovery of hardware changes.

microk8s.helm3 repo add rook-release https://charts.rook.io/release

microk8s.kubectl create namespace rook-ceph

microk8s.helm3 upgrade --install \
--set csi.kubeletDirPath=/var/snap/microk8s/common/var/lib/kubelet/ \
--set enableDiscoveryDaemon=true \
--namespace rook-ceph \
rook-ceph rook-release/rook-ceph
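Give the operator a moment and check that its pods reach Running before creating the cluster:
microk8s.kubectl -n rook-ceph get pods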

Ceph Cluster

Again, because of the snap, we need to adjust some paths.
dataDirHostPath: /var/snap/microk8s/common/var/lib/rook
Additionally, we pin the ceph/ceph image to an older revision, because the current one has a bug and will not discover our partition. Create a rook-cluster.yaml file with the following content.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v15.2.7
    allowUnsupported: false
  dataDirHostPath: /var/snap/microk8s/common/var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: true
    rulesNamespace: rook-ceph
  network:
  crashCollector:
    disable: false
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
    allowUninstallWithVolumes: false
  storage:
    useAllNodes: true
    useAllDevices: true
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
microk8s.kubectl create -f rook-cluster.yaml
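Bootstrapping the Ceph cluster takes several minutes. You can watch the CephCluster resource until it reports HEALTH_OK:
microk8s.kubectl -n rook-ceph get cephcluster rook-ceph -w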

Block Storage Class

This will be the default storage class. Create a storageclass.yaml file with the following content.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-block
   annotations:
     storageclass.kubernetes.io/is-default-class: "true"
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
    clusterID: rook-ceph
    pool: replicapool

    imageFormat: "2"

    imageFeatures: layering

    csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
    csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
    csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
    csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

    csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
For detailed information see https://rook.io/docs/rook/v1.5/ceph-block.html
microk8s.kubectl create -f storageclass.yaml
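To confirm that dynamic provisioning works, you can create a throwaway PVC against the new default class; the name test-pvc is only for illustration. It should reach the Bound status within a few seconds:
cat <<EOF | microk8s.kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
microk8s.kubectl get pvc test-pvc
microk8s.kubectl delete pvc test-pvc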

Shared Filesystem (ReadWriteMany)

Create a rwm-storageclass.yaml file with the following content:
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
  preserveFilesystemOnDelete: true
  metadataServer:
    activeCount: 1
    activeStandby: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-data0

  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

reclaimPolicy: Delete
microk8s.kubectl create -f rwm-storageclass.yaml
You can now create a PersistentVolumeClaim like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
Why is this storage class not the default one when it is more flexible? Because block storage is faster and is what you will need most of the time.

Object Storage (S3 API)

Create an object-storageclass.yaml file with the following content:
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: s3-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    erasureCoded:
      dataChunks: 2
      codingChunks: 1
  preservePoolsOnDelete: true
  gateway:
    type: s3
    sslCertificateRef:
    port: 80
    # securePort: 443
    instances: 1
  healthCheck:
    bucket:
      disabled: false
      interval: 60s
microk8s.kubectl create -f object-storageclass.yaml
The easiest way to create buckets is through the Ceph Dashboard.
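Alternatively, buckets can be created declaratively with an ObjectBucketClaim. This is a sketch along the lines of the Rook v1.5 examples; it assumes an additional bucket StorageClass, and the names rook-ceph-bucket and my-bucket are illustrative:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-bucket
provisioner: rook-ceph.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: s3-store
  objectStoreNamespace: rook-ceph
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: my-bucket
spec:
  generateBucketName: my-bucket
  storageClassName: rook-ceph-bucket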

Accessing Ceph dashboard with Ingress

Create a rook-ingress.yaml file with the following content:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: rook-ingress
  namespace: rook-ceph
  annotations:
    kubernetes.io/ingress.class: "nginx"
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    nginx.ingress.kubernetes.io/server-snippet: |
      proxy_ssl_verify off;
spec:
  tls:
   - hosts:
     - ceph.<your-domain-here>
     secretName: rook-tls
  rules:
  - host: ceph.<your-domain-here>
    http:
      paths:
      - path: /
        backend:
          serviceName: rook-ceph-mgr-dashboard
          servicePort: https-dashboard
Apply the manifest:
microk8s.kubectl create -f rook-ingress.yaml
The dashboard is now accessible under https://ceph.<your-domain-here>

Login Credentials

Rook creates a default user named admin. To retrieve the generated password, run the following:
microk8s.kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo

Backups with Kasten K10

Log in to the k8s-1 server, where we have Helm installed.

hcloud server ssh k8s-1

Enable Snapshots

Snapshot Beta CRDs

microk8s.kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
microk8s.kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
microk8s.kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml

Common Snapshot Controller

microk8s.kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
microk8s.kubectl create -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml

VolumeSnapshotClass

We create a VolumeSnapshotClass for our block storage and annotate it with k10.kasten.io/is-snapshot-class=true.
microk8s.kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.5/cluster/examples/kubernetes/ceph/csi/rbd/snapshotclass.yaml
microk8s.kubectl annotate volumesnapshotclass csi-rbdplugin-snapclass k10.kasten.io/is-snapshot-class=true
We do the same for the CephFS storage.
microk8s.kubectl create -f https://raw.githubusercontent.com/rook/rook/release-1.5/cluster/examples/kubernetes/ceph/csi/cephfs/snapshotclass.yaml
microk8s.kubectl annotate volumesnapshotclass csi-cephfsplugin-snapclass k10.kasten.io/is-snapshot-class=true
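Both snapshot classes should now be listed and carry the K10 annotation:
microk8s.kubectl get volumesnapshotclass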

Installing Kasten K10

Now we will install Kasten K10 to take care of our backups. We will use a Helm chart for it. As part of it, we will create an ingress to access the dashboard.
microk8s.helm3 repo add kasten https://charts.kasten.io/
microk8s.kubectl create namespace kasten-io
Create a k10-values.yaml file with the following content:
ingress:
  create: true
  class: nginx
  host: <your_domain_here>
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
auth:
  tokenAuth:
    enabled: true
Install the chart using the values file
microk8s.helm3 install k10 kasten/k10 -n kasten-io -f k10-values.yaml
The dashboard is now accessible under https://<your_domain_here>/k10
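It can take a few minutes for all K10 pods to come up:
microk8s.kubectl -n kasten-io get pods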

Obtaining access token to access K10 Dashboard

sa_secret=$(microk8s.kubectl get serviceaccount k10-k10 -o jsonpath="{.secrets[0].name}" --namespace kasten-io)

microk8s.kubectl get secret $sa_secret --namespace kasten-io -ojsonpath="{.data.token}" | base64 --decode && echo

Any questions?

Don’t hesitate to contact us. We will answer all your questions as soon as possible.