Kubespray K8S
The collection contains and integrates with the official kubespray collection.
It is recommended to create a separate repository per Kubernetes cluster, or at least to put each cluster's inventory file into a separate folder.
Check out the kubespray-cluster dir in the samples directory for a quick idea of how to set things up!
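One possible layout for such a per-cluster repository (the names here are only examples) could be:

my-k8s-cluster/
  cluster-inventory.yaml   # kubespray inventory for this cluster
  group_vars/              # optional custom kubespray vars, see below
    all/
    k8s_cluster/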
Deploying a cluster
Again, create a custom inventory YAML file following this k8s cluster schema.
Afterwards run the ansible-playbook -i YOUR-KUBESPRAY-INV.yaml pxc.cloud.sync_kubespray playbook. This will fully create the VMs, set up kubespray, initialize the TLS certificates and deploy the core Kubernetes Helm charts (Ceph CSI, Ingress).
To get a kubeconfig (expiring) for CLI/IDE access after the cluster has been created, use:
pvcli print-kubeconfig --inventory YOUR-KUBESPRAY-INV.yaml
Ceph CSI
If you have Ceph installed and configured in your Proxmox cluster and you define ceph_csi_sc_pools in your kubespray inventory, the Ceph CSI volume driver will be installed and configured automatically.
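Purely as an illustration, such a definition might look like the sketch below; the exact structure is defined by the collection, so take the real format from the kubespray-cluster sample inventory:

ceph_csi_sc_pools:   # hypothetical sketch - check the sample for the actual structure
  - k8s-rbd          # existing Ceph RBD pool to expose as a StorageClass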
If you want to use Ceph, your Kubernetes nodes need access to the Ceph monitors. If your Ceph runs on a separate network, you can deploy a dedicated Kea DHCP server to dynamically assign IPs to the nodes' Ceph network interfaces. For that you need a single LXC and the pxc.cloud.setup_ceph_kea playbook.
TLS ACME Certificates
This collection doesn't use the Kubernetes cert-manager for TLS certificates; instead it comes with an external, centralised solution.
Initially, certificates are generated via Ansible roles integrated into the collection. The renewal process afterwards is handled via AWX cron jobs. Use the AWX Helm chart via Terraform to deploy your own instance.
DNS Provider Secrets
At the moment the collection supports IONOS and AWS Route53 for dynamically solving dns01 challenges and obtaining certificates.
To make the secrets available to Proxmox Cloud, take a look at the Terraform config inside the cloud-instance sample.
We need to use the pxc_cloud_age_secret resource to create the secrets.
- For AWS Route53, create aws-route53-global.json and encode it with age -R ~/.ssh/id_ed25519.pub aws-route53-global.json | base64 -w0:
{
  "AWS_ACCESS_KEY_ID": "ACCESS_ID_HERE",
  "AWS_SECRET_ACCESS_KEY": "SECRET_KEY_HERE",
  "AWS_REGION": "REGION"
}
The JSON structure is essential, and the secret_name needs to be aws-route53-global:
resource "pxc_cloud_age_secret" "route53" {
secret_name = "aws-route53-global"
b64_age_data = "" # output of `age -R ~/.ssh/id_ed25519.pub aws-route53-global.json | base64 -w0`
}
- For IONOS, create ionos-api-key.json:
{
  "IONOS_PRAEFIX": "PRAEFIX_HERE",
  "IONOS_VERSCHLUESSELUNG": "SECRET_KEY_HERE"
}
resource "pxc_cloud_age_secret" "ionos" {
secret_name = "ionos-api-key"
b64_age_data = "" # output of `age -R ~/.ssh/id_ed25519.pub ionos-api-key.json | base64 -w0`
}
The SSH key only needs to be present during the initial create; the purpose is simply to keep secrets out of your repositories. If you don't care about that security aspect, you can just use the pxc_cloud_secret resource and hardcode your secrets in your tf file, using the same secret names as above.
You are welcome to create and submit an MR with your own roles/logic for other providers!
Upgrading a cluster
Upgrading the cluster is as simple as updating the version tag reference of this collection.
You can skip straight to the latest patch version, but you shouldn't skip minor versions, as they are tied to kubespray updates. After updating the cloud collection version in your requirements.yaml, you have to run ansible-galaxy install -r requirements.yaml and pip install -r ~/.ansible/collections/ansible_collections/pxc/cloud/meta/ee-requirements.txt again.
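For reference, the relevant entry in your requirements.yaml might look roughly like this (the version is a placeholder; if you install the collection from a git repository, the entry additionally needs a source URL and type: git):

collections:
  - name: pxc.cloud
    version: "x.y.z"   # bump this tag to upgrade the cluster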
Then run the upgrade playbook ansible-playbook -i YOUR-KUBESPRAY-INV.yaml pxc.cloud.upgrade_kubespray.
Custom kubespray vars
To define your own kubespray vars, just create group_vars/all and group_vars/k8s_cluster directories alongside your kubespray inventory YAML file.
Here are some interesting kubespray settings you might want to set (k8s_cluster vars):
- Increase the number of schedulable pods per node (if you have big nodes):
kube_network_node_prefix: 22
kubelet_max_pods: 1024
- Set strict eviction limits; this is a good safeguard for node availability in case you have memory-hungry deployments without any requests/limits defined:
# this will enable reservations with the default values, see kubespray sample inventory
kube_reserved: true
kube_reserved_cgroups_for_service_slice: kube.slice
kube_reserved_cgroups: "/{{ kube_reserved_cgroups_for_service_slice }}"
system_reserved: true
system_reserved_cgroups_for_service_slice: system.slice
system_reserved_cgroups: "/{{ system_reserved_cgroups_for_service_slice }}"
system_memory_reserved: 1024Mi
eviction_hard:
  memory.available: 1000Mi
=> This, in addition to the pxc.cloud. role, will allow the k8s nodes to keep running even with memory-hungry, RAM-hogging deployments. Hard eviction limits and reservations alone are not enough; in OOM scenarios the networkd service would stop working. This role is executed automatically via the pxc.cloud.sync_kubespray playbook.
Exposing K8S Controlplane API
If you want to expose your Kubernetes cluster's control plane to integrate with externally running services, you can set additional SANs, which kubespray will generate and insert into the kube-api certificates, by listing them in your kubespray inventory file under extra_control_plane_sans as simple strings.
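For example (the hostnames are just placeholders):

extra_control_plane_sans:
  - k8s-api.example.com
  - k8s-api.internal.example.com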
Adding SANs there will also configure the pve cloud HAProxy to route any control plane traffic (port 6443) arriving on its external IP to the respective cluster.
DNS records for these SANs have to be created manually (for internal and external DNS servers); for that, use Terraform's dns, route53 and ionos providers.