Overview

Kamaji allows running the control plane components as pods in the management cluster (MC) rather than dedicating nodes in the workload cluster (WC) (doc).


Kamaji world

Those pods are managed through tenantcontrolplanes.kamaji.clastix.io CRs, which are reconciled by the kamaji controller.
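
A quick way to see what the controller is reconciling (assuming the CRDs were installed by the chart):

kubectl get tenantcontrolplanes.kamaji.clastix.io -A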

Kamaji controller installed from upstream helm repo

cat kamaji-values.yaml
resources:
  limits:
    memory: 250Mi
    
helm install kamaji kamaji/charts/kamaji -n org-xav --version 0.0.0+latest -f kamaji-values.yaml
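
Quick sanity check after the install (a sketch; the deployment name assumes the chart defaults and the -n org-xav namespace used above):

kubectl -n org-xav get deploy kamaji
kubectl -n org-xav logs deploy/kamaji --tail=20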

The "control plane" pods run each controller in a container (3 containers pods). K9s extract below.

NAME↑                    PF IMAGE                                            READY STATE    INIT   RESTARTS PROBES(L:R) CPU MEM CPU/R:L MEM/R:L %CPU/R %CPU/L %MEM/R %MEM/L PORTS
kube-apiserver           ●  registry.k8s.io/kube-apiserver:v1.32.9           true  Running  false         0 on:on        14 292     0:0     0:0    n/a    n/a    n/a    n/a
kube-controller-manager  ●  registry.k8s.io/kube-controller-manager:v1.32.9  true  Running  false         0 on:off        2  20     0:0     0:0    n/a    n/a    n/a    n/a
kube-scheduler           ●  registry.k8s.io/kube-scheduler:v1.32.9           true  Running  false         0 on:off        2  23     0:0     0:0    n/a    n/a    n/a    n/a

Cluster state (etcd) is stored in a datastore configured through datastores.kamaji.clastix.io CRs, also reconciled by the kamaji controller. (An etcd cluster is created with the chart's default values.)
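
The datastore created by the chart can be inspected the same way (a sketch; with the chart defaults it is simply named default and uses the etcd driver):

kubectl get datastores.kamaji.clastix.io
kubectl get datastores.kamaji.clastix.io default -o jsonpath='{.spec.driver}'   # prints the storage driver, e.g. etcd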

CAPI world

There is a control plane provider for Kamaji, capi-kamaji-controller-manager, which (in our case) replaces capi-kubeadm-control-plane-controller-manager.

Controller installed in grouse via kustomize build on https://github.com/clastix/cluster-api-control-plane-provider-kamaji/blob/master/config/default/kustomization.yaml, roughly as sketched after the list below.

  • Memory limit must be raised from 128Mi to 256Mi.
  • Drop the templated variables in the controller args.
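
Roughly what that looks like (a sketch, not the exact commands I ran; the memory limit and args changes are made by editing the rendered output before applying):

kustomize build "github.com/clastix/cluster-api-control-plane-provider-kamaji//config/default?ref=master" > capi-kamaji.yaml
# edit capi-kamaji.yaml: raise resources.limits.memory from 128Mi to 256Mi
# and drop the templated variables from the controller args
kubectl apply -f capi-kamaji.yaml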

When a KamajiControlPlane CR is created, the CP provider creates the TenantControlPlane, which in turn triggers the creation of the control plane pods.

> kg tenantcontrolplanes.kamaji.clastix.io proxmox-kamaji -ojsonpath='{.metadata.ownerReferences}'|jq
[
  {
    "apiVersion": "controlplane.cluster.x-k8s.io/v1alpha1",
    "blockOwnerDeletion": true,
    "controller": true,
    "kind": "KamajiControlPlane",
    "name": "proxmox-kamaji",
    "uid": "47b6b51b-c61c-420b-8e60-f46f80f6f001"
  }
]

Manifests!

Policy exceptions

apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  labels:
    app.kubernetes.io/managed-by: kyverno-policy-operator
  name: capi-kamaji-controller-manager
  namespace: policy-exceptions
spec:
  background: true
  exceptions:
  - policyName: restrict-seccomp-strict
    ruleNames:
    - autogen-check-seccomp-strict
    - check-seccomp-strict
  match:
    any:
    - resources:
        kinds:
        - Deployment
        - ReplicaSet
        - Pod
        names:
        - capi-kamaji-controller-manager*
        namespaces:
        - kamaji-system
---
apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  labels:
    app.kubernetes.io/managed-by: kyverno-policy-operator
  name: kamaji
  namespace: policy-exceptions
spec:
  background: true
  exceptions:
  - policyName: disallow-capabilities-strict
    ruleNames:
    - autogen-require-drop-all
    - require-drop-all
  - policyName: disallow-privilege-escalation
    ruleNames:
    - autogen-privilege-escalation
    - privilege-escalation
  - policyName: require-run-as-nonroot
    ruleNames:
    - autogen-run-as-non-root
    - run-as-non-root
  - policyName: restrict-seccomp-strict
    ruleNames:
    - autogen-check-seccomp-strict
    - check-seccomp-strict
  match:
    any:
    - resources:
        kinds:
        - Deployment
        - ReplicaSet
        - StatefulSet
        - Pod
        - Job
        names:
        - kamaji-etcd-setup-1-*
        - kamaji*
        - kamaji-etcd*
        namespaces:
        - kamaji-system
        - org-xav
---
apiVersion: kyverno.io/v2
kind: PolicyException
metadata:
  labels:
    app.kubernetes.io/managed-by: kyverno-policy-operator
  name: kamaji-tenantcontrolplane
  namespace: policy-exceptions
spec:
  background: true
  exceptions:
  - policyName: disallow-capabilities-strict
    ruleNames:
    - autogen-require-drop-all
    - require-drop-all
  - policyName: disallow-privilege-escalation
    ruleNames:
    - autogen-privilege-escalation
    - privilege-escalation
  - policyName: require-run-as-nonroot
    ruleNames:
    - autogen-run-as-non-root
    - run-as-non-root
  - policyName: restrict-seccomp-strict
    ruleNames:
    - autogen-check-seccomp-strict
    - check-seccomp-strict
  match:
    any:
    - resources:
        kinds:
          - Deployment
          - Pod
          - ReplicaSet
        namespaces:
          - org-xav
        names:
          - proxmox-kamaji*

Proxmox node pool with 2 nodes and a Kamaji control plane (apply in grouse to test it)

(Some upstream examples > https://github.com/clastix/cluster-api-control-plane-provider-kamaji/tree/master/docs)

Note that KamajiControlPlane.spec.network.serviceAddress doesn't seem to work, as I had to add the IP to the service myself.

(I didn't bother adding an ipAddressClaim; I just picked the last IP from the pool, so yolo.)
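
One way to pin the IP on the LoadBalancer service (a hedged sketch; the exact mechanism depends on the load balancer implementation, and the service name is assumed to follow the TenantControlPlane name here):

kubectl -n org-xav patch svc proxmox-kamaji --type merge -p '{"spec":{"loadBalancerIP":"10.201.24.241"}}'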

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  clusterNetwork:
    apiServerPort: 6443
    pods:
      cidrBlocks:
      - 10.244.0.0/16
    services:
      cidrBlocks:
      - 172.31.0.0/16
  controlPlaneEndpoint:
    host: 10.201.24.241
    port: 6443
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
    kind: KamajiControlPlane
    name: proxmox-kamaji
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
    kind: ProxmoxCluster
    name: proxmox-kamaji
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
kind: KamajiControlPlane
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  dataStoreName: default
  addons:
    #coreDNS: { }
    #kubeProxy: { }
  kubelet:
    cgroupfs: systemd
    preferredAddressTypes:
    - InternalIP
  network:
    serviceType: LoadBalancer
    serviceAddress: 10.201.24.241
  deployment:
  replicas: 3
  version: 1.32.9
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxCluster
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  allowedNodes:
  - pve
  controlPlaneEndpoint:
    host: 10.201.24.241
    port: 6443
  ### The field below is important for Proxmox ###
  externalManagedControlPlane: true
  dnsServers:
  - 8.8.8.8
  - 1.1.1.1
  ipv4Config:
    addresses:
    - 10.201.24.160-10.201.24.170
    gateway: 10.201.25.254
    prefix: 23
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  clusterName: proxmox-kamaji
  replicas: 2
  selector:
    matchLabels: null
  template:
    metadata:
      labels:
        node-role.kubernetes.io/node: ""
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: proxmox-kamaji
      clusterName: proxmox-kamaji
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
        kind: ProxmoxMachineTemplate
        name: proxmox-kamaji
      version: v1.32.9
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha1
kind: ProxmoxMachineTemplate
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  template:
    spec:
      checks:
        skipCloudInitStatus: true
        skipQemuGuestAgent: false
      disks:
        bootVolume:
          disk: scsi0
          sizeGb: 32
      format: qcow2
      full: true
      memoryMiB: 8048
      network:
        default:
          bridge: vmbr0
          model: virtio
      numCores: 4
      numSockets: 1
      sourceNode: pve
      templateID: 102
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: proxmox-kamaji
  namespace: org-xav
spec:
  template:
    spec:
      format: ignition
      ignition:
        containerLinuxConfig:
          additionalConfig: |-
            storage:
              files:
              - path: /opt/set-hostname
                filesystem: root
                mode: 0744
                contents:
                  inline: |
                    #!/bin/sh
                    set -x
                    echo "${COREOS_CUSTOM_HOSTNAME}" > /etc/hostname
                    hostname "${COREOS_CUSTOM_HOSTNAME}"
                    echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
                    echo "127.0.0.1   localhost" >>/etc/hosts
                    echo "127.0.0.1   ${COREOS_CUSTOM_HOSTNAME}" >>/etc/hosts
            systemd:
              units:
              - name: coreos-metadata.service
                contents: |
                  [Unit]
                  Description=Proxmox metadata agent
                  After=nss-lookup.target
                  After=network-online.target
                  Wants=network-online.target
                  [Service]
                  Type=oneshot
                  Restart=on-failure
                  RemainAfterExit=yes
                  EnvironmentFile=/etc/proxmox-env
                  ExecStart=/usr/bin/mkdir --parent /run/metadata
                  ExecStart=/bin/bash -c 'env > /run/metadata/flatcar'
                  ExecStart=/bin/bash -c 'echo "COREOS_CUSTOM_IPV4=$COREOS_CUSTOM_PRIVATE_IPV4" | cut -d"/" -f1 >> /run/metadata/flatcar'
                  [Install]
                  WantedBy=multi-user.target
              - name: set-hostname.service
                enabled: true
                contents: |
                  [Unit]
                  Description=Set the hostname for this machine
                  Requires=coreos-metadata.service
                  After=coreos-metadata.service
                  [Service]
                  Type=oneshot
                  EnvironmentFile=/run/metadata/flatcar
                  ExecStart=/opt/set-hostname
                  [Install]
                  WantedBy=multi-user.target
              - name: kubeadm.service
                enabled: true
                dropins:
                - name: 10-flatcar.conf
                  contents: |
                    [Unit]
                    # kubeadm must run after coreos-metadata populated /run/metadata directory.
                    Requires=coreos-metadata.service
                    After=coreos-metadata.service
                    # kubeadm must run after containerd - see https://github.com/kubernetes-sigs/image-builder/issues/939.
                    After=containerd.service
                    [Service]
                    # Make metadata environment variables available for pre-kubeadm commands.
                    EnvironmentFile=/run/metadata/flatcar
                    # Log to file
                    StandardOutput=append:/var/log/kubeadm-service.log
                    StandardError=inherit
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            provider-id: proxmox://'${COREOS_CUSTOM_INSTANCE_ID}'
            cgroup-driver: systemd
            # cloud-provider: external
            healthz-bind-address: 0.0.0.0
            node-ip: ${COREOS_CUSTOM_IPV4}
            node-labels: ip=${COREOS_CUSTOM_IPV4},role=worker,giantswarm.io/machine-pool=proxmox-kamaji,infra=proxmox
            v: "2"
          name: ${COREOS_CUSTOM_HOSTNAME}
      preKubeadmCommands:
      # - rm /etc/proxmox-env
      - envsubst < /etc/kubeadm.yml > /etc/kubeadm.yml.tmp
      - cp /etc/kubeadm.yml.tmp /etc/kubeadm.yml
      users:
      - name: core
        sshAuthorizedKeys:
        - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCvKYb6bE+V2QXgOQKPbaL9ya8PKEc8YJBdfubajYz/ttHGiSVfQkP/qwRsGybp2w5wsxeBoaG7PlYaN5q6ZGS3T2CZZqTx/M/Ww3Sy3zvfMJWyzXRS/JQkQ2+kdBFDfWN7PuWytJUU6HjDqWDetfOGAnGJBoucRPIpY4kBK4iomRMw6/CZzzsAJGx3//2pUDqGbrHCBhrhga/giRplskR/Skr9bi/aG++GgS2hB/r2iJM+1cyFxpbtcoiH/L/WIAFUKXbANJO/2dh/LvwoETsXOKXSg+LLqoRuCZt+WzVt971pcyn00cOyvaxxBMnOki49mVsGEZPtsZvdc5oxbWEl9auVw2CDgOlSh6NUzm54aC/sS+RalWxO7+qR5kehA8yfKKf9KMPUlI5HPpa4+DBWdVC58YbVhAqANOGz1z1P1KmUbebarMjJVox3jw3bocjy4H0mPBwBI5O7PEV+aRPNRUzXghbbuqI2m4+ivmiPLtEm2xwI9bNOOcv5HZOXWjE=
        sudo: ALL=(ALL) NOPASSWD:ALL
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: proxmox-kamaji-default
  namespace: org-xav
spec:
  interval: 5m
  provider: generic
  timeout: 60s
  url: https://giantswarm.github.io/default-catalog
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: proxmox-kamaji-cilium
  namespace: org-xav
spec:
  chart:
    spec:
      chart: cilium
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: proxmox-kamaji-default
      version: 1.3.1
  install:
    remediation:
      retries: 0
  interval: 5m
  kubeConfig:
    secretRef:
      name: proxmox-kamaji-kubeconfig
  releaseName: cilium
  storageNamespace: kube-system
  targetNamespace: kube-system
  timeout: 1h
  upgrade:
    disableWait: false
    remediation:
      retries: 0
  values:
    global:
      podSecurityStandards:
        enforced: true
    hubble:
      relay:
        enabled: true
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists
        - effect: NoSchedule
          key: node.cloudprovider.kubernetes.io/uninitialized
          operator: Equal
          value: "true"
      ui:
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists
        - effect: NoSchedule
          key: node.cloudprovider.kubernetes.io/uninitialized
          operator: Equal
          value: "true"
    ipam:
      mode: kubernetes
    k8sServiceHost: auto
    kubeProxyReplacement: "true"
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: proxmox-kamaji-coredns
  namespace: org-xav
spec:
  chart:
    spec:
      chart: coredns-app
      reconcileStrategy: ChartVersion
      sourceRef:
        kind: HelmRepository
        name: proxmox-kamaji-default
      version: 1.28.2
  dependsOn:
  - name: proxmox-kamaji-cilium
    namespace: org-xav
  install:
    remediation:
      retries: -1
  interval: 5m
  kubeConfig:
    secretRef:
      name: proxmox-kamaji-kubeconfig
  releaseName: coredns
  storageNamespace: kube-system
  targetNamespace: kube-system
  timeout: 5m
  upgrade:
    disableWait: false
    remediation:
      retries: -1
  values:
    # Required to disable pods on masters
    mastersInstance:
      enabled: false
    cluster:
      calico:
        CIDR: 10.244.0.0/16
      kubernetes:
        API:
          clusterIPRange: 172.31.0.0/16
        DNS:
          IP: 172.31.0.10
    global:
      podSecurityStandards:
        enforced: true

Cluster upgrade

Just change KamajiControlPlane.spec.version and the control plane pods roll to the new version.
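
For example (a sketch, assuming the CR name and namespace from the manifests above):

kubectl -n org-xav patch kamajicontrolplane proxmox-kamaji --type merge -p '{"spec":{"version":"1.33.5"}}'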

The resulting TenantControlPlane spec looks like this:

spec:
  addons: {}
  controlPlane:
    deployment:
      additionalMetadata: {}
      additionalVolumeMounts: {}
      extraArgs: {}
      podAdditionalMetadata: {}
      registrySettings:
        apiServerImage: kube-apiserver
        controllerManagerImage: kube-controller-manager
        registry: registry.k8s.io
        schedulerImage: kube-scheduler
      replicas: 3
      resources:
        apiServer: {}
        controllerManager: {}
        kine: {}
        scheduler: {}
      serviceAccountName: default
      strategy: {}
    service:
      additionalMetadata: {}
      serviceType: LoadBalancer
  dataStore: default
  dataStoreSchema: org_xav_proxmox_kamaji
  dataStoreUsername: org_xav_proxmox_kamaji
  kubernetes:
    admissionControllers:
    - CertificateApproval
    - CertificateSigning
    - CertificateSubjectRestriction
    - DefaultIngressClass
    - DefaultStorageClass
    - DefaultTolerationSeconds
    - LimitRanger
    - MutatingAdmissionWebhook
    - NamespaceLifecycle
    - PersistentVolumeClaimResize
    - Priority
    - ResourceQuota
    - RuntimeClass
    - ServiceAccount
    - StorageObjectInUseProtection
    - TaintNodesByCondition
    - ValidatingAdmissionWebhook
    kubelet:
      cgroupfs: systemd
      preferredAddressTypes:
      - InternalIP
    version: v1.33.5
  networkProfile:
    address: 10.201.24.241
    clusterDomain: cluster.local
    podCidr: 10.244.0.0/16
    port: 6443
    serviceCidr: 172.31.0.0/16
vxav commented Oct 30, 2025

I suspect the main hurdle here would be the templating of the control plane components in the GS charts, as we currently work with KubeadmControlPlane resources.
