@brwyatt
Last active November 15, 2025 20:14
Diskless Desktop off Ceph RBD root
#!/bin/bash
# This is a helper script to be able to mount and manipulate the
# system from somewhere else when messing with the initramfs or
# for recovery operations.
if [ "${EUID}" -ne 0 ]; then
    echo "Must be run as root"
    exit 1
fi
ceph_client={{cephx_clientname}}
root_rbd={{pool}}/{{image}}
root_crypt_name={{local_mapped_name}}
root_mount_path={{local_mount_path}}
boot_cephfs={{cephfs_name}}
boot_cephfs_path={{path_in_cephfs}}
function unmount_all {
    echo "========================"
    echo "= Unmounting all paths ="
    echo "========================"
    umount "${root_mount_path}/dev/pts"
    umount "${root_mount_path}/dev"
    umount "${root_mount_path}/sys"
    umount "${root_mount_path}/proc"
    umount "${root_mount_path}/boot"
    umount "${root_mount_path}"
    cryptsetup luksClose "${root_crypt_name}"
    rbd device unmap "/dev/rbd/${root_rbd}"
}
function mount_all {
    echo "======================"
    echo "= Mounting all paths ="
    echo "======================"
    rbd device map "${root_rbd}" --name="client.${ceph_client}"
    cryptsetup luksOpen "/dev/rbd/${root_rbd}" "${root_crypt_name}"
    mkdir -p "${root_mount_path}"
    mount "/dev/mapper/${root_crypt_name}" "${root_mount_path}"
    mkdir -p "${root_mount_path}/boot"
    mount -t ceph "${ceph_client}"@."${boot_cephfs}"="${boot_cephfs_path}" "${root_mount_path}/boot" -o exec
    mkdir -p "${root_mount_path}/proc"
    mount -t proc proc "${root_mount_path}/proc"
    mkdir -p "${root_mount_path}/sys"
    mount -t sysfs sysfs "${root_mount_path}/sys"
    mkdir -p "${root_mount_path}/dev"
    mount --bind /dev "${root_mount_path}/dev"
    mkdir -p "${root_mount_path}/dev/pts"
    mount -t devpts pts "${root_mount_path}/dev/pts"
}
unmount_all
mount_all
echo "==================="
echo "= Entering chroot ="
echo "==================="
chroot "${root_mount_path}"
echo "=================="
echo "= Exiting chroot ="
echo "=================="
unmount_all

Diskless Desktop off Ceph RBD root

History/Background

This is the result of a project to PXE-boot a diskless desktop backed by a Ceph cluster. Initially, the plan was to use CephFS for both / (root) and /boot, with /boot accessible by the PXE server to be able to streamline updates to the initrd and kernel from the OS. While /boot is still on CephFS, / was moved to an RBD image, using namespaces for permissions.

Not covered

  • Setting up Ceph (CephFS, RBD, permissions/auth, etc), outside of where it directly interacts with the initrd process.
  • Setting up or configuring PXE
  • LUKS encryption

Assumptions

  • Ubuntu (both as build system and desktop image); this will probably work with other Debian-based distributions
  • CephFS and the Ceph RBD image are set up, configured, and mounted on a system to be used for building
  • PXE is set up (or will be set up separately)
  • iPXE firmware (lots of chainloading here)

Known Quirks

  • NetPlan must be kept in sync with the initramfs script, specifically the static IP and MTU settings on the Ceph interface
  • For DNS resolution of the Ceph Mons to work, we MUST do DHCP on the user NIC and generate /etc/resolv.conf from it (or possibly manipulate /etc/resolv.conf manually)
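
As a rough illustration of the first quirk, the sketch below prints a NetPlan stanza that mirrors the values the initramfs script in this gist uses for the Ceph NIC (static 192.168.5.50/24, MTU 9000, matched by MAC). The interface id `ceph0` and the function name are assumptions for the example, not something the original setup prescribes.

```shell
#!/bin/sh
# Sketch only: "ceph0" and render_ceph_netplan are hypothetical names.
# Prints a NetPlan stanza matching the static IP/MTU set in the initramfs.
render_ceph_netplan() {
    mac="$1"   # MAC of the Ceph-facing NIC
    addr="$2"  # static address in CIDR form (same value as CEPH_IP)
    mtu="$3"   # same MTU the initramfs script sets
    cat <<EOF
network:
  version: 2
  ethernets:
    ceph0:
      match:
        macaddress: "${mac}"
      dhcp4: false
      addresses:
        - ${addr}
      mtu: ${mtu}
EOF
}

render_ceph_netplan "aa:bb:cc:11:22:33" "192.168.5.50/24" 9000
```

The output could be redirected to a file under /etc/netplan/ inside the image; the file name is up to you.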

iPXE

While not explicitly covered, here is a brief explanation of the iPXE environment I use:

  • iPXE firmware (either on the NIC itself, or on a USB/SSD on the host)
  • Default boot.iPXE (via the TFTP server provided by the DHCP server) that chainloads to an HTTP server, passing the $uuid and $mac parameters
  • Webserver (Nginx, etc) that uses the UUID and MAC parameters to serve the proper iPXE boot script
  • boot script loads kernel and initrd from Webserver using UUID and MAC again, and provides kernel parameters
  • Webserver serves correct kernel and initrd based on UUID and MAC
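
The chain described above might look roughly like the two scripts printed by the sketch below. The URL layout, host name, and function names are all hypothetical; only the `chain`, `kernel`, `initrd`, and `boot` iPXE commands and the `${uuid}`/`${mac}` settings are standard. Depending on firmware (UEFI in particular) extra kernel arguments may be needed, so treat this as a starting point.

```shell
#!/bin/sh
# Sketch only: URLs and function names are assumptions for illustration.

# First stage, served over TFTP: hand off to the web server with host identity.
render_chain_ipxe() {
    web="$1"  # base URL of the web server
    cat <<EOF
#!ipxe
chain ${web}/boot.ipxe?uuid=\${uuid}&mac=\${mac}
EOF
}

# Second stage, served per host: load kernel and initrd, pass root= and boot.
render_host_ipxe() {
    web="$1"       # base URL of the web server
    root_dev="$2"  # /dev/rbd0, or /dev/mapper/NAME when using cryptsetup
    cat <<EOF
#!ipxe
kernel ${web}/vmlinuz?uuid=\${uuid}&mac=\${mac} root=${root_dev} ro
initrd ${web}/initrd.img?uuid=\${uuid}&mac=\${mac}
boot
EOF
}

render_chain_ipxe "http://pxe.example"
render_host_ipxe "http://pxe.example" "/dev/rbd0"
```

The `\${uuid}` escapes keep the shell from expanding the iPXE variables, so they reach the firmware literally.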

Tips/recommendations

  • Use RBD for / (and one image per host)
  • Use CephFS for /boot. The same CephFS can serve all hosts, with a separate directory for each host
  • Put the iPXE scripts on the CephFS. They don't have to be in the host directories (but could be!)
  • Use namespaces to control access to the RBDs, and use fine-grained permissions
  • Use permissions on the CephFS so hosts (and the PXE server) can only access their own directory

Setup

Paths

These can be anywhere, so set these variables to the actual paths to the mounted drives/partitions.

root="/mnt/root"
boot="/mnt/boot"

Example mounts:

sudo mkdir -p "${root}" "${boot}"

sudo rbd device map POOL/NAMESPACE/IMAGE --name=client.CLIENT_NAME
sudo mount /dev/rbd0 "${root}"

sudo mount -t ceph CLIENT_NAME@.CEPHFS_NAME=/ "${boot}" -o exec

(note: replace the ALL_CAPS values as needed, and make sure /etc/ceph/ceph.conf and /etc/ceph/keyring are set up)
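
The CephFS mount source follows the same `name@.fsname=/path` form that the helper script at the top of this gist uses (the part between `@` and `.` is the cluster fsid, left empty here). A tiny sketch for building that string, with a hypothetical function name and example values:

```shell
#!/bin/sh
# Sketch only: cephfs_source is a hypothetical helper.
# Builds the "CLIENT@.FSNAME=/PATH" device string for "mount -t ceph".
cephfs_source() {
    client="$1"     # cephx client name, without the "client." prefix
    fs="$2"         # CephFS file system name
    path="${3:-/}"  # path inside the CephFS, defaults to the root
    printf '%s@.%s=%s\n' "${client}" "${fs}" "${path}"
}

cephfs_source desktop1 bootfs /hosts/desktop1
```

It can then be used as `sudo mount -t ceph "$(cephfs_source CLIENT_NAME CEPHFS_NAME /)" "${boot}" -o exec`.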

Create the Chroot environment

sudo debootstrap --arch amd64 noble "${root}" http://archive.ubuntu.com/ubuntu

sudo mount -t proc proc "${root}/proc"
sudo mount -t sysfs sysfs "${root}/sys"
sudo mount --bind /dev "${root}/dev"
sudo mount -t devpts pts "${root}/dev/pts"
sudo mount --bind "${boot}" "${root}/boot"

sudo chroot "${root}"

Setting up the image

Install dependencies:

apt install --no-install-recommends initramfs-tools linux-image-generic ceph-common systemd-sysv grub-pc-bin zstd vim acl cryptsetup-initramfs

(note: cryptsetup-initramfs only needed if using LUKS)

Add the files below to their proper places (filename is unimportant):

  • /etc/initramfs-tools/hooks/ceph
  • /etc/initramfs-tools/scripts/local-top/cephboot
    • Make sure to update MACs and {{RBD_IMAGE}} and {{CLIENT_NAME}} at the end

Install/configure Ceph:

  • /etc/ceph/ceph.conf
    • [global] section
      • mon_host (can't use mon_dns_srv_name, as DNS doesn't really work)
  • /etc/ceph/keyring
    • [client.{{ CLIENT_NAME }}] - make sure this matches your client name!
      • key = {{KEY}} - this is the key for the user for Ceph

Set up /etc/fstab. Make sure to include root and boot:

/dev/rbd0 /       ext4    defaults        0       1
CLIENT_NAME@.CEPHFS_NAME=/PATH /boot   ceph    defaults        0       0

(note: change the / device if using cryptsetup, and make sure the CephFS values (the ALL_CAPS) for /boot are correct)

(Re-)Build the initrd:

update-initramfs -v -c -k "$(ls -t /boot/vmlinuz-* | head -n 1 | sed -r 's|/boot/vmlinuz-||')"

(note: this should (re-)build the initrd for the most recently modified kernel image, which is normally the most recently installed one; specify the version explicitly if needed)
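
Since `ls -t` orders by modification time, an alternative is to pick the highest version by name. The sketch below does that with GNU `sort -V`; the function name is hypothetical and it assumes kernel images are named `vmlinuz-VERSION` as on Ubuntu.

```shell
#!/bin/sh
# Sketch only: latest_kernel is a hypothetical helper; relies on GNU sort -V.
# Prints the highest kernel version found as vmlinuz-VERSION in a directory.
latest_kernel() {
    boot_dir="${1:-/boot}"
    for k in "${boot_dir}"/vmlinuz-*; do
        [ -e "${k}" ] || continue            # skip the unexpanded glob
        printf '%s\n' "${k##*/vmlinuz-}"     # strip directory and prefix
    done | sort -V | tail -n 1
}

# Example (inside the chroot):
#   update-initramfs -v -c -k "$(latest_kernel /boot)"
```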

Add any users, or create/update passwords for existing users (root, etc), and install anything you'll want in the booted system (ssh, ubuntu-desktop, etc). This is also a good time to update the NetPlan config. That can probably be done after first boot, too; without it the boot process may hit some "fun" delays, but it shouldn't prevent boot.

Exit the chroot (^d or exit)

Unmount and boot!

sudo umount "${root}/proc"
sudo umount "${root}/sys"
sudo umount "${root}/dev/pts"
sudo umount "${root}/dev"
sudo umount "${root}/boot"
sudo umount "${root}"

Watch for any errors, and use -l (lazy unmount) if needed.
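
If it is unclear what is still mounted under the chroot, something like the sketch below can list the remaining mount points deepest-first, which is the order they need to be unmounted in. The function name is hypothetical; it reads /proc/self/mounts by default and does not handle mount points containing spaces.

```shell
#!/bin/sh
# Sketch only: mounts_under is a hypothetical helper.
# Lists mount points at or below a directory, children before parents.
mounts_under() {
    root="${1%/}"                           # drop a trailing slash
    mounts_file="${2:-/proc/self/mounts}"   # second field is the mount point
    awk -v r="${root}" '$2 == r || index($2, r "/") == 1 { print $2 }' \
        "${mounts_file}" | LC_ALL=C sort -r
}

# Example:
#   mounts_under "${root}" | while read -r m; do sudo umount "${m}"; done
```

Reverse byte-order sorting works here because a parent path always sorts before any path that extends it.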

Also unmap the RBD device (if you are, in fact, using that here):

sudo rbd device unmap /dev/rbd0

You don't need to unmount the CephFS, though; it's okay with concurrent access (RBD WILL have issues if it is mounted on more than one system).

Update (or create!) the iPXE script for the host and double check the root= param matches either /dev/rbd0 or the mapped cryptsetup device /dev/mapper/NAME.
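
A quick way to double check the root= parameter is to pull it out of the kernel line and compare it against the two expected forms. This is only a sketch: the function names are hypothetical, and it assumes the kernel arguments are passed in as a single space-separated string.

```shell
#!/bin/sh
# Sketch only: root_param and root_is_expected are hypothetical helpers.

# Print the value of the first root= argument in a kernel command line.
root_param() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=//p' | head -n 1
}

# Succeed only if root= is /dev/rbd0 or a /dev/mapper/NAME device.
root_is_expected() {
    case "$(root_param "$1")" in
        /dev/rbd0|/dev/mapper/?*) return 0 ;;
        *) return 1 ;;
    esac
}

root_is_expected "vmlinuz root=/dev/mapper/cryptroot ro" && echo "root= looks fine"
```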

#!/bin/sh
## IN /etc/initramfs-tools/hooks/
prereqs() {
    echo ""
}
case "${1}" in
    prereqs)
        prereqs
        exit 0
        ;;
esac
. /usr/share/initramfs-tools/hook-functions
# Copy the mount.ceph and rbd binaries (and their library dependencies)
copy_exec /sbin/mount.ceph
copy_exec /usr/bin/rbd
if [ -f "/etc/crypttab" ]; then
    # Probably not needed, but also won't hurt. First attempt had
    # failed so I added this as it was missing, but other things
    # were probably the cause.
    copy_file config /etc/crypttab
fi
if [ -d "/etc/ceph" ]; then
    copy_file config /etc/ceph/ceph.conf
    copy_file config /etc/ceph/keyring
fi
# Make sure the kernel can be read by the PXE web server
# (UID 33 is www-data on Debian/Ubuntu)
setfacl -m u:33:r /boot/vmlinuz-*
exit 0
#!/bin/sh
## IN: /etc/initramfs-tools/scripts/local-top/
case "$1" in
    prereqs)
        exit 0
        ;;
esac
error_shell() {
    # Print the given message (or a generic one) and drop to a shell
    echo '!!!'" Zephyr: ${1:-Unknown error}"'!'" Dropping to emergency shell. "'!!!'
    /bin/sh
    exit 0
}
. /scripts/functions
echo "=== Zephyr: Starting custom initramfs boot logic ==="
echo "=== Zephyr: Configuring network ==="
# Space-separated to make this work in Dash/ash/sh
CEPH_MACS="aa:bb:cc:11:22:33 ab:bb:cb:11:21:31"
#CEPH_MAC="ab:bb:cb:11:21:31"
CEPH_IP="192.168.5.50/24"
USER_MACS="aa:bb:cc:11:22:34 ab:bb:cb:11:21:32"
#USER_MAC="ab:bb:cb:11:21:32"
FOUND_CEPH_NIC=""
FOUND_USER_NIC=""
for iface in $(ip -o link show | awk -F': ' '{print $2}'); do
    current_mac=$(ip link show "${iface}" | awk '/link\/ether/ {print $2}')
    if [ "${FOUND_USER_NIC}" = "" ]; then
        for USER_MAC in ${USER_MACS}; do
            if [ "${current_mac}" = "${USER_MAC}" ]; then
                FOUND_USER_NIC="${iface}"
                echo "=== Found User NIC: ${FOUND_USER_NIC} ==="
                break
            fi
        done
    fi
    if [ "${FOUND_CEPH_NIC}" = "" ]; then
        for CEPH_MAC in ${CEPH_MACS}; do
            if [ "${current_mac}" = "${CEPH_MAC}" ]; then
                FOUND_CEPH_NIC="${iface}"
                echo "=== Found Ceph NIC: ${FOUND_CEPH_NIC} ==="
                break
            fi
        done
    fi
done
wait_nic() {
    timeout=30
    # Loops zero times when called without arguments
    for nic in "$@"; do
        echo "Waiting up to ${timeout} seconds for ${nic} to be up..."
        count=0
        while ! [ "$(cat "/sys/class/net/${nic}/operstate")" = "up" ]; do
            sleep 1
            count=$((count + 1))
            if [ "${count}" -gt "${timeout}" ]; then
                error_shell "Timed out waiting for ${nic}!"
            fi
        done
    done
}
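# Bail out early if no NIC matched the configured MACs
[ -n "${FOUND_USER_NIC}" ] || error_shell "No NIC matched USER_MACS"
echo "=== Configuring ${FOUND_USER_NIC} for User access ==="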
ip link set dev "${FOUND_USER_NIC}" up
wait_nic "${FOUND_USER_NIC}"
dhcpcd --oneshot --noipv4ll --waitip=4 "${FOUND_USER_NIC}"
[ -n "${FOUND_CEPH_NIC}" ] || error_shell "No NIC matched CEPH_MACS"
echo "=== Configuring ${FOUND_CEPH_NIC} for Ceph access ==="
ip link set dev "${FOUND_CEPH_NIC}" up
# Wait for link up
wait_nic "${FOUND_CEPH_NIC}"
# Set MTU - make sure this matches the NetPlan config!
ip link set dev "${FOUND_CEPH_NIC}" mtu 9000
ip addr add "${CEPH_IP}" dev "${FOUND_CEPH_NIC}"
# Static IP ensures we don't have any interruptions of connectivity
# dhcpcd --nogateway --ipv4only --oneshot --waitip=4 "${FOUND_CEPH_NIC}"
echo "=== Configuring DNS ==="
# Make sure we have DNS working using the User NIC
netinfo_to_resolv_conf /etc/resolv.conf /run/"net-${FOUND_USER_NIC}.conf" /run/net-*.conf /run/net6-*.conf
echo "=== Zephyr: Mapping RBD ==="
rbd device map {{RBD_IMAGE}} --name=client.{{CLIENT_NAME}} || error_shell "Failed to map Ceph RBD root"
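
The NIC discovery in the script above parses `ip` output; an alternative sketch is to read the MAC addresses straight from sysfs. The function name is hypothetical, the sysfs directory is passed in so it can be tried against a test directory, and it assumes the MACs are written in lowercase, as sysfs reports them.

```shell
#!/bin/sh
# Sketch only: find_nic_by_mac is a hypothetical helper (POSIX sh).
# Prints the first interface whose MAC matches any of the given MACs.
find_nic_by_mac() {
    sysfs_net="$1"   # normally /sys/class/net
    shift
    for dev in "${sysfs_net}"/*; do
        [ -r "${dev}/address" ] || continue
        addr=$(cat "${dev}/address")
        for mac in "$@"; do
            if [ "${addr}" = "${mac}" ]; then
                printf '%s\n' "${dev##*/}"   # interface name only
                return 0
            fi
        done
    done
    return 1   # nothing matched
}

# Example:
#   FOUND_CEPH_NIC=$(find_nic_by_mac /sys/class/net ${CEPH_MACS}) \
#       || error_shell "No Ceph NIC found"
```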