How I test and debug the Linux kernel scheduler

For my work on proxy-exec, I’ve built up a collection of tests that I use to try to strain things and uncover issues. I’m sure there are lots of better ways, but I am but a simple caveman. Folks have asked about my process, so I figured I'd try to document it here. I’m likely forgetting things, but I’ll try to update this as I think of them.

General configs that are useful to enable when hunting for bugs

(I'll usually keep a custom test defconfig in my kernel trees that has the options I find useful/helpful enabled, so others can easily re-create the same config):

CONFIG_PROVE_LOCKING
CONFIG_DEBUG_RT_MUTEXES
CONFIG_DEBUG_SPINLOCK
CONFIG_DEBUG_MUTEXES
CONFIG_DEBUG_WW_MUTEX_SLOWPATH
CONFIG_DEBUG_RWSEMS
CONFIG_DEBUG_LOCK_ALLOC
CONFIG_LOCKUP_DETECTOR
CONFIG_SOFTLOCKUP_DETECTOR
CONFIG_HARDLOCKUP_DETECTOR
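
If you don't already have a defconfig with these set, one quick way to flip them on in an existing .config (run from the top of a kernel source tree) is the in-tree scripts/config helper, then letting olddefconfig sort out any dependencies:

$ scripts/config -e PROVE_LOCKING -e DEBUG_RT_MUTEXES -e DEBUG_SPINLOCK \
    -e DEBUG_MUTEXES -e DEBUG_WW_MUTEX_SLOWPATH -e DEBUG_RWSEMS \
    -e DEBUG_LOCK_ALLOC -e LOCKUP_DETECTOR -e SOFTLOCKUP_DETECTOR \
    -e HARDLOCKUP_DETECTOR
$ make olddefconfig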

In-kernel tests:

  1. CONFIG_LOCK_TORTURE_TEST. I then test using the following boot parameters for the configuration string (a sketch of passing these on the kernel command line follows this list): "torture.random_shuffle=1 locktorture.writer_fifo=1 locktorture.torture_type=mutex_lock locktorture.nested_locks=8 locktorture.rt_boost=1 locktorture.rt_boost_factor=50 locktorture.stutter=0"
  2. CONFIG_WW_MUTEX_SELFTEST to exercise the ww-mutex die/wound logic. With my extension patches (hopefully to land upstream soon), I can trigger them to run repeatedly in a loop:
# while true; do echo 1 > /sys/kernel/test_ww_mutex/run_tests ; sleep 5; done
  3. My currently out-of-tree [ksched_football test](https://github.com/johnstultz-work/linux-dev/commit/b28fa89f27b3d8466fe3f8374aa3ed76c79dde75) (CONFIG_SCHED_RT_INVARIANT_TEST). This can often starve the system and gives the dl_server a workout. Re-run repeatedly in a loop:
# while true; do echo 10 > /sys/kernel/ksched_football/start_game; sleep 120; done
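
One way to wire the locktorture boot parameters from item 1 into a test run is QEMU's -append when direct-booting the kernel. A rough sketch (the bzImage path, disk image, and root device are just placeholders):

$ qemu-system-x86_64 -enable-kvm -smp 8 -m 4G -nographic \
    -kernel arch/x86/boot/bzImage -drive file=test.img,if=virtio \
    -append "root=/dev/vda rw console=ttyS0 torture.random_shuffle=1 \
locktorture.writer_fifo=1 locktorture.torture_type=mutex_lock locktorture.nested_locks=8 \
locktorture.rt_boost=1 locktorture.rt_boost_factor=50 locktorture.stutter=0"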

Userland

  1. rt-tests: A collection of tests for exercising the RT scheduling class. Usually I’ll run cyclictest to add some frequent RT preemptions via:
# ./cyclictest -t -p99
  2. Priority-inversion-demo: A userspace test that demonstrates cgroup-caused priority inversions and allows you to create and compare histograms. Often I will run this in a loop indefinitely:
# while true; do ./run.sh ; sleep 1; done
  3. Kselftest cpu-hotplug test: Found in the kernel source under tools/testing/selftests/cpu-hotplug/. I’ll usually run it in a loop like:
# while true; do ./cpu-on-off-test.sh -a; sleep 120; done
  4. stress-ng: An intense system stressor. It may effectively DoS your system, so it's not always great for distinguishing between system overload and a real bug. I’ll add it in when other stress testing hasn’t found anything. Run in a loop via:
# while true; do stress-ng -r `nproc` --timeout 300; sleep 90; done

I don’t run all of the above together all the time. I tend to pick a collection of four or so that can run in parallel without completely overwhelming the system (a rough example combination is sketched below).
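
As a rough illustration of the kind of combination I mean (assuming the various tools are built and sitting in the current directory, and the ksched_football patch is applied; adjust paths to wherever you keep things), something like this can be kicked off in the background inside the VM and left alone:

# ./cyclictest -t -p99 &
# while true; do ./run.sh ; sleep 1; done &
# while true; do ./cpu-on-off-test.sh -a; sleep 120; done &
# while true; do echo 10 > /sys/kernel/ksched_football/start_game; sleep 120; done &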

I sort of treat my trees by grades of stability: 10mins, 1+hrs, 6+hrs, 12+hrs, 48+hrs. When I'm actively hacking on things, I usually only run for ~10 minutes; off-for-lunch is an hour; then there's running for a chunk of the work day, running overnight, and running over the weekend. I try to take every opportunity to leave tests running when I can't be actively working on things. In a few situations where I had really tricky issues to debug, I've had to leave it running for 70+ hours to trip the problem, so this stress testing isn't always the fastest way to find issues.
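
If you'd rather bound a run to one of those grades than remember to stop it, wrapping the loop with timeout(1) works fine. A minimal sketch, e.g. for a 6-hour run of the priority-inversion-demo loop:

# timeout 6h sh -c 'while true; do ./run.sh ; sleep 1; done'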

Other bug-finding tricks:

  • I tend to do most of my testing in an x86 QEMU environment. I’m lucky to be able to run my QEMU VM with 64 cores, so that creates a lot of parallelism and makes it easier to trip races.
  • I will sometimes drop the “-enable-kvm” flag to QEMU. This really slows down the test environment (taking >20 minutes to boot with many of the boot-time in-kernel stress tests enabled). However, the combination of high CPU counts and very slow execution seems to open up a number of races, and this has been helpful in finding problems. Testing this way does take more patience than I usually have, though. (A sample invocation is sketched below.)
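
For reference, a sketch of the sort of invocation I mean, reusing the direct-boot setup from the locktorture example earlier (memory size and image paths are placeholders; leave out -enable-kvm to fall back to slow TCG emulation):

$ qemu-system-x86_64 -enable-kvm -smp 64 -m 16G -nographic \
    -kernel arch/x86/boot/bzImage -drive file=test.img,if=virtio \
    -append "root=/dev/vda rw console=ttyS0"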

Debugging tricks:

  • I usually run QEMU with arguments that ensure I always have the serial console logged to a file: “-chardev stdio,id=char0,mux=on,logfile=serial.log,signal=off -serial chardev:char0 -mon chardev=char0”
  • Always run QEMU with the “-gdb tcp::1234” option. Also pass “nokaslr” as a boot option to the kernel. Then if a hang or other problem arises, you can easily debug the kernel by running gdb on the host machine (a few follow-on gdb commands are sketched after this list):
$ gdb vmlinux -ex "target remote localhost:1234"
  • “printk.synchronous=1” as a boot argument has also been helpful when trying to chase down rare issues where printk loses lines.
  • trace_printk() is your friend. Make sure you also have “ftrace_dump_on_oops” as a kernel boot argument.
  • When I suspect I’m hitting a race, or am worried that one may be present, I’ll add a udelay(500); (sometimes going as high as 2000) or something like udelay(100*raw_smp_processor_id()); right after a lock is released, to try to widen the windows where races might occur.
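
Once gdb is attached as above (and assuming CONFIG_GDB_SCRIPTS is enabled so gdb can auto-load the kernel's vmlinux-gdb.py helpers when started from the build directory), a typical first pass at a hang looks something like:

(gdb) lx-dmesg
(gdb) info threads
(gdb) thread apply all bt

lx-dmesg dumps the in-kernel log buffer, info threads shows one thread per virtual CPU, and thread apply all bt grabs backtraces across every CPU, which is usually enough to see where things are stuck.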