nvidia-htop: nvidia-smi with a PID-to-username mapping

🧠 nvidia-htop

nvidia-htop is a lightweight Bash utility that provides an htop-style overview of GPU processes, showing the GPU index, PID, username, and command name for each active NVIDIA process — followed by the standard nvidia-smi summary.

It’s designed for users without root access and integrates cleanly into your shell environment.
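
At its core, the tool joins nvidia-smi's per-process query with ps to resolve each PID's owner. A minimal sketch of that idea (not the full script at the bottom of this page, which adds UUID-to-index mapping and several fallbacks):

# List GPU compute processes and look up the owning user of each PID
nvidia-smi --query-compute-apps=gpu_uuid,pid,process_name --format=csv,noheader |
while IFS=, read -r uuid pid name; do
    pid="$(echo "$pid" | xargs)"    # trim surrounding whitespace
    printf '%s %s %s\n' "$pid" "$(ps -o uname= -p "$pid" 2>/dev/null)" "$name"
done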


🚀 Features

  • Displays a clean ASCII table of GPU → PID → USER → COMMAND mappings.
  • Works with both CUDA compute and graphics (PMON) processes.
  • Shows the command name for each process.
  • Automatically falls back if /proc or NVML info is missing.
  • No sudo required.
  • Works perfectly with watch for live monitoring.

Example:

+-----+----------+--------------+------------------------------------------------------------+
| GPU | PID      | USER         | COMMAND                                                    |
+-----+----------+--------------+------------------------------------------------------------+
| 1   | 3273371  | s539y        | python train.py --cfg configs/exp1.yaml --epochs 50        |
| 3   | 3204386  | s539y        | torchrun --nproc_per_node=2 main.py --exp nnunetv2         |
+-----+----------+--------------+------------------------------------------------------------+

+---------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.07    Driver Version: 580.82.07    CUDA Version: 13.0          |
|============================+=====+====+=============+=====================+=====|
|   1  NVIDIA A100-SXM4-40GB | 55C | P0 | 309W / 400W | 20241MiB / 40960MiB | 87% |
|   3  NVIDIA A100-SXM4-40GB | 54C | P0 | 291W / 400W | 20243MiB / 40960MiB | 89% |
+---------------------------------------------------------------------------------+


🧩 Installation (No sudo required)

  1. Create your personal bin folder

    mkdir -p ~/.local/bin
  2. Add it to your PATH. Append this line to ~/.profile (so it applies to all sessions):

    echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.profile
    source ~/.profile

    To verify:

    echo $PATH | tr ':' '\n' | grep local/bin

    You should see:

    /home/<youruser>/.local/bin
    
  3. Create the script

    nano ~/.local/bin/nvidia-htop

    Paste the content of the nvidia-htop script (shown at the bottom of this page) into the file.

  4. Make it executable

    chmod +x ~/.local/bin/nvidia-htop
  5. Test it

    nvidia-htop

    You should see the table and nvidia-smi output.
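
For reference, the steps above condensed into a single block. This is a sketch that assumes you have already saved the script body (shown at the end of this page) to a local file named nvidia-htop.sh, which is a placeholder name:

mkdir -p ~/.local/bin
cp nvidia-htop.sh ~/.local/bin/nvidia-htop    # nvidia-htop.sh: your local copy of the script below (placeholder name)
chmod +x ~/.local/bin/nvidia-htop
grep -q '.local/bin' ~/.profile || echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.profile
source ~/.profile
nvidia-htop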


👀 Live monitoring with watch

You can refresh the view every 0.1 seconds:

watch -n 0.1 nvidia-htop

This works because ~/.local/bin is in your PATH, so watch can find the nvidia-htop command.

You can even use:

watch -d -n 0.1 nvidia-htop

to highlight changes dynamically (like htop).
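
If you monitor GPUs frequently, a shell alias keeps the invocation short. A small convenience sketch (gpuwatch is an arbitrary name, and the 1-second interval is only a suggestion, since nvidia-smi itself takes about a second anyway):

# Add to ~/.bashrc or ~/.zshrc
alias gpuwatch='watch -d -n 1 nvidia-htop'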


⚙️ Troubleshooting

  • watch says “command not found” → add export PATH="$HOME/.local/bin:$PATH" to ~/.profile and reload your shell.
  • Table shows only “python” but not the full args → check /proc/$pid/cmdline permissions; hidepid on /proc may restrict this for other users’ processes.
  • Command truncates → increase max=60 inside the script.
  • Slow output → the ~1-second delay comes from nvidia-smi utilization sampling (normal).
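
To see whether hidepid is in effect, you can inspect the mount options of /proc (a quick diagnostic; assumes a standard procfs mount):

# hidepid=1 or hidepid=2 in the options hides other users' /proc/<pid> details
grep ' /proc ' /proc/mounts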

🧩 Notes

  • Works on any Linux system with nvidia-smi available.
  • Fully user-space: no root privileges needed.
  • Compatible with bash ≥ 4.0 and zsh (see the note after this list).
  • Perfect for cluster users who want a quick overview of who’s using which GPU.
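
Regarding zsh: zsh login shells do not read ~/.profile by default, so the PATH line from the installation step can go into ~/.zprofile instead (an assumption about a typical zsh setup; adapt to your own configuration):

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zprofile
source ~/.zprofile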

Author: Karol Gotkowski
License: MIT
Tested on: Ubuntu 22.04, CentOS 8, and RHEL 9 with CUDA ≥ 11.8

#!/usr/bin/env bash
# Enhanced NVIDIA GPU process overview
nvidia-htop() {
    # 1) Map GPU UUID -> GPU index
    declare -A GPU_IDX
    while IFS=, read -r idx uuid; do
        idx="$(echo "$idx" | xargs)"
        uuid="$(echo "$uuid" | xargs)"
        [[ -n "$uuid" ]] && GPU_IDX["$uuid"]="$idx"
    done < <(nvidia-smi --query-gpu=index,uuid --format=csv,noheader)

    # 2) Collect entries (compute-apps preferred, pmon fallback)
    entries=()
    while IFS=, read -r uuid pid pname; do
        uuid="$(echo "$uuid" | xargs)"
        pid="$(echo "$pid" | xargs)"
        pname="$(echo "$pname" | sed 's/^ *//;s/ *$//')"
        [[ "$pid" =~ ^[0-9]+$ ]] || continue
        entries+=("$uuid,$pid,$pname")
    done < <(nvidia-smi --query-compute-apps=gpu_uuid,pid,process_name --format=csv,noheader 2>/dev/null)

    if [ "${#entries[@]}" -eq 0 ]; then
        # pmon fallback: GPU index, PID, COMMAND...
        while read -r line; do
            [[ "$line" =~ ^#|^GPU|^$ ]] && continue
            gpu=$(awk '{print $1}' <<<"$line")
            pid=$(awk '{print $2}' <<<"$line")
            # The command is taken from field 11 onward; the exact column depends
            # on your driver's pmon layout, so adjust if your pmon header differs.
            cmd=$(awk '{c=""; for(i=11;i<=NF;i++) c=c (i>11?" ":"") $i; print c}' <<<"$line")
            [[ "$gpu" =~ ^[0-9]+$ && "$pid" =~ ^[0-9]+$ ]] || continue
            entries+=("$gpu,$pid,$cmd")
        done < <(nvidia-smi pmon -c 1 2>/dev/null)
    fi

    # 3) Print table with robust process-name fallback
    if [ "${#entries[@]}" -gt 0 ]; then
        {
            echo "+------+----------+--------------+----------------------+"
            echo "| GPU  | PID      | USER         | PROCESS              |"
            echo "+------+----------+--------------+----------------------+"
            for e in "${entries[@]}"; do
                gpu_id="${e%%,*}"; rest="${e#*,}"
                pid="${rest%%,*}"; pname="${rest#*,}"
                # Map UUID -> index (pmon entries already carry the index)
                if [[ "$gpu_id" =~ ^GPU- ]]; then
                    gpu_disp="${GPU_IDX[$gpu_id]:-?}"
                else
                    gpu_disp="${gpu_id}"
                fi
                user="$(ps -o uname= -p "$pid" 2>/dev/null)"
                # If name is missing or truncated by nvidia-smi (starts with "...")
                if [ -z "$pname" ] || [ "$pname" = "N/A" ] || [[ "$pname" == \...* ]]; then
                    if pname="$(cat /proc/$pid/comm 2>/dev/null)"; then :
                    elif pname="$(ps -o comm= -p "$pid" 2>/dev/null)"; then :
                    elif pname="$(readlink -f /proc/$pid/exe 2>/dev/null)"; then
                        pname="$(basename "$pname")"
                    else
                        pname="?"
                    fi
                else
                    pname="$(basename "$pname")"
                fi
                printf "| %-4s | %-8s | %-12s | %-20s |\n" "$gpu_disp" "$pid" "$user" "$pname"
            done
            echo "+------+----------+--------------+----------------------+"
        } | sed 's/ *$//'
    else
        echo "No GPU processes found."
    fi

    # Classic nvidia-smi (adds ~1s due to utilization sampling)
    nvidia-smi
}
nvidia-htop "$@"