@mibli
Last active July 3, 2025 14:38
Use LF as a Directory Tree Diff Walker

Rationale

Recently I needed to reinstall my Linux system to change the partition table from MBR to GPT. Before doing that, I need to find the files I have configured on the current system. I could use a bunch of applications for that, but none of them offers a holistic approach. So how about a diff walker (similar to disk-usage walkers)? I could set up a directory with a bare system and investigate what I've added manually. Existing options for that are limited, but it was easy to implement a simple version myself, using lf and its recently added on-load feature.

How it works

  • Basically, we generate a cache with the differences between two directories.
  • Then we use on-load to set custom display information in the view.
  • To start the process, we use an lf command.
  • We switch the previewer to a diff previewer while in diff mode.

This lets us pull the cached differences and investigate what has changed, though the approach has many other uses.
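To make the cache idea concrete, here is a minimal sketch of a lookup. It assumes the cache holds one "path percent" pair per line, which is the format the attached caching script writes. The helper name lookup_percent and the sample paths are made up for illustration and are not part of the gist.

```shell
#!/bin/bash
# Hypothetical helper: print the cached percentage for one exact path.
# Assumes each cache line is "<absolute path> <percent>".
lookup_percent() {
    local cache_file="$1" path="$2"
    # Compare the path literally (no regex) and treat the last field as the
    # percentage, so paths containing spaces still work. Paths containing
    # backslashes are not handled, because awk -v interprets escapes.
    awk -v p="$path" '
        { pct = $NF; key = substr($0, 1, length($0) - length(pct) - 1) }
        key == p { print pct; exit }
    ' "$cache_file"
}

# Sample cache with made-up entries.
cache="$(mktemp)"
printf '%s\n' '/etc/fstab 40' '/etc/my file.conf 0' > "$cache"
lookup_percent "$cache" '/etc/fstab'          # prints 40
lookup_percent "$cache" '/etc/my file.conf'   # prints 0
rm -f "$cache"
```

A path that is not in the cache prints nothing, which a caller can treat as "no information".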

Preview

user@machine:~/r/i3switch/.git                                                                       
  .git                        119%    -    branches                                         100%    -  
  .github                     100%    -    filter-repo                                      100%    -  
  cpp                         100%    -    git-cliff-core                                   100%    -  
  docs                        100%    -    hooks                                            0%      -  
  python                      100%    -    info                                             50%     -  
  rust                        100%    -    logs                                             136%    -  
  scripts                     100%    -    modules                                          23%     -  
  .clang-format               0%    32B    objects                                          100%    -  
  .gitignore                  120% 213B    refs                                             122%    -  
  Makefile                    100% 799B    COMMIT_EDITMSG                                   200%  46B  
  README.md                   188% 3.5K    config                                           145% 279B  
  tags                        100% 3.1M    description                                      0%    73B  
                                           fast_import_crash_2704499                        100% 735B  
                                           fast_import_crash_3105971                        100% 465B  
                                           FETCH_HEAD                                       400%  87B  
                                           HEAD                                             200%  33B  
                                           index                                            100% 7.8K  
                                           ORIG_HEAD                                        200%  41B  
                                           packed-refs                                      121% 854B  

drwxr-xr-x 11 user user 4.0K Thu Jul  3 13:09:24 2025                                1   1/12

Implementation

Diff caching and previewing scripts

The scripts are attached. The caching script takes a base directory path and a reference directory path and generates the cache in ~/.cache/lf/diff_cache.txt. We will put it in ~/.config/lf/gen_diff_cache.sh. The previewer script displays the diff between the two versions of a file, so we can glance at what has changed. We will put it in the same directory, as ~/.config/lf/diff_previewer. Both files need to be executable.
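The percentages in the preview above can exceed 100%. The sketch below reproduces the arithmetic of the attached caching script's text-file branch (lines reported by comm -3, divided by the line count of the base file) to show where such values come from. The function name estimate_diff_percent and the sample files are illustrative. The inputs are kept sorted here because comm expects sorted input.

```shell
#!/bin/bash
# Same arithmetic as the text branch of calc_file_diff, applied to two files.
estimate_diff_percent() {
    local base="$1" ref="$2" total_lines diff_lines
    total_lines=$(wc -l < "$base")
    [ "$total_lines" -eq 0 ] && total_lines=1   # avoid division by zero
    # comm -3 prints the lines unique to either file, one per output line.
    diff_lines=$(comm -3 "$base" "$ref" 2>/dev/null | wc -l)
    echo $(( diff_lines * 100 / total_lines ))
}

tmp="$(mktemp -d)"
printf '%s\n' a b c d > "$tmp/base"
printf '%s\n' a b d x > "$tmp/one_changed"
printf '%s\n' a b c d e f g h i j k l > "$tmp/grown"
estimate_diff_percent "$tmp/base" "$tmp/base"          # prints 0
estimate_diff_percent "$tmp/base" "$tmp/one_changed"   # prints 50
estimate_diff_percent "$tmp/base" "$tmp/grown"         # prints 200
rm -r "$tmp"
```

A changed line counts twice (once per side), and lines that exist only in the reference file are still divided by the base file's line count, so the result is a rough estimate rather than a true percentage.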

LF function

Now we add two functions to lfrc (typically ~/.config/lf/lfrc): one that starts the caching process and enters diff mode, replacing the default previewer, and one that leaves diff mode and restores the previewer.

set user_diff_mode false
set user_diff_base_path ""
set user_diff_ref_path ""
set user_diff_cache_path "$HOME/.cache/lf/diff_cache.txt"
set user_diff_previewer ~/.config/lf/diff_previewer

cmd start_diff ${{
    set -f
    base_path="${f%/*}"
    ref_path="$1"
    if [ $# -ne 1 ]; then
        printf "Specify reference dir: "
        read -r ref_path
    fi
    [ -z "$ref_path" ] && ref_path="$(pwd)"
    ref_path="$(realpath "$ref_path")"
    # exit, not return: this body runs as a script, not inside a function
    [ -d "$ref_path" ] || exit 1

    echo "You're about to start a diff session with:"
    echo "base path: $base_path"
    echo "reference path: $ref_path"
    printf "Generating cache will take a while, continue? [y/n] "
    read -r ans
    [ "$ans" = "y" ] || exit 1

    # Generate the cache first, so it is complete when reload triggers on-load
    "$HOME/.config/lf/gen_diff_cache.sh" "$base_path" "$ref_path" || exit 1

    # Swap previewers: remember the current one, switch to the diff previewer.
    # Plain concatenation instead of += keeps this working under a POSIX sh.
    cmds="set user_diff_mode true; "
    cmds="${cmds}set user_diff_base_path '$base_path'; "
    cmds="${cmds}set user_diff_ref_path '$ref_path'; "
    cmds="${cmds}set user_diff_previewer '$lf_previewer'; "
    cmds="${cmds}set previewer '$lf_user_diff_previewer'; "
    cmds="${cmds}set hidden; reload"

    lf -remote "send $id :$cmds"
}}

cmd stop_diff ${{
    set -f
    # Swap the previewers back: user_diff_previewer holds the original one
    # while diff mode is active.
    cmds="set user_diff_mode false; "
    cmds="${cmds}set user_diff_base_path ''; "
    cmds="${cmds}set user_diff_ref_path ''; "
    cmds="${cmds}set user_diff_previewer '$lf_previewer'; "
    cmds="${cmds}set previewer '$lf_user_diff_previewer'; "
    cmds="${cmds}reload"

    lf -remote "send $id :$cmds"
}}

The start_diff function takes the directory of the current file as the base directory, and takes the reference directory either as an argument or by asking the user. It then updates the lf user options to enable diff mode and to store the paths we provided, and swaps the previewer with the diff previewer (stop_diff swaps them back). It also unhides hidden files to make sure we don't miss anything, and triggers a reload so that on-load refreshes the custom fields.

On-load hook

Finally, when loading a directory, we look up its files in the diff cache and send the custom data to lf so it can display it in the view.

But before this works, make sure you have custom in the info option, e.g.:

set info custom:size # adds custom info and file size to the view

Here's the hook that we use:

cmd on-load &{{
    # exit, not return: this body runs as a script, not inside a function
    if [ "$lf_user_diff_mode" = true ]; then
        case "$1" in
            "$lf_user_diff_base_path"*) ;;
            *) exit 0 ;; # let's not waste time if the base path is not correct
        esac
        # The option value holds a literal $HOME, so expand it here
        diff_cache_path=$(printf '%s' "$lf_user_diff_cache_path" | envsubst)
        cmds=""
        for file in "$@"; do
            # First cache line for this path; note that $file is used as a regex
            diff_line="$(grep -m 1 "^$file " "$diff_cache_path")"
            [ -n "$diff_line" ] || continue
            percent="${diff_line##* }" # last field, so paths with spaces work
            if [ "$percent" -eq "0" ]; then
                percent="\033[1;32m$percent%\033[0m" # print in green
            elif [ "$percent" -le "50" ]; then
                percent="\033[1;33m$percent%\033[0m" # print in yellow
            else
                percent="\033[1;31m$percent%\033[0m" # print in red
            fi
            cmds="${cmds}addcustominfo '$file' \"$percent\";"
        done
        # Nothing to send if no file was found in the cache
        [ -n "$cmds" ] && lf -remote "send $id :$cmds"
        exit 0
    fi
    # You can keep your on-load things below, that will not trigger in diff mode
}}
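The color thresholds used by the hook can be tried outside lf. The sketch below mirrors them in a standalone function. The name colorize_percent is hypothetical, and the function prints the escape sequence as literal text, the same form the hook places inside the double-quoted addcustominfo argument.

```shell
#!/bin/bash
# Hypothetical helper mirroring the hook's thresholds:
# 0 -> green (32), 1..50 -> yellow (33), anything higher -> red (31).
colorize_percent() {
    local percent="$1" color
    if [ "$percent" -eq 0 ]; then
        color=32
    elif [ "$percent" -le 50 ]; then
        color=33
    else
        color=31
    fi
    # Print the literal text \033[...m, not an actual escape character.
    printf '\\033[1;%sm%s%%\\033[0m\n' "$color" "$percent"
}

colorize_percent 0     # prints \033[1;32m0%\033[0m
colorize_percent 23    # prints \033[1;33m23%\033[0m
colorize_percent 188   # prints \033[1;31m188%\033[0m
```

Values above 100 fall into the red branch, which matches how entries such as 188% are treated by the hook.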

Finalizing

To start a diff session, we run :start_diff ../ref_dir and confirm that we want to continue. When we're done, we run :stop_diff and we are back in normal mode. Neat!

diff_previewer

#!/bin/bash
# Preview the diff between a file and its counterpart in the reference tree.
if [ "$lf_user_diff_mode" = "true" ]; then
    # Check the paths before using them
    [ -z "$lf_user_diff_base_path" ] && exit 1
    [ -z "$lf_user_diff_ref_path" ] && exit 1
    # Path of the previewed file relative to the base directory
    rel_path=${1#"$lf_user_diff_base_path/"}
    if [ ! -e "$lf_user_diff_ref_path/$rel_path" ]; then
        echo "File not present in reference path:"
        echo "$lf_user_diff_ref_path/$rel_path"
    else
        # diff exits with 0 only when there are no differences
        if diff -u --color=always "$1" "$lf_user_diff_ref_path/$rel_path"; then
            echo "Files are identical."
        fi
    fi
fi
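The previewer finds the counterpart of a file by stripping the base directory prefix and appending the remainder to the reference directory. A minimal sketch of that mapping, with a hypothetical function name and made-up paths:

```shell
#!/bin/bash
# Hypothetical helper: map a path under the base tree to the same relative
# location under the reference tree.
map_to_reference() {
    local base="$1" ref="$2" path="$3"
    # Quoting "$base/" inside the expansion makes it a literal prefix,
    # not a glob pattern.
    local rel="${path#"$base/"}"
    printf '%s\n' "$ref/$rel"
}

map_to_reference /home/user/sys /mnt/bare /home/user/sys/etc/fstab
# prints /mnt/bare/etc/fstab
```

If the path is not under the base directory, the prefix is not removed and the result is not meaningful, which is why the on-load hook also checks the base path before doing any work.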
gen_diff_cache.sh

#!/bin/bash
CACHE_DIR="$HOME/.cache/lf"
CACHE_LIST_FILE="$CACHE_DIR/diff_cache_list.txt"
CACHE_FILE="$CACHE_DIR/diff_cache.txt"
BASE_DIR=""
REF_DIR=""
VERBOSE=false
QUIET=false

# Ensure cache directory exists
mkdir -p "$CACHE_DIR"

# Succeeds when file(1) describes both paths with the given type string
match_type() {
    local path1="$1"
    local path2="$2"
    local match="$3"
    file "$path1" | grep -q "$match" && file "$path2" | grep -q "$match"
}
# Function to calculate file difference percentage
calc_file_diff() {
    local base_dir="$1"
    local ref_dir="$2"
    local rel_path="$3"
    local total_lines diff_lines
    if match_type "$base_dir/$rel_path" "$ref_dir/$rel_path" "text"; then
        # Text file: use comm and estimate the difference from the number of
        # lines unique to either file, relative to the base file's line count.
        # This is a rough estimate and can exceed 100.
        total_lines=$(wc -l < "$base_dir/$rel_path")
        [ "$total_lines" -eq 0 ] && total_lines=1
        diff_lines=$(comm -3 "$base_dir/$rel_path" "$ref_dir/$rel_path" 2>/dev/null | wc -l)
        echo $(( (diff_lines * 100) / total_lines ))
    elif match_type "$base_dir/$rel_path" "$ref_dir/$rel_path" "symbolic link"; then
        # Symlink: compare target paths
        if [ "$(readlink "$base_dir/$rel_path")" = "$(readlink "$ref_dir/$rel_path")" ]; then
            echo "0" # Identical
        else
            echo "100" # Different
        fi
    else
        # Binary file: compare checksums
        if [ "$(sha256sum "$base_dir/$rel_path" | cut -d' ' -f1)" = "$(sha256sum "$ref_dir/$rel_path" | cut -d' ' -f1)" ]; then
            echo "0" # Identical
        else
            echo "100" # Different
        fi
    fi
}
# Function to calculate directory difference (average of contents)
calc_dir_diff() {
    local base_dir="$1"
    local ref_dir="$2"
    local rel_path="$3"
    local file_count=0 total_diff=0 path percent
    # The path list is reverse-sorted, so a directory's entries are already in
    # the cache when we reach the directory itself. Instead of diffing again,
    # we average the cached percentages of its direct children.
    while read -r path percent; do
        file_count=$((file_count + 1))
        total_diff=$((total_diff + percent))
    done < <(grep -E "^$base_dir/$rel_path/[^/]+$" "$CACHE_FILE")
    if [ "$file_count" -eq 0 ]; then
        echo "100" # No files in directory
        return
    fi
    echo $((total_diff / file_count))
}
# Function to calculate the difference for one relative path
calc_diff() {
    local base_dir="$1"
    local ref_dir="$2"
    local rel_path="$3"
    if [ -f "$base_dir/$rel_path" ] && [ -f "$ref_dir/$rel_path" ]; then
        calc_file_diff "$base_dir" "$ref_dir" "$rel_path"
    elif [ -d "$base_dir/$rel_path" ] && [ -d "$ref_dir/$rel_path" ]; then
        calc_dir_diff "$base_dir" "$ref_dir" "$rel_path"
    else
        echo "100" # Path is missing on one side, or the types differ
    fi
}
# Main logic
HELP="
Usage: $0 [OPTIONS] BASELINE_DIR REFERENCE_DIR
Options:
    -h       Show this help message
    -o FILE  Output file for diff cache (default: $CACHE_FILE)
    -q       Quiet mode, suppress progress output
    -v       Verbose mode, show detailed output
"
while getopts ho:qv opt; do
    case "$opt" in
        o) CACHE_FILE="$OPTARG" ;;
        h) echo "$HELP"; exit 0 ;;
        q) QUIET=true ;;
        v) VERBOSE=true ;;
        \?) echo "Error: Invalid option, seek help with '$0 -h'"; exit 1 ;;
    esac
done
shift $((OPTIND - 1))

if [ $# -eq 2 ]; then
    BASE_DIR="$1"
    shift
fi
REF_DIR="$1"
shift
if [ "$#" -ne 0 ]; then
    echo "Error: Unexpected arguments: $*"
    echo "Seek help with '$0 -h'"
    exit 1
fi

[ -z "$BASE_DIR" ] && { echo "Error: Base directory required"; exit 1; }
[ ! -d "$BASE_DIR" ] && { echo "Error: Base directory $BASE_DIR does not exist"; exit 1; }
[ -z "$REF_DIR" ] && { echo "Error: Reference directory required"; exit 1; }
[ ! -d "$REF_DIR" ] && { echo "Error: Reference directory $REF_DIR does not exist"; exit 1; }
# Initialize (truncate) the cache
: > "$CACHE_FILE"

# Combine paths from both trees using find. The prune patterns are relative
# because find runs on ./ inside each tree.
prune_dirs=( "./proc" "./sys" "./dev" "./run" "./var/cache" )
prune_args=()
for dir in "${prune_dirs[@]}"; do
    prune_args+=( -path "$dir" -prune -o )
done
{
    # Subshells keep each cd local, so a relative REF_DIR still resolves
    ( cd "$BASE_DIR" && find ./ "${prune_args[@]}" -print 2>/dev/null )
    ( cd "$REF_DIR" && find ./ "${prune_args[@]}" -print 2>/dev/null )
} | sort -ru > "$CACHE_LIST_FILE"

total_files=$(wc -l < "$CACHE_LIST_FILE")
current_file=0
while IFS= read -r path; do
    rel_path="${path#./}"
    diff_percent=$(calc_diff "$BASE_DIR" "$REF_DIR" "$rel_path")
    if [ "$VERBOSE" = true ]; then
        echo "Processing $rel_path: $diff_percent%"
    elif [ "${QUIET:-false}" != true ]; then
        current_file=$((current_file + 1))
        echo -ne "Processing $current_file/$total_files files\r"
    fi
    echo "$BASE_DIR/$rel_path $diff_percent" >> "$CACHE_FILE"
done < "$CACHE_LIST_FILE"

rm "$CACHE_LIST_FILE"
chown "$USER:$USER" "$CACHE_FILE"
echo "Diff cache generated at $CACHE_FILE"
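Directory percentages are the integer average of the cached values of a directory's direct children, including subdirectories, which already carry their own average. The sketch below shows that averaging on a small made-up cache. The function name average_children is hypothetical, and it matches paths literally instead of using the regex that the script passes to grep.

```shell
#!/bin/bash
# Hypothetical helper: average the percentages of a directory's direct
# children, read from a cache of "<path> <percent>" lines.
average_children() {
    local cache_file="$1" dir="$2"
    awk -v d="$dir/" '
        {
            pct = $NF
            key = substr($0, 1, length($0) - length(pct) - 1)
            rest = substr(key, length(d) + 1)
            # Direct child: starts with "dir/" and has no further slash.
            if (index(key, d) == 1 && rest != "" && index(rest, "/") == 0) {
                sum += pct; n++
            }
        }
        # Like the script, report 100 for a directory with no cached children.
        END { if (n == 0) print 100; else print int(sum / n) }
    ' "$cache_file"
}

cache="$(mktemp)"
printf '%s\n' \
    '/base/etc/sub/c 50' \
    '/base/etc/sub 50' \
    '/base/etc/b 100' \
    '/base/etc/a 0' > "$cache"
average_children "$cache" /base/etc   # prints 50
rm -f "$cache"
```

Because every child has equal weight, one large changed file and one tiny unchanged file average to 50%, regardless of their sizes.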