@terrisgit
Last active July 29, 2023 16:11
A shell script that removes the beginning blocks of a file efficiently
#!/bin/sh
# Purpose
#
# This is a reverse truncation script intended for but not limited to log
# files. The oldest log entries are discarded.
#
# This script operates on bytes only -- not lines or even characters. It
# alters the sizes of specified files in place efficiently, but with a heavy
# hand. It removes the beginning bytes from a file until it is roughly the
# specified size. The file size is never smaller than the specified size but
# it may be larger because fallocate only works with whole numbers of disk
# blocks.
#
# Requirements
#
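# This script only works on Linux, and only on filesystems where fallocate
# supports collapsing a range (FALLOC_FL_COLLAPSE_RANGE, Linux 3.15 or later),
# such as ext4 and XFS. It needs the fallocate command from util-linux. See
# https://man7.org/linux/man-pages/man2/fallocate.2.html.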
#
# Distros that ship fallocate out of the box:
# - Amazon Linux
# Distros that probably require fallocate to be installed manually:
# - Alpine
# - See https://www.thegeekdiary.com/fallocate-command-not-found/
# Distros that definitely won't work:
# - macOS (brew install util-linux doesn't install fallocate despite what the Internet says)
#
# Usage
#
# ./truncate.sh threshold target file1 [file2 ...]
#
# 1. Threshold: Files larger than this size, in whole MiB, are
# reverse-truncated
# 2. Target: Files are truncated to roughly this size, in whole MiB
# 3, 4, 5, ...: The paths of files to modify
#
# Example
#
# $> ./truncate.sh 20 10 foo.txt
# If foo.txt is larger than 20 MiB, it will be modified to contain roughly
# the last 10 MiB of foo.txt.
# Both sizes and at least one file are required.
if [ "$#" -lt 3 ]
then
    >&2 echo "Usage: threshold truncate file1 [file2 ...]"
    exit 1
fi
threshold=$(($1)); shift
truncate=$(($1)); shift
if [ "$threshold" -lt 1 ]
then
    >&2 echo "'threshold' ($threshold) must be a positive integer"
    exit 1
fi
if [ "$truncate" -lt 1 ]
then
    >&2 echo "'truncate' ($truncate) must be a positive integer"
    exit 1
fi
if [ "$truncate" -gt "$threshold" ]
then
    >&2 echo "'threshold' ($threshold) must be greater than or equal to 'truncate' ($truncate)"
    exit 1
fi
# Convert MiB to bytes
threshold=$((threshold * 1048576))
truncate=$((truncate * 1048576))
for var in "$@"
do
    # Skip files that cannot be read
    bytes=$(wc -c < "$var") || continue
    if [ "$bytes" -gt "$threshold" ]
    then
        removeBytes=$((bytes - truncate))
        # Round down to a whole number of filesystem blocks, because
        # fallocate can only collapse block-aligned ranges.
        # stat -f -c %S is GNU syntax; this doesn't work on macOS
        blockSize=$(stat -f -c %S "$var")
        removeBlocks=$((removeBytes / blockSize))
        if [ "$removeBlocks" -gt 0 ]
        then
            removeBytes=$((removeBlocks * blockSize))
            echo "fallocate -c -l $removeBytes $var"
            # This doesn't work on macOS
            if fallocate -c -l "$removeBytes" "$var"
            then
                newbytes=$(wc -c < "$var")
                echo "$var: Removed $removeBytes bytes. Old size: $bytes bytes. New size: $newbytes bytes."
            fi
        fi
    fi
done
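
The block rounding that the script performs before calling fallocate can be sketched in isolation. This is only an illustration under stated assumptions: the helper name aligned_remove_bytes and the sample sizes are hypothetical and are not part of the script above.

```shell
#!/bin/sh
# aligned_remove_bytes is a hypothetical helper, not part of the script above.
# Arguments: current file size, target size, filesystem block size (all bytes).
# Prints the largest whole-block byte count that can be removed from the
# start of the file without leaving fewer than the target number of bytes.
aligned_remove_bytes() {
    size=$1
    target=$2
    block=$3
    if [ "$size" -le "$target" ]
    then
        echo 0
        return
    fi
    # Integer division rounds down to a whole number of blocks.
    echo $(( (size - target) / block * block ))
}

# Sample values: a 25,000,000-byte file, a 10 MiB target, 4096-byte blocks.
aligned_remove_bytes 25000000 10485760 4096
```

With these sample values, 14,514,240 bytes are eligible for removal. That rounds down to 3543 blocks (14,512,128 bytes), so the file keeps 10,487,872 bytes, slightly more than the 10 MiB target.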