@thesuhu
Last active November 16, 2025 06:11
File Cleanup Scripts
#!/bin/bash
# cleanup_files.sh - SAFE cleanup of files older than the specified number of days
BASE_DIR="/sftp1/data"
DAYS_OLD=180
DATE=$(date '+%Y-%m-%d %H:%M:%S')
VERBOSE=1
# SAFETY FEATURE: Set to 1 for actual deletion, 0 for test mode
DELETE_MODE=0

if [ "$DELETE_MODE" -eq 0 ]; then
    echo "[$DATE] *** TEST MODE - NO FILES WILL BE DELETED ***"
    echo "[$DATE] Set DELETE_MODE=1 to actually delete files"
else
    echo "[$DATE] *** DANGER: DELETE MODE ACTIVE ***"
fi
echo "[$DATE] Scanning for files older than $DAYS_OLD days"

TOTAL_FILES=0
TOTAL_BYTES=0
for DIR in "$BASE_DIR/csv" "$BASE_DIR/xls"; do
    if [ -d "$DIR" ]; then
        FOLDER_NAME=$(basename "$DIR")
        COUNT=$(find "$DIR" -type f -mtime +"$DAYS_OLD" 2>/dev/null | wc -l)
        if [ "$COUNT" -gt 0 ]; then
            # Calculate total size in bytes (column 5 of ls -ln).
            # printf "%.0f" keeps large sums out of exponent notation in some awk implementations.
            FOLDER_SIZE=$(find "$DIR" -type f -mtime +"$DAYS_OLD" -exec ls -ln {} + 2>/dev/null | awk '{sum+=$5} END {printf "%.0f\n", sum}')
            # Convert to human readable
            if [ "$FOLDER_SIZE" -ge 1073741824 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fGB\", $FOLDER_SIZE/1073741824}")
            elif [ "$FOLDER_SIZE" -ge 1048576 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fMB\", $FOLDER_SIZE/1048576}")
            elif [ "$FOLDER_SIZE" -ge 1024 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fKB\", $FOLDER_SIZE/1024}")
            else
                HUMAN_SIZE="${FOLDER_SIZE}B"
            fi
            echo "[$DATE] $FOLDER_NAME: Found $COUNT files ($HUMAN_SIZE)"
            if [ "$VERBOSE" -eq 1 ]; then
                echo "[$DATE] $FOLDER_NAME oldest files sample (showing first 5):"
                # "-exec ... {} +" hands files to ls in batches, so -tr can sort them oldest first
                find "$DIR" -type f -mtime +"$DAYS_OLD" -exec ls -lhtr {} + 2>/dev/null | head -5 | while read -r line; do
                    echo " $line"
                done
            fi
            # Actual deletion only if DELETE_MODE=1
            if [ "$DELETE_MODE" -eq 1 ]; then
                # find exits non-zero if any file could not be deleted; errors go to stderr
                if find "$DIR" -type f -mtime +"$DAYS_OLD" -delete; then
                    echo "[$DATE] $FOLDER_NAME: DELETED $COUNT files ($HUMAN_SIZE freed)"
                else
                    echo "[$DATE] $FOLDER_NAME: WARNING - some files could not be deleted" >&2
                fi
            else
                echo "[$DATE] $FOLDER_NAME: Would delete $COUNT files ($HUMAN_SIZE)"
            fi
            TOTAL_FILES=$((TOTAL_FILES + COUNT))
            TOTAL_BYTES=$((TOTAL_BYTES + FOLDER_SIZE))
        else
            echo "[$DATE] $FOLDER_NAME: No old files found"
        fi
    else
        echo "[$DATE] Directory not found: $DIR"
    fi
done

# Total summary
if [ "$TOTAL_BYTES" -ge 1073741824 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fGB\", $TOTAL_BYTES/1073741824}")
elif [ "$TOTAL_BYTES" -ge 1048576 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fMB\", $TOTAL_BYTES/1048576}")
elif [ "$TOTAL_BYTES" -ge 1024 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fKB\", $TOTAL_BYTES/1024}")
else
    TOTAL_HUMAN="${TOTAL_BYTES}B"
fi
echo "[$DATE] TOTAL: $TOTAL_FILES files, $TOTAL_HUMAN"
if [ "$DELETE_MODE" -eq 0 ]; then
    echo "[$DATE] *** TEST COMPLETED - Change DELETE_MODE=1 to actually delete ***"
else
    echo "[$DATE] Cleanup completed - $TOTAL_FILES files deleted"
fi
#!/bin/bash
# cleanup_keep2weeks.sh - SAFE cleanup of files, keeping the last 2 weeks (14 days)
BASE_DIR="/sftp1/data"
DAYS_OLD=14
DATE=$(date '+%Y-%m-%d %H:%M:%S')
VERBOSE=1
# SAFETY FEATURE: Set to 1 for actual deletion, 0 for test mode
DELETE_MODE=0

if [ "$DELETE_MODE" -eq 0 ]; then
    echo "[$DATE] *** TEST MODE - NO FILES WILL BE DELETED ***"
    echo "[$DATE] Set DELETE_MODE=1 to actually delete files"
else
    echo "[$DATE] *** DANGER: DELETE MODE ACTIVE ***"
fi
echo "[$DATE] Scanning for files older than $DAYS_OLD days (keeping last 2 weeks)"

TOTAL_FILES=0
TOTAL_BYTES=0
for DIR in "$BASE_DIR/csv" "$BASE_DIR/xls"; do
    if [ -d "$DIR" ]; then
        FOLDER_NAME=$(basename "$DIR")
        COUNT=$(find "$DIR" -type f -mtime +"$DAYS_OLD" 2>/dev/null | wc -l)
        if [ "$COUNT" -gt 0 ]; then
            # Calculate total size in bytes (column 5 of ls -ln).
            # printf "%.0f" keeps large sums out of exponent notation in some awk implementations.
            FOLDER_SIZE=$(find "$DIR" -type f -mtime +"$DAYS_OLD" -exec ls -ln {} + 2>/dev/null | awk '{sum+=$5} END {printf "%.0f\n", sum}')
            # Convert to human readable
            if [ "$FOLDER_SIZE" -ge 1073741824 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fGB\", $FOLDER_SIZE/1073741824}")
            elif [ "$FOLDER_SIZE" -ge 1048576 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fMB\", $FOLDER_SIZE/1048576}")
            elif [ "$FOLDER_SIZE" -ge 1024 ]; then
                HUMAN_SIZE=$(awk "BEGIN {printf \"%.2fKB\", $FOLDER_SIZE/1024}")
            else
                HUMAN_SIZE="${FOLDER_SIZE}B"
            fi
            echo "[$DATE] $FOLDER_NAME: Found $COUNT files ($HUMAN_SIZE)"
            if [ "$VERBOSE" -eq 1 ]; then
                echo "[$DATE] $FOLDER_NAME oldest files sample (showing first 5):"
                # "-exec ... {} +" hands files to ls in batches, so -tr can sort them oldest first
                find "$DIR" -type f -mtime +"$DAYS_OLD" -exec ls -lhtr {} + 2>/dev/null | head -5 | while read -r line; do
                    echo " $line"
                done
            fi
            # Actual deletion only if DELETE_MODE=1
            if [ "$DELETE_MODE" -eq 1 ]; then
                # find exits non-zero if any file could not be deleted; errors go to stderr
                if find "$DIR" -type f -mtime +"$DAYS_OLD" -delete; then
                    echo "[$DATE] $FOLDER_NAME: DELETED $COUNT files ($HUMAN_SIZE freed)"
                else
                    echo "[$DATE] $FOLDER_NAME: WARNING - some files could not be deleted" >&2
                fi
            else
                echo "[$DATE] $FOLDER_NAME: Would delete $COUNT files ($HUMAN_SIZE)"
            fi
            TOTAL_FILES=$((TOTAL_FILES + COUNT))
            TOTAL_BYTES=$((TOTAL_BYTES + FOLDER_SIZE))
        else
            echo "[$DATE] $FOLDER_NAME: No old files found"
        fi
    else
        echo "[$DATE] Directory not found: $DIR"
    fi
done

# Total summary
if [ "$TOTAL_BYTES" -ge 1073741824 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fGB\", $TOTAL_BYTES/1073741824}")
elif [ "$TOTAL_BYTES" -ge 1048576 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fMB\", $TOTAL_BYTES/1048576}")
elif [ "$TOTAL_BYTES" -ge 1024 ]; then
    TOTAL_HUMAN=$(awk "BEGIN {printf \"%.2fKB\", $TOTAL_BYTES/1024}")
else
    TOTAL_HUMAN="${TOTAL_BYTES}B"
fi
echo "[$DATE] TOTAL: $TOTAL_FILES files, $TOTAL_HUMAN"
if [ "$DELETE_MODE" -eq 0 ]; then
    echo "[$DATE] *** TEST COMPLETED - Change DELETE_MODE=1 to actually delete ***"
else
    echo "[$DATE] Cleanup completed - $TOTAL_FILES files deleted"
fi

File Cleanup Scripts

Automated cleanup scripts for removing old files from /sftp1/data/csv and /sftp1/data/xls directories.

Available Scripts

1. cleanup_files.sh - 6 Months Retention

Removes files older than 180 days (6 months).

Use case: Regular maintenance to keep only recent data.

2. cleanup_keep2weeks.sh - 2 Weeks Retention

Removes files older than 14 days (2 weeks), keeping only the last 2 weeks of data.

Use case: Aggressive cleanup when disk space is critical.

Safety Features

Both scripts include built-in safety mechanisms:

  • Test Mode by Default (DELETE_MODE=0)

    • Shows what would be deleted WITHOUT actually deleting
    • Displays file count and total size
    • Shows sample of oldest files
  • Delete Mode (DELETE_MODE=1)

    • Actually deletes the files
    • Only activate after reviewing test mode output

Usage

Step 1: Test Run (Safe)

# Run in test mode (default, safe)
./cleanup_files.sh

Example output (abbreviated; the scan banner, the verbose file sample, and the xls lines are omitted):

[2025-11-16 10:30:00] *** TEST MODE - NO FILES WILL BE DELETED ***
[2025-11-16 10:30:00] csv: Found 611 files (1.25GB)
[2025-11-16 10:30:00] csv: Would delete 611 files (1.25GB)
[2025-11-16 10:30:00] TOTAL: 935 files, 2.10GB

Step 2: Review Output

Check the test output carefully:

  • Verify file counts make sense
  • Check total size to be deleted
  • Review sample of oldest files shown
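
The five-file sample printed in verbose mode is only a preview. For a fuller review, one option is to list every candidate with its modification date and size, oldest first. The sketch below assumes GNU find (for -printf); list_candidates is a hypothetical helper and is not part of either script.

```shell
# Hypothetical helper (not in the scripts): print "date size path" for every
# file under $1 that matches -mtime +$2, sorted so the oldest dates come first.
# Assumes GNU find, which provides -printf.
list_candidates() {
    local dir=$1 days=$2
    find "$dir" -type f -mtime +"$days" \
        -printf '%TY-%Tm-%Td %s %p\n' 2>/dev/null | sort
}

# Example: list_candidates /sftp1/data/csv 180 | less
```

Because the helper only prints, it is safe to run regardless of the DELETE_MODE setting in the scripts.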

Step 3: Execute Deletion (After Review)

# Edit the script and change the line
#   DELETE_MODE=0
# to
#   DELETE_MODE=1

# Then run:
./cleanup_files.sh
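
If editing by hand is inconvenient, the switch can also be made with sed. This is a minimal sketch, assuming GNU sed (for -i) and that the DELETE_MODE assignment sits on its own unindented line, as it does in both scripts; set_delete_mode is a hypothetical helper name.

```shell
# Hypothetical helper: set DELETE_MODE in a script file to 0 or 1.
# Assumes GNU sed (-i edits the file in place).
set_delete_mode() {
    local script=$1 mode=$2
    case "$mode" in
        0|1) ;;
        *) echo "mode must be 0 or 1" >&2; return 1 ;;
    esac
    # Only a line that is exactly DELETE_MODE=0 or DELETE_MODE=1 is rewritten
    sed -i "s/^DELETE_MODE=[01]\$/DELETE_MODE=$mode/" "$script"
    # Print the resulting line so the change can be confirmed by eye
    grep -n '^DELETE_MODE=' "$script"
}

# Example: set_delete_mode ./cleanup_files.sh 1
```

Switching back with set_delete_mode ./cleanup_files.sh 0 after a manual run restores the safe default.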

Setup for Cron Job

Installation

# Copy scripts to system location
sudo cp cleanup_files.sh /usr/local/bin/
sudo cp cleanup_keep2weeks.sh /usr/local/bin/
sudo chmod +x /usr/local/bin/cleanup_*.sh

Cron Schedule Examples

# Edit crontab (the job's user needs write access to /var/log and delete rights on the data directories)
crontab -e

# Run cleanup_files.sh daily at 2 AM
0 2 * * * /usr/local/bin/cleanup_files.sh >> /var/log/cleanup-6months.log 2>&1

# Run cleanup_keep2weeks.sh weekly on Sunday at 3 AM
0 3 * * 0 /usr/local/bin/cleanup_keep2weeks.sh >> /var/log/cleanup-2weeks.log 2>&1

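IMPORTANT: Cron runs the script exactly as saved. With the default DELETE_MODE=0 the scheduled job only logs what it would delete, so set DELETE_MODE=1 in the script before scheduling, and only after reviewing a test run.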

Configuration Variables

Each script can be customized by editing these variables:

BASE_DIR="/sftp1/data"        # Base directory path
DAYS_OLD=180                   # Retention period (180 or 14)
VERBOSE=1                      # 1=show details, 0=minimal output
DELETE_MODE=0                  # 0=test mode, 1=actual deletion
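
As an alternative to editing the file for every change, the same variables could take their values from the environment and fall back to the defaults above. This is a sketch of a possible variant, not how the scripts are currently written; is_uint is a hypothetical helper.

```shell
# Hypothetical variant of the configuration block: keep the script defaults,
# but let the environment override them.
BASE_DIR="${BASE_DIR:-/sftp1/data}"
DAYS_OLD="${DAYS_OLD:-180}"
VERBOSE="${VERBOSE:-1}"
DELETE_MODE="${DELETE_MODE:-0}"

# Return success only for a non-empty string made up entirely of digits
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# A non-numeric DAYS_OLD would make find fail, so report it up front
if ! is_uint "$DAYS_OLD"; then
    echo "DAYS_OLD must be a non-negative integer, got: $DAYS_OLD" >&2
fi
```

With this variant, a one-off run such as DAYS_OLD=30 ./cleanup_files.sh would use a 30-day cutoff without changing the saved default.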

Output Information

Both scripts display:

  • Total files found matching age criteria
  • Total size in human-readable format (KB/MB/GB)
  • Sample of oldest files (when VERBOSE=1)
  • Per-folder breakdown
  • Overall summary
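
The human-readable conversion uses binary thresholds (1024, 1048576, and 1073741824 bytes). The scripts repeat that logic inline; the sketch below gathers it into a single function to show the rule on its own. human_size is a hypothetical name, not something the scripts define.

```shell
# Hypothetical helper mirroring the scripts' size formatting:
# bytes below 1024 are printed as-is, larger values with two decimals.
human_size() {
    local bytes=$1
    if [ "$bytes" -ge 1073741824 ]; then
        awk -v b="$bytes" 'BEGIN {printf "%.2fGB\n", b/1073741824}'
    elif [ "$bytes" -ge 1048576 ]; then
        awk -v b="$bytes" 'BEGIN {printf "%.2fMB\n", b/1048576}'
    elif [ "$bytes" -ge 1024 ]; then
        awk -v b="$bytes" 'BEGIN {printf "%.2fKB\n", b/1024}'
    else
        echo "${bytes}B"
    fi
}

human_size 512          # 512B
human_size 1536         # 1.50KB
human_size 1342177280   # 1.25GB
```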

File Handling

Supports:

  • Files with spaces in names
  • Special characters in filenames (names containing a newline are still deleted, but can be miscounted in the summary)
  • Nested subdirectories
  • Human-readable size formats

⚠️ Important Notes:

  • Uses file modification time (mtime)
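  • -mtime +N matches files last modified more than N whole 24-hour periods ago; fractions of a day are dropped, so -mtime +180 matches files that are at least 181 days old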
  • Always test first with DELETE_MODE=0
  • Check logs regularly when running via cron
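
The way find rounds file age is easy to check in a scratch directory. The sketch below assumes GNU touch (for -d with relative times) and GNU find (for -printf); the file names are made up for the demonstration.

```shell
# Demonstrates how find truncates file age to whole 24-hour periods.
# Assumes GNU touch (-d with relative times) and GNU find (-printf).
demo_dir=$(mktemp -d)
touch -d '4332 hours ago' "$demo_dir/age_180d_12h.csv"   # 180.5 days old
touch -d '4345 hours ago' "$demo_dir/age_181d_1h.csv"    # just over 181 days old
touch "$demo_dir/age_0d.csv"                             # modified now

# Only the file that is at least 181 full days old matches -mtime +180
matched=$(find "$demo_dir" -type f -mtime +180 -printf '%f\n')
echo "$matched"   # age_181d_1h.csv

rm -rf "$demo_dir"
```

The file that is 180.5 days old is kept because its age is truncated to 180 periods, which is not greater than 180.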

Troubleshooting

No files found but expecting some?

  • Check directory paths are correct
  • Verify file modification dates, oldest first: ls -ltr /sftp1/data/csv/ | head
  • Ensure script has read permissions

Permission denied errors?

  • Run with appropriate user permissions
  • Check directory and file ownership

Size showing 0B?

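  • The TOTAL line reports 0 files, 0B when nothing matched; files may be too new (within the retention period)
  • Matching files may themselves be empty (0 bytes)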
  • Verify with: find /sftp1/data/csv -type f -mtime +180 -ls

Best Practices

  1. Always test first - Run with DELETE_MODE=0 before enabling deletion
  2. Review logs regularly - Check cron job logs for issues
  3. Keep backups - Ensure important data is backed up elsewhere
  4. Monitor disk space - Verify cleanup is freeing expected space
  5. Document retention policy - Know why you chose 6 months or 2 weeks
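
For the backup practice, one possible approach is to archive exactly the files that the cleanup would remove before enabling deletion. This is a sketch under assumptions: it relies on GNU find and GNU tar, archive_old_files is a hypothetical helper, and the /backup path in the example is a placeholder.

```shell
# Hypothetical helper: write every file under $1 matching -mtime +$2 into
# the gzip-compressed tar archive $3.
# Assumes GNU tar: --null with -T - reads NUL-separated names from stdin,
# so names containing spaces or newlines are handled.
archive_old_files() {
    local dir=$1 days=$2 archive=$3
    find "$dir" -type f -mtime +"$days" -print0 |
        tar -czf "$archive" --null -T -
}

# Example (placeholder destination):
# archive_old_files /sftp1/data/csv 180 /backup/csv-old.tar.gz
```

Listing the archive with tar -tzf before switching to DELETE_MODE=1 confirms that it contains the expected files.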

Recovery

If files are accidentally deleted:

  • Check system backups immediately
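  • Stop further write operations to the affected filesystem (remount it read-only if possible) so that deleted blocks are not overwritten
  • Use recovery tools: testdisk, photorec, extundelete (ext3/ext4 filesystems only)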
  • Contact system administrator

License

Internal use only.
