Delete Files Larger Than 100MB: Safe Disk Space Management with Find

Learn how to safely find and delete large files using the find command with -exec. Master disk space management techniques and avoid accidental data loss.

Know More Team
January 27, 2025
3 min read
LinuxFile ManagementDisk SpaceFind CommandSystem Administration

Delete Files Larger Than 100MB: Safe Disk Space Management with Find

Disk space management is a critical aspect of Linux system administration. When systems run low on storage, administrators need to identify and remove large files to free up space. The find command with -exec provides a powerful and flexible way to locate and delete files based on size criteria. However, this operation requires careful consideration to avoid accidental data loss.

Understanding File Size Management

Why Large File Cleanup is Important

Large files can accumulate over time and cause several issues:

  • Disk space exhaustion - System becomes unresponsive when disk is full
  • Performance degradation - File system operations slow down
  • Backup failures - Large files can cause backup timeouts
  • Application crashes - Services may fail when they can't write to disk
  • Log file growth - Application logs can grow indefinitely

Common Locations for Large Files

Large files typically accumulate in:

  • /var/log/ - Application and system logs
  • /tmp/ - Temporary files
  • /var/cache/ - Application cache files
  • /home/ - User files and downloads
  • /opt/ - Third-party applications
  • /usr/local/ - Locally installed software

Basic File Deletion with Find

Simple Find and Delete Command

# Find and delete files larger than 100MB
find /path/to/directory -type f -size +100M -exec rm -f {} \;

Command Breakdown

  • find /path/to/directory - Start search from specified directory
  • -type f - Only look for files (not directories)
  • -size +100M - Find files larger than 100MB
  • -exec rm -f {} \; - Execute rm -f on each found file
    • {} is replaced by the file path
    • \; ends the command

Example Usage

# Delete large files in /var/log
find /var/log -type f -size +100M -exec rm -f {} \;

# Delete large files in user home directories
find /home -type f -size +100M -exec rm -f {} \;

# Delete large files in current directory
find . -type f -size +100M -exec rm -f {} \;

Safe File Deletion Practices

1. Preview Without Deleting (Dry Run)

Always preview files before deletion:

# List files larger than 100MB without deleting
find /path/to/directory -type f -size +100M -exec ls -lh {} \;

# Example output:
# -rw-r--r-- 1 root root 104M Jan 15 10:30 /var/log/large.log
# -rw-r--r-- 1 root root 209M Jan 15 10:30 /var/log/huge.log

2. Use -ok for Confirmation

Interactive confirmation for each file:

# Ask for confirmation before deleting each file
find /var/log -type f -size +100M -ok rm {} \;

# Example interaction:
# < rm ... /var/log/large.log > ? y
# < rm ... /var/log/huge.log > ? n

3. Log Deletions

Keep a record of deleted files:

# Log deletions to a file
find /var/log -type f -size +100M -exec rm {} \; -print > deleted_files.log

# Or use tee to see output and log it
find /var/log -type f -size +100M -exec rm {} \; -print | tee deleted_files.log

4. Exclude Important Directories

Protect critical directories:

# Exclude important directories
find / -type f -size +100M -not -path "/home/*" -not -path "/opt/*" -exec rm {} \;

# Exclude specific file types
find /var/log -type f -size +100M -not -name "*.conf" -exec rm {} \;

Advanced File Deletion Techniques

Using xargs for Better Performance

# More efficient for large numbers of files
find /var/log -type f -size +100M -print0 | xargs -0 rm -f

# With confirmation
find /var/log -type f -size +100M -print0 | xargs -0 -p rm -f

Conditional Deletion Based on Age

# Delete files larger than 100MB and older than 30 days
find /var/log -type f -size +100M -mtime +30 -exec rm {} \;

# Delete files larger than 100MB and older than 7 days
find /var/log -type f -size +100M -mtime +7 -exec rm {} \;

Delete by File Extension

# Delete large log files
find /var/log -type f -size +100M -name "*.log" -exec rm {} \;

# Delete large temporary files
find /tmp -type f -size +100M -name "*.tmp" -exec rm {} \;

Comprehensive Disk Cleanup Script

Advanced Cleanup Script

#!/bin/bash
# disk_cleanup.sh

# Configuration
LOG_FILE="/var/log/disk_cleanup.log"
DRY_RUN=false
SIZE_THRESHOLD="100M"
DAYS_THRESHOLD=30

# Function to log actions
log_action() {
    echo "[$(date)] $1" | tee -a "$LOG_FILE"
}

# Function to check disk usage
check_disk_usage() {
    local path="$1"
    local usage=$(df "$path" | awk 'NR==2 {print $5}' | sed 's/%//')
    echo "$usage"
}

# Function to find large files
find_large_files() {
    local path="$1"
    local size="$2"
    
    log_action "Searching for files larger than $size in $path"
    
    if [ "$DRY_RUN" = true ]; then
        find "$path" -type f -size +"$size" -exec ls -lh {} \;
    else
        find "$path" -type f -size +"$size" -exec rm {} \; -print
    fi
}

# Function to clean log files
clean_log_files() {
    local log_dir="$1"
    local size="$2"
    local days="$3"
    
    log_action "Cleaning log files in $log_dir"
    
    # Find and delete large log files
    find "$log_dir" -type f -size +"$size" -name "*.log" -mtime +"$days" -exec rm {} \; -print
    
    # Find and delete old log files
    find "$log_dir" -type f -name "*.log.*" -mtime +"$days" -exec rm {} \; -print
    
    # Find and delete old compressed logs
    find "$log_dir" -type f \( -name "*.gz" -o -name "*.bz2" -o -name "*.xz" \) -mtime +"$days" -exec rm {} \; -print
}

# Function to clean temporary files
clean_temp_files() {
    local temp_dir="$1"
    local size="$2"
    
    log_action "Cleaning temporary files in $temp_dir"
    
    # Find and delete large temporary files
    find "$temp_dir" -type f -size +"$size" -mtime +7 -exec rm {} \; -print
    
    # Find and delete old temporary files
    find "$temp_dir" -type f -mtime +7 -exec rm {} \; -print
}

# Function to clean cache files
clean_cache_files() {
    local cache_dir="$1"
    local size="$2"
    
    log_action "Cleaning cache files in $cache_dir"
    
    # Find and delete large cache files
    find "$cache_dir" -type f -size +"$size" -mtime +30 -exec rm {} \; -print
}

# Function to generate cleanup report
generate_report() {
    local report_file="/tmp/disk_cleanup_report_$(date +%Y%m%d_%H%M%S).txt"
    
    echo "Disk Cleanup Report - $(date)" > "$report_file"
    echo "=================================" >> "$report_file"
    echo "" >> "$report_file"
    
    # Disk usage before cleanup
    echo "Disk Usage Before Cleanup:" >> "$report_file"
    df -h >> "$report_file"
    echo "" >> "$report_file"
    
    # Large files found
    echo "Large Files Found:" >> "$report_file"
    find / -type f -size +"$SIZE_THRESHOLD" 2>/dev/null | head -20 >> "$report_file"
    echo "" >> "$report_file"
    
    # Top directories by size
    echo "Top Directories by Size:" >> "$report_file"
    du -h / 2>/dev/null | sort -hr | head -10 >> "$report_file"
    
    log_action "Cleanup report generated: $report_file"
}

# Main cleanup function
main() {
    log_action "Starting disk cleanup process"
    
    # Check if running as root
    if [ "$EUID" -ne 0 ]; then
        log_action "ERROR: This script must be run as root"
        exit 1
    fi
    
    # Generate initial report
    generate_report
    
    # Clean different directories
    clean_log_files "/var/log" "$SIZE_THRESHOLD" "$DAYS_THRESHOLD"
    clean_temp_files "/tmp" "$SIZE_THRESHOLD"
    clean_cache_files "/var/cache" "$SIZE_THRESHOLD"
    
    # Find and delete large files in specific locations
    find_large_files "/var/log" "$SIZE_THRESHOLD"
    find_large_files "/tmp" "$SIZE_THRESHOLD"
    find_large_files "/var/cache" "$SIZE_THRESHOLD"
    
    # Generate final report
    generate_report
    
    log_action "Disk cleanup process completed"
}

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --dry-run)
            DRY_RUN=true
            shift
            ;;
        --size)
            SIZE_THRESHOLD="$2"
            shift 2
            ;;
        --days)
            DAYS_THRESHOLD="$2"
            shift 2
            ;;
        --help)
            echo "Usage: $0 [--dry-run] [--size SIZE] [--days DAYS]"
            echo "  --dry-run    Show what would be deleted without actually deleting"
            echo "  --size       Size threshold (default: 100M)"
            echo "  --days       Age threshold in days (default: 30)"
            exit 0
            ;;
        *)
            echo "Unknown option: $1"
            exit 1
            ;;
    esac
done

# Run main function
main

File Size Units and Examples

Size Unit Options

# Different size units
find /var/log -type f -size +100M    # Megabytes
find /var/log -type f -size +1G      # Gigabytes
find /var/log -type f -size +100k    # Kilobytes
find /var/log -type f -size +100c    # Bytes

Size Comparison Examples

# Files larger than 100MB
find /var/log -type f -size +100M -exec ls -lh {} \;

# Files between 50MB and 100MB
find /var/log -type f -size +50M -size -100M -exec ls -lh {} \;

# Files exactly 100MB
find /var/log -type f -size 100M -exec ls -lh {} \;

Safety and Best Practices

1. Always Test First

# Test with -exec ls -lh to see what would be deleted
find /var/log -type f -size +100M -exec ls -lh {} \;

# Test with -ok for interactive confirmation
find /var/log -type f -size +100M -ok rm {} \;

2. Use Specific Paths

# Good: Specific directory
find /var/log -type f -size +100M -exec rm {} \;

# Bad: Root directory (too broad)
find / -type f -size +100M -exec rm {} \;

3. Exclude Important Files

# Exclude configuration files
find /var/log -type f -size +100M -not -name "*.conf" -exec rm {} \;

# Exclude specific directories
find /var -type f -size +100M -not -path "/var/lib/*" -exec rm {} \;

4. Monitor Disk Usage

# Check disk usage before and after
df -h

# Monitor specific directory
du -sh /var/log

Common Pitfalls and Solutions

1. Accidental Deletion

Problem: Deleting important files Solution: Always test with -exec ls -lh first and use -ok for confirmation

2. Permission Issues

Problem: Permission denied errors Solution: Run with appropriate permissions or use sudo

# Run with sudo
sudo find /var/log -type f -size +100M -exec rm {} \;

Problem: Deleting symbolic links instead of targets Solution: Use -type f to avoid directories and links

4. Network Filesystems

Problem: Slow performance on network filesystems Solution: Use -maxdepth to limit search depth

# Limit search depth
find /var/log -maxdepth 2 -type f -size +100M -exec rm {} \;

Conclusion

File deletion using find with -exec is a powerful tool for disk space management, but it requires careful handling to avoid data loss. Key principles include:

  1. Always test first - Use -exec ls -lh to preview files before deletion
  2. Use confirmation - Use -ok for interactive confirmation
  3. Be specific - Target specific directories and file types
  4. Log actions - Keep records of what was deleted
  5. Monitor results - Check disk usage before and after cleanup

Table of Contents

Navigate the scroll
Reading Progress