Delete Files Larger Than 100MB: Safe Disk Space Management with Find
Learn how to safely find and delete large files using the find command with -exec. Master disk space management techniques and avoid accidental data loss.
Delete Files Larger Than 100MB: Safe Disk Space Management with Find
Disk space management is a critical aspect of Linux system administration. When systems run low on storage, administrators need to identify and remove large files to free up space. The find command with -exec provides a powerful and flexible way to locate and delete files based on size criteria. However, this operation requires careful consideration to avoid accidental data loss.
Understanding File Size Management
Why Large File Cleanup is Important
Large files can accumulate over time and cause several issues:
- Disk space exhaustion - System becomes unresponsive when disk is full
- Performance degradation - File system operations slow down
- Backup failures - Large files can cause backup timeouts
- Application crashes - Services may fail when they can't write to disk
- Log file growth - Application logs can grow indefinitely
Common Locations for Large Files
Large files typically accumulate in:
/var/log/- Application and system logs/tmp/- Temporary files/var/cache/- Application cache files/home/- User files and downloads/opt/- Third-party applications/usr/local/- Locally installed software
Basic File Deletion with Find
Simple Find and Delete Command
# Find and delete files larger than 100MB
find /path/to/directory -type f -size +100M -exec rm -f {} \;
Command Breakdown
find /path/to/directory- Start search from specified directory-type f- Only look for files (not directories)-size +100M- Find files larger than 100MB-exec rm -f {} \;- Executerm -fon each found file{}is replaced by the file path\;ends the command
Example Usage
# Delete large files in /var/log
find /var/log -type f -size +100M -exec rm -f {} \;
# Delete large files in user home directories
find /home -type f -size +100M -exec rm -f {} \;
# Delete large files in current directory
find . -type f -size +100M -exec rm -f {} \;
Safe File Deletion Practices
1. Preview Without Deleting (Dry Run)
Always preview files before deletion:
# List files larger than 100MB without deleting
find /path/to/directory -type f -size +100M -exec ls -lh {} \;
# Example output:
# -rw-r--r-- 1 root root 104M Jan 15 10:30 /var/log/large.log
# -rw-r--r-- 1 root root 209M Jan 15 10:30 /var/log/huge.log
2. Use -ok for Confirmation
Interactive confirmation for each file:
# Ask for confirmation before deleting each file
find /var/log -type f -size +100M -ok rm {} \;
# Example interaction:
# < rm ... /var/log/large.log > ? y
# < rm ... /var/log/huge.log > ? n
3. Log Deletions
Keep a record of deleted files:
# Log deletions to a file
find /var/log -type f -size +100M -exec rm {} \; -print > deleted_files.log
# Or use tee to see output and log it
find /var/log -type f -size +100M -exec rm {} \; -print | tee deleted_files.log
4. Exclude Important Directories
Protect critical directories:
# Exclude important directories
find / -type f -size +100M -not -path "/home/*" -not -path "/opt/*" -exec rm {} \;
# Exclude specific file types
find /var/log -type f -size +100M -not -name "*.conf" -exec rm {} \;
Advanced File Deletion Techniques
Using xargs for Better Performance
# More efficient for large numbers of files
find /var/log -type f -size +100M -print0 | xargs -0 rm -f
# With confirmation
find /var/log -type f -size +100M -print0 | xargs -0 -p rm -f
Conditional Deletion Based on Age
# Delete files larger than 100MB and older than 30 days
find /var/log -type f -size +100M -mtime +30 -exec rm {} \;
# Delete files larger than 100MB and older than 7 days
find /var/log -type f -size +100M -mtime +7 -exec rm {} \;
Delete by File Extension
# Delete large log files
find /var/log -type f -size +100M -name "*.log" -exec rm {} \;
# Delete large temporary files
find /tmp -type f -size +100M -name "*.tmp" -exec rm {} \;
Comprehensive Disk Cleanup Script
Advanced Cleanup Script
#!/bin/bash
# disk_cleanup.sh
# Configuration
LOG_FILE="/var/log/disk_cleanup.log"
DRY_RUN=false
SIZE_THRESHOLD="100M"
DAYS_THRESHOLD=30
# Function to log actions
log_action() {
echo "[$(date)] $1" | tee -a "$LOG_FILE"
}
# Function to check disk usage
check_disk_usage() {
local path="$1"
local usage=$(df "$path" | awk 'NR==2 {print $5}' | sed 's/%//')
echo "$usage"
}
# Function to find large files
find_large_files() {
local path="$1"
local size="$2"
log_action "Searching for files larger than $size in $path"
if [ "$DRY_RUN" = true ]; then
find "$path" -type f -size +"$size" -exec ls -lh {} \;
else
find "$path" -type f -size +"$size" -exec rm {} \; -print
fi
}
# Function to clean log files
clean_log_files() {
local log_dir="$1"
local size="$2"
local days="$3"
log_action "Cleaning log files in $log_dir"
# Find and delete large log files
find "$log_dir" -type f -size +"$size" -name "*.log" -mtime +"$days" -exec rm {} \; -print
# Find and delete old log files
find "$log_dir" -type f -name "*.log.*" -mtime +"$days" -exec rm {} \; -print
# Find and delete old compressed logs
find "$log_dir" -type f \( -name "*.gz" -o -name "*.bz2" -o -name "*.xz" \) -mtime +"$days" -exec rm {} \; -print
}
# Function to clean temporary files
clean_temp_files() {
local temp_dir="$1"
local size="$2"
log_action "Cleaning temporary files in $temp_dir"
# Find and delete large temporary files
find "$temp_dir" -type f -size +"$size" -mtime +7 -exec rm {} \; -print
# Find and delete old temporary files
find "$temp_dir" -type f -mtime +7 -exec rm {} \; -print
}
# Function to clean cache files
clean_cache_files() {
local cache_dir="$1"
local size="$2"
log_action "Cleaning cache files in $cache_dir"
# Find and delete large cache files
find "$cache_dir" -type f -size +"$size" -mtime +30 -exec rm {} \; -print
}
# Function to generate cleanup report
generate_report() {
local report_file="/tmp/disk_cleanup_report_$(date +%Y%m%d_%H%M%S).txt"
echo "Disk Cleanup Report - $(date)" > "$report_file"
echo "=================================" >> "$report_file"
echo "" >> "$report_file"
# Disk usage before cleanup
echo "Disk Usage Before Cleanup:" >> "$report_file"
df -h >> "$report_file"
echo "" >> "$report_file"
# Large files found
echo "Large Files Found:" >> "$report_file"
find / -type f -size +"$SIZE_THRESHOLD" 2>/dev/null | head -20 >> "$report_file"
echo "" >> "$report_file"
# Top directories by size
echo "Top Directories by Size:" >> "$report_file"
du -h / 2>/dev/null | sort -hr | head -10 >> "$report_file"
log_action "Cleanup report generated: $report_file"
}
# Main cleanup function
main() {
log_action "Starting disk cleanup process"
# Check if running as root
if [ "$EUID" -ne 0 ]; then
log_action "ERROR: This script must be run as root"
exit 1
fi
# Generate initial report
generate_report
# Clean different directories
clean_log_files "/var/log" "$SIZE_THRESHOLD" "$DAYS_THRESHOLD"
clean_temp_files "/tmp" "$SIZE_THRESHOLD"
clean_cache_files "/var/cache" "$SIZE_THRESHOLD"
# Find and delete large files in specific locations
find_large_files "/var/log" "$SIZE_THRESHOLD"
find_large_files "/tmp" "$SIZE_THRESHOLD"
find_large_files "/var/cache" "$SIZE_THRESHOLD"
# Generate final report
generate_report
log_action "Disk cleanup process completed"
}
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
--dry-run)
DRY_RUN=true
shift
;;
--size)
SIZE_THRESHOLD="$2"
shift 2
;;
--days)
DAYS_THRESHOLD="$2"
shift 2
;;
--help)
echo "Usage: $0 [--dry-run] [--size SIZE] [--days DAYS]"
echo " --dry-run Show what would be deleted without actually deleting"
echo " --size Size threshold (default: 100M)"
echo " --days Age threshold in days (default: 30)"
exit 0
;;
*)
echo "Unknown option: $1"
exit 1
;;
esac
done
# Run main function
main
File Size Units and Examples
Size Unit Options
# Different size units
find /var/log -type f -size +100M # Megabytes
find /var/log -type f -size +1G # Gigabytes
find /var/log -type f -size +100k # Kilobytes
find /var/log -type f -size +100c # Bytes
Size Comparison Examples
# Files larger than 100MB
find /var/log -type f -size +100M -exec ls -lh {} \;
# Files between 50MB and 100MB
find /var/log -type f -size +50M -size -100M -exec ls -lh {} \;
# Files exactly 100MB
find /var/log -type f -size 100M -exec ls -lh {} \;
Safety and Best Practices
1. Always Test First
# Test with -exec ls -lh to see what would be deleted
find /var/log -type f -size +100M -exec ls -lh {} \;
# Test with -ok for interactive confirmation
find /var/log -type f -size +100M -ok rm {} \;
2. Use Specific Paths
# Good: Specific directory
find /var/log -type f -size +100M -exec rm {} \;
# Bad: Root directory (too broad)
find / -type f -size +100M -exec rm {} \;
3. Exclude Important Files
# Exclude configuration files
find /var/log -type f -size +100M -not -name "*.conf" -exec rm {} \;
# Exclude specific directories
find /var -type f -size +100M -not -path "/var/lib/*" -exec rm {} \;
4. Monitor Disk Usage
# Check disk usage before and after
df -h
# Monitor specific directory
du -sh /var/log
Common Pitfalls and Solutions
1. Accidental Deletion
Problem: Deleting important files
Solution: Always test with -exec ls -lh first and use -ok for confirmation
2. Permission Issues
Problem: Permission denied errors
Solution: Run with appropriate permissions or use sudo
# Run with sudo
sudo find /var/log -type f -size +100M -exec rm {} \;
3. Symbolic Links
Problem: Deleting symbolic links instead of targets
Solution: Use -type f to avoid directories and links
4. Network Filesystems
Problem: Slow performance on network filesystems
Solution: Use -maxdepth to limit search depth
# Limit search depth
find /var/log -maxdepth 2 -type f -size +100M -exec rm {} \;
Conclusion
File deletion using find with -exec is a powerful tool for disk space management, but it requires careful handling to avoid data loss. Key principles include:
- Always test first - Use
-exec ls -lhto preview files before deletion - Use confirmation - Use
-okfor interactive confirmation - Be specific - Target specific directories and file types
- Log actions - Keep records of what was deleted
- Monitor results - Check disk usage before and after cleanup