/var Directory Full: Emergency Disk Space Recovery Guide
Learn how to quickly diagnose and resolve /var directory space issues. Discover the most common causes and effective cleanup strategies to prevent system failures.
It's 3 AM, and your monitoring alerts are screaming that /var is 90% full. Your heart races as you realize this could bring down your entire system. The /var directory is critical—it contains logs, caches, spools, and runtime data that your system needs to function. When it fills up, services can crash, applications can fail, and your system can become unstable. But don't panic! With the right approach, you can quickly identify the culprit and free up space before disaster strikes.
Understanding the /var Directory
What Makes /var Critical
The /var directory is where your system stores variable data that changes during normal operation:
- /var/log - System and application logs
- /var/cache - Package manager caches and temporary data
- /var/lib - Application state and databases
- /var/spool - Mail queues and print jobs
- /var/tmp - Temporary files that persist across reboots
Common Causes of /var Space Issues
- Log file accumulation - Unrotated logs growing indefinitely
- Docker artifacts - Unused containers, images, and volumes
- Package caches - Accumulated package manager data
- Application data - Databases and application state files
- Email spools - Undelivered mail accumulating
- Crash dumps - System crash reports and core dumps
Emergency Response: Quick Diagnosis
Step 1: Identify the Space Hog
First, find out what's consuming the most space:
# Get a quick overview of /var usage
sudo du -sh /var/* | sort -hr | head -10
# More detailed analysis
sudo du -sh /var/*/* | sort -hr | head -20
# Check total /var usage
df -h /var
Example output:
2.1G /var/log
1.8G /var/lib/docker
800M /var/cache
400M /var/lib
200M /var/spool
Step 2: Check System Logs
Log files are often the biggest culprits:
# Check log directory usage
sudo du -sh /var/log/* | sort -hr
# Look for large log files
sudo find /var/log -type f -size +100M -exec ls -lh {} \;
# Check journald usage
sudo journalctl --disk-usage
Step 3: Identify Specific Problems
# Check for rotated logs that weren't cleaned up
ls -la /var/log/*.gz /var/log/*.[0-9]
# Check Docker usage
docker system df
# Check package cache
du -sh /var/cache/apt /var/cache/yum 2>/dev/null
Targeted Cleanup Strategies
Strategy 1: Clean Log Files
System Logs
# Clean journald logs (keep last 200MB)
sudo journalctl --vacuum-size=200M
# Remove old rotated logs
sudo rm -rf /var/log/*.gz /var/log/*.[0-9]
# Truncate large active log files
sudo truncate -s 0 /var/log/syslog
sudo truncate -s 0 /var/log/kern.log
Application Logs
# Find and clean application logs
sudo find /var/log -name "*.log" -size +100M -exec truncate -s 0 {} \;
# Clean specific application logs
sudo truncate -s 0 /var/log/nginx/access.log
sudo truncate -s 0 /var/log/apache2/access.log
Strategy 2: Clean Package Caches
Debian/Ubuntu Systems
# Clean apt cache
sudo apt clean
sudo apt autoclean
sudo apt autoremove
# Remove orphaned packages
sudo apt autoremove --purge
RHEL/CentOS Systems
# Clean yum cache
sudo yum clean all
sudo yum autoremove
# Clean dnf cache (newer systems)
sudo dnf clean all
sudo dnf autoremove
Strategy 3: Clean Docker Artifacts
# Check Docker disk usage
docker system df
# Remove unused containers, networks, images
docker system prune -a
# Remove unused volumes (be careful!)
docker volume prune
# Remove specific large images
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" | sort -k3 -hr
docker rmi <image-id>
Strategy 4: Clean Application Data
# Check for large application directories
sudo du -sh /var/lib/* | sort -hr
# Clean apt metadata and cached packages
# (run `sudo apt update` afterwards to rebuild the package lists)
sudo rm -rf /var/lib/apt/lists/*
sudo rm -rf /var/cache/apt/archives/*
# Clean temporary files (make sure nothing is actively using /var/tmp first)
sudo rm -rf /var/tmp/*
Advanced Cleanup Techniques
Automated Log Rotation
Set up proper log rotation to prevent future issues:
# Edit logrotate configuration
sudo nano /etc/logrotate.conf
# Add custom log rotation rules
sudo nano /etc/logrotate.d/custom
Example logrotate configuration:
/var/log/myapp/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 644 www-data www-data
    postrotate
        systemctl reload myapp
    endscript
}
Monitoring and Alerts
Set up proactive monitoring:
# Install disk usage monitoring tools
sudo apt install ncdu duf
# Use ncdu for interactive disk usage analysis
sudo ncdu /var
# Set up automated cleanup script
sudo tee /usr/local/bin/cleanup-var.sh > /dev/null << 'EOF'
#!/bin/bash
# Cleanup script for /var directory
# Clean journald logs
journalctl --vacuum-size=200M
# Clean package caches
apt clean 2>/dev/null || yum clean all 2>/dev/null
# Clean old log files
find /var/log -name "*.gz" -mtime +30 -delete
find /var/log -name "*.[0-9]" -mtime +30 -delete
# Clean Docker if available
if command -v docker &> /dev/null; then
docker system prune -f
fi
echo "Cleanup completed at $(date)"
EOF
sudo chmod +x /usr/local/bin/cleanup-var.sh
Scheduled Cleanup
# Add to crontab for weekly cleanup
sudo crontab -e
# Add this line for weekly cleanup at 2 AM
0 2 * * 0 /usr/local/bin/cleanup-var.sh
Prevention Strategies
1. Implement Log Rotation
# Configure system-wide log rotation
sudo nano /etc/logrotate.conf
# Set appropriate rotation policies
weekly
rotate 4
compress
delaycompress
missingok
notifempty
2. Set Up Monitoring
# Install monitoring tools
sudo apt install prometheus-node-exporter
# Configure disk usage alerts
# Set alerts for /var usage above 80%
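The 80% alert above can be implemented with a small script; this is a minimal sketch (the threshold and the alert hook are placeholders for your own monitoring stack):

```shell
#!/bin/bash
# Warn when /var usage crosses a threshold.
THRESHOLD=80
# df --output=pcent prints a header plus e.g. " 42%"; keep only the digits
USAGE=$(df --output=pcent /var | tail -1 | tr -dc '0-9')

if [ "$USAGE" -ge "$THRESHOLD" ]; then
    echo "WARNING: /var is at ${USAGE}% (threshold: ${THRESHOLD}%)" >&2
    # Hook your alerting here: mail, a Slack webhook, or a Prometheus textfile metric
fi
```

Run it from cron or a systemd timer so the check happens even when nobody is watching dashboards.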
3. Use External Log Storage
# Configure log forwarding to external systems
# - ELK Stack (Elasticsearch, Logstash, Kibana)
# - Splunk
# - CloudWatch Logs
# - Syslog servers
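As one concrete example, forwarding everything to a remote syslog server with rsyslog takes a single directive (the hostname and port below are placeholders):

```
# /etc/rsyslog.d/90-forward.conf
# Forward all messages to a remote collector; @@ = TCP, a single @ = UDP
*.* @@logserver.example.com:514
```

After adding the file, restart rsyslog with `sudo systemctl restart rsyslog`. With logs shipped off-box, local retention can be kept short.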
4. Implement Quotas
# Set up disk quotas for /var
sudo apt install quota
# Enable quotas on /var filesystem
sudo nano /etc/fstab
# Add usrquota,grpquota to /var mount options
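Concretely, the /etc/fstab entry for a dedicated /var filesystem would look like this (the UUID and filesystem type are placeholders for your system):

```
# /etc/fstab — enable user and group quotas on /var
UUID=xxxx-xxxx  /var  ext4  defaults,usrquota,grpquota  0  2

# then initialize and enable the quota files:
#   sudo mount -o remount /var
#   sudo quotacheck -cug /var
#   sudo quotaon /var
```

Note this only works if /var is its own filesystem; quotas cannot be applied to a subdirectory of the root filesystem.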
Real-World Scenarios
Scenario 1: Log File Explosion
Problem: Application logs growing to several GB
Solution:
# Find the culprit
sudo find /var/log -name "*.log" -size +1G
# Truncate large logs
sudo truncate -s 0 /var/log/large-app.log
# Set up proper log rotation
sudo nano /etc/logrotate.d/large-app
Scenario 2: Docker Image Accumulation
Problem: Docker images consuming 10GB+ in /var/lib/docker
Solution:
# Check Docker usage
docker system df
# Remove unused images
docker image prune -a
# Set up automated cleanup
# Set up automated cleanup (append to root's crontab; piping a bare echo would overwrite it)
( sudo crontab -l 2>/dev/null; echo "0 2 * * * docker system prune -f" ) | sudo crontab -
Scenario 3: Package Cache Buildup
Problem: Package manager cache consuming 5GB+
Solution:
# Clean package caches
sudo apt clean
sudo apt autoclean
# Set up automatic cleanup
sudo nano /etc/apt/apt.conf.d/99cleanup
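A minimal version of that file could look like the following (these APT::Periodic settings are read by apt's daily maintenance job; the values are suggestions, not requirements):

```
# /etc/apt/apt.conf.d/99cleanup
APT::Periodic::AutocleanInterval "7";   # run "apt autoclean" weekly
APT::Periodic::MaxAge "30";             # delete cached .debs older than 30 days
APT::Periodic::MaxSize "500";           # cap the archive cache at ~500 MB
```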
Best Practices for /var Management
1. Regular Monitoring
# Daily disk usage check
df -h /var
# Weekly detailed analysis
sudo du -sh /var/* | sort -hr
2. Proactive Cleanup
# Automated cleanup scripts
# Regular log rotation
# Package cache management
# Docker image cleanup
3. Proper Logging Configuration
# Configure applications for appropriate log levels
# Use structured logging
# Implement log aggregation
# Set up log retention policies
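For journald, a retention policy is a two-line config change (the size and age limits below are example values):

```
# /etc/systemd/journald.conf — cap journal size and age
[Journal]
SystemMaxUse=200M
MaxRetentionSec=1month
```

Apply it with `sudo systemctl restart systemd-journald`; journald will then vacuum old entries automatically instead of filling /var/log/journal.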
4. Capacity Planning
# Monitor growth trends
# Plan for increased storage needs
# Implement tiered storage
# Use external log storage
Common Pitfalls and Solutions
Pitfall 1: Deleting Active Log Files
Problem: Deleting log files that applications have open
Solution: Use truncate instead of rm for active log files
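A quick demonstration of why this matters: unlinking an open file does not free its blocks until the writing process closes it, while truncating frees the space immediately. This sketch uses a throwaway file under /tmp rather than a real log:

```shell
# A "writer" keeps the log open on file descriptor 3
exec 3>>/tmp/demo.log
echo "old log line" >&3

# rm would only unlink the name; the open descriptor still pins the disk blocks.
# truncate releases the space at once, and the writer keeps working:
truncate -s 0 /tmp/demo.log
echo "new log line" >&3   # appended at the new (empty) end of file

exec 3>&-                 # close the descriptor
cat /tmp/demo.log         # only the post-truncate line remains
```

On a live system, `sudo lsof +L1 /var` lists deleted-but-still-open files whose space has not actually been reclaimed.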
Pitfall 2: Over-aggressive Cleanup
Problem: Removing files that are still needed
Solution: Always backup important data before cleanup
Pitfall 3: Ignoring Root Causes
Problem: Only treating symptoms, not causes
Solution: Investigate why logs are growing so large
Pitfall 4: No Monitoring
Problem: Not knowing about space issues until it's too late
Solution: Set up proactive monitoring and alerts
Conclusion
A full /var directory is a common but serious issue that can bring down your system. The key is to act quickly and systematically:
- Diagnose first - Find out what's consuming the space
- Clean strategically - Target the biggest space consumers
- Prevent recurrence - Set up proper log rotation and monitoring
- Monitor proactively - Don't wait for the next crisis
Remember:
- Log files are usually the culprit - Check /var/log first
- Use truncate, not delete - For active log files
- Set up automation - Prevent future issues
- Monitor continuously - Don't wait for alerts