/var Directory Full: Emergency Disk Space Recovery Guide
Learn how to quickly diagnose and resolve /var directory space issues. Discover the most common causes and effective cleanup strategies to prevent system failures.
It's 3 AM, and your monitoring alerts are screaming that /var is 90% full. Your heart races as you realize this could bring down your entire system. The /var directory is critical—it contains logs, caches, spools, and runtime data that your system needs to function. When it fills up, services can crash, applications can fail, and your system can become unstable. But don't panic! With the right approach, you can quickly identify the culprit and free up space before disaster strikes.
Understanding the /var Directory
What Makes /var Critical
The /var directory is where your system stores variable data that changes during normal operation:
- /var/log - System and application logs
- /var/cache - Package manager caches and temporary data
- /var/lib - Application state and databases
- /var/spool - Mail queues and print jobs
- /var/tmp - Temporary files that persist across reboots
Common Causes of /var Space Issues
- Log file accumulation - Unrotated logs growing indefinitely
- Docker artifacts - Unused containers, images, and volumes
- Package caches - Accumulated package manager data
- Application data - Databases and application state files
- Email spools - Undelivered mail accumulating
- Crash dumps - System crash reports and core dumps
Emergency Response: Quick Diagnosis
Step 1: Identify the Space Hog
First, find out what's consuming the most space:
# Get a quick overview of /var usage
sudo du -sh /var/* | sort -hr | head -10
# More detailed analysis
sudo du -sh /var/*/* | sort -hr | head -20
# Check total /var usage
df -h /var
Example output:
2.1G /var/log
1.8G /var/lib/docker
800M /var/cache
400M /var/lib
200M /var/spool
Step 2: Check System Logs
Log files are often the biggest culprits:
# Check log directory usage
sudo du -sh /var/log/* | sort -hr
# Look for large log files
sudo find /var/log -type f -size +100M -exec ls -lh {} \;
# Check journald usage
sudo journalctl --disk-usage
Step 3: Identify Specific Problems
# Check for rotated logs that weren't cleaned up
ls -la /var/log/*.gz /var/log/*.[0-9]
# Check Docker usage
docker system df
# Check package cache
du -sh /var/cache/apt /var/cache/yum 2>/dev/null
Targeted Cleanup Strategies
Strategy 1: Clean Log Files
System Logs
# Clean journald logs (keep last 200MB)
sudo journalctl --vacuum-size=200M
# Remove old rotated logs
sudo rm -rf /var/log/*.gz /var/log/*.[0-9]
# Truncate large active log files
sudo truncate -s 0 /var/log/syslog
sudo truncate -s 0 /var/log/kern.log
Application Logs
# Find and clean application logs
sudo find /var/log -name "*.log" -size +100M -exec truncate -s 0 {} \;
# Clean specific application logs
sudo truncate -s 0 /var/log/nginx/access.log
sudo truncate -s 0 /var/log/apache2/access.log
Strategy 2: Clean Package Caches
Debian/Ubuntu Systems
# Clean apt cache
sudo apt clean
sudo apt autoclean
sudo apt autoremove
# Remove orphaned packages
sudo apt autoremove --purge
RHEL/CentOS Systems
# Clean yum cache
sudo yum clean all
sudo yum autoremove
# Clean dnf cache (newer systems)
sudo dnf clean all
sudo dnf autoremove
Strategy 3: Clean Docker Artifacts
# Check Docker disk usage
docker system df
# Remove unused containers, networks, images
docker system prune -a
# Remove unused volumes (be careful!)
docker volume prune
# Remove specific large images
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" | sort -k3 -hr
docker rmi <image-id>
Strategy 4: Clean Application Data
# Check for large application directories
sudo du -sh /var/lib/* | sort -hr
# Clean apt metadata and cached packages
# (run `sudo apt update` afterwards to rebuild the package lists)
sudo rm -rf /var/lib/apt/lists/*
sudo rm -rf /var/cache/apt/archives/*
# Clean temporary files (make sure nothing is actively using /var/tmp first)
sudo rm -rf /var/tmp/*
Advanced Cleanup Techniques
Automated Log Rotation
Set up proper log rotation to prevent future issues:
# Edit logrotate configuration
sudo nano /etc/logrotate.conf
# Add custom log rotation rules
sudo nano /etc/logrotate.d/custom
Example logrotate configuration:
/var/log/myapp/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 644 www-data www-data
    postrotate
        systemctl reload myapp
    endscript
}
Monitoring and Alerts
Set up proactive monitoring:
# Install disk usage monitoring tools
sudo apt install ncdu duf
# Use ncdu for interactive disk usage analysis
sudo ncdu /var
# Set up automated cleanup script
sudo tee /usr/local/bin/cleanup-var.sh > /dev/null << 'EOF'
#!/bin/bash
# Cleanup script for /var directory
# Clean journald logs
journalctl --vacuum-size=200M
# Clean package caches
apt clean 2>/dev/null || yum clean all 2>/dev/null
# Clean old log files
find /var/log -name "*.gz" -mtime +30 -delete
find /var/log -name "*.[0-9]" -mtime +30 -delete
# Clean Docker if available
if command -v docker &> /dev/null; then
docker system prune -f
fi
echo "Cleanup completed at $(date)"
EOF
sudo chmod +x /usr/local/bin/cleanup-var.sh
Scheduled Cleanup
# Add to crontab for weekly cleanup
sudo crontab -e
# Add this line for weekly cleanup at 2 AM
0 2 * * 0 /usr/local/bin/cleanup-var.sh
Prevention Strategies
1. Implement Log Rotation
# Configure system-wide log rotation
sudo nano /etc/logrotate.conf
# Set appropriate rotation policies
weekly
rotate 4
compress
delaycompress
missingok
notifempty
2. Set Up Monitoring
# Install monitoring tools
sudo apt install prometheus-node-exporter
# Configure disk usage alerts
# Set alerts for /var usage above 80%
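The 80% alert above can be implemented with a small script; this is a minimal sketch (the threshold and the alert hook are placeholders for your own monitoring stack):

```shell
#!/bin/bash
# Warn when /var usage crosses a threshold.
THRESHOLD=80
# df --output=pcent prints a header plus e.g. " 42%"; keep only the digits
USAGE=$(df --output=pcent /var | tail -1 | tr -dc '0-9')

if [ "$USAGE" -ge "$THRESHOLD" ]; then
    echo "WARNING: /var is at ${USAGE}% (threshold: ${THRESHOLD}%)" >&2
    # Hook your alerting here: mail, a Slack webhook, or a Prometheus textfile metric
fi
```

Run it from cron or a systemd timer so the check happens even when nobody is watching dashboards.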
3. Use External Log Storage
# Configure log forwarding to external systems
# - ELK Stack (Elasticsearch, Logstash, Kibana)
# - Splunk
# - CloudWatch Logs
# - Syslog servers
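As one concrete example, forwarding everything to a remote syslog server with rsyslog takes a single directive (the hostname and port below are placeholders):

```
# /etc/rsyslog.d/90-forward.conf
# Forward all messages to a remote collector; @@ = TCP, a single @ = UDP
*.* @@logserver.example.com:514
```

After adding the file, restart rsyslog with `sudo systemctl restart rsyslog`. With logs shipped off-box, local retention can be kept short.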
4. Implement Quotas
# Set up disk quotas for /var
sudo apt install quota
# Enable quotas on /var filesystem
sudo nano /etc/fstab
# Add usrquota,grpquota to /var mount options
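Concretely, the /etc/fstab entry for a dedicated /var filesystem would look like this (the UUID and filesystem type are placeholders for your system):

```
# /etc/fstab — enable user and group quotas on /var
UUID=xxxx-xxxx  /var  ext4  defaults,usrquota,grpquota  0  2

# then initialize and enable the quota files:
#   sudo mount -o remount /var
#   sudo quotacheck -cug /var
#   sudo quotaon /var
```

Note this only works if /var is its own filesystem; quotas cannot be applied to a subdirectory of the root filesystem.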
Real-World Scenarios
Scenario 1: Log File Explosion
Problem: Application logs growing to several GB
Solution:
# Find the culprit
sudo find /var/log -name "*.log" -size +1G
# Truncate large logs
sudo truncate -s 0 /var/log/large-app.log
# Set up proper log rotation
sudo nano /etc/logrotate.d/large-app
Scenario 2: Docker Image Accumulation
Problem: Docker images consuming 10GB+ in /var/lib/docker
Solution:
# Check Docker usage
docker system df
# Remove unused images
docker image prune -a
# Set up automated cleanup
# Set up automated cleanup (append to root's crontab; piping a bare echo would overwrite it)
( sudo crontab -l 2>/dev/null; echo "0 2 * * * docker system prune -f" ) | sudo crontab -
Scenario 3: Package Cache Buildup
Problem: Package manager cache consuming 5GB+
Solution:
# Clean package caches
sudo apt clean
sudo apt autoclean
# Set up automatic cleanup
sudo nano /etc/apt/apt.conf.d/99cleanup
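A minimal version of that file could look like the following (these APT::Periodic settings are read by apt's daily maintenance job; the values are suggestions, not requirements):

```
# /etc/apt/apt.conf.d/99cleanup
APT::Periodic::AutocleanInterval "7";   # run "apt autoclean" weekly
APT::Periodic::MaxAge "30";             # delete cached .debs older than 30 days
APT::Periodic::MaxSize "500";           # cap the archive cache at ~500 MB
```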
Best Practices for /var Management
1. Regular Monitoring
# Daily disk usage check
df -h /var
# Weekly detailed analysis
sudo du -sh /var/* | sort -hr
2. Proactive Cleanup
# Automated cleanup scripts
# Regular log rotation
# Package cache management
# Docker image cleanup
3. Proper Logging Configuration
# Configure applications for appropriate log levels
# Use structured logging
# Implement log aggregation
# Set up log retention policies
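For journald, a retention policy is a two-line config change (the size and age limits below are example values):

```
# /etc/systemd/journald.conf — cap journal size and age
[Journal]
SystemMaxUse=200M
MaxRetentionSec=1month
```

Apply it with `sudo systemctl restart systemd-journald`; journald will then vacuum old entries automatically instead of filling /var/log/journal.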
4. Capacity Planning
# Monitor growth trends
# Plan for increased storage needs
# Implement tiered storage
# Use external log storage
Common Pitfalls and Solutions
Pitfall 1: Deleting Active Log Files
Problem: Deleting log files that applications have open
Solution: Use truncate instead of rm for active log files
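A quick demonstration of why this matters: unlinking an open file does not free its blocks until the writing process closes it, while truncating frees the space immediately. This sketch uses a throwaway file under /tmp rather than a real log:

```shell
# A "writer" keeps the log open on file descriptor 3
exec 3>>/tmp/demo.log
echo "old log line" >&3

# rm would only unlink the name; the open descriptor still pins the disk blocks.
# truncate releases the space at once, and the writer keeps working:
truncate -s 0 /tmp/demo.log
echo "new log line" >&3   # appended at the new (empty) end of file

exec 3>&-                 # close the descriptor
cat /tmp/demo.log         # only the post-truncate line remains
```

On a live system, `sudo lsof +L1 /var` lists deleted-but-still-open files whose space has not actually been reclaimed.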
Pitfall 2: Over-aggressive Cleanup
Problem: Removing files that are still needed
Solution: Always backup important data before cleanup
Pitfall 3: Ignoring Root Causes
Problem: Only treating symptoms, not causes
Solution: Investigate why logs are growing so large
Pitfall 4: No Monitoring
Problem: Not knowing about space issues until it's too late
Solution: Set up proactive monitoring and alerts
Conclusion
A full /var directory is a common but serious issue that can bring down your system. The key is to act quickly and systematically:
- Diagnose first - Find out what's consuming the space
- Clean strategically - Target the biggest space consumers
- Prevent recurrence - Set up proper log rotation and monitoring
- Monitor proactively - Don't wait for the next crisis
Remember:
- Log files are usually the culprit - Check /var/log first
- Use truncate, not delete - For active log files
- Set up automation - Prevent future issues
- Monitor continuously - Don't wait for alerts