Remove First and Last Lines from a File: Text Manipulation with Sed and Awk

Learn how to remove the first and last lines from files using sed, awk, head, and tail commands. Master text manipulation techniques for log processing and data cleaning.

Know More Team
January 27, 2025
3 min read
Linux · Text Processing · Sed · Awk · File Manipulation · Log Processing

Text file manipulation is a common task in Linux system administration, log processing, and data cleaning. Removing the first and last lines from files is often necessary when processing log files, CSV data, or any text files that have headers, footers, or unwanted content at the beginning and end. Understanding different methods to accomplish this task helps you choose the most appropriate tool for your specific use case.

Understanding Text File Manipulation

Why Remove First and Last Lines

Common scenarios where you need to remove first and last lines:

  • Log file processing - Remove headers and footers from log files
  • CSV data cleaning - Remove column headers and summary rows
  • Data extraction - Extract only the data portion from formatted files
  • File preprocessing - Clean up files before further processing
  • Report generation - Remove template headers and footers

File Structure Considerations

Before removing lines, consider:

  • File size - Large files may require streaming tools
  • File format - Different formats may need different approaches
  • Line endings - Ensure consistent line ending handling
  • Backup requirements - Always backup important files before modification
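The checks above can be scripted before any destructive edit. A minimal sketch (the file path and contents are stand-ins for the demo):

```shell
# Illustrative pre-flight checks before trimming a file
f=/tmp/preflight_demo.txt
printf 'header\nrow1\nrow2\nfooter\n' > "$f"    # sample content for the demo

[ -s "$f" ] || { echo "file is empty" >&2; exit 1; }
[ "$(wc -l < "$f")" -gt 2 ] || { echo "too few lines to trim" >&2; exit 1; }
cp "$f" "$f.bak"                                # keep a backup before editing
echo "ready: $(wc -l < "$f") lines, backup at $f.bak"
```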

Basic Line Removal Methods

Using Sed (Stream Editor)

Simple Sed Command

# Remove first and last line
sed '1d; $d' filename.txt

Command Breakdown

  • sed - Stream editor used to process text line-by-line
  • '1d' - Deletes the first line (1 = line number)
  • '$d' - Deletes the last line ($ = end of file)
  • filename.txt - The file to be processed

Together, the command says: "Delete the first line and the last line from filename.txt."
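To see this in action on a throwaway sample (the path below is just for the demo):

```shell
# Build a four-line sample, then strip its first and last lines
printf 'header\ndata1\ndata2\nfooter\n' > /tmp/sed_demo.txt
sed '1d; $d' /tmp/sed_demo.txt
# Output:
# data1
# data2
```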

Sed with In-Place Editing

# Edit file in-place (backup original)
sed -i.bak '1d; $d' filename.txt

# Edit file in-place (no backup)
sed -i '1d; $d' filename.txt

Using Head and Tail

Head and Tail Combination

# Remove first and last line
head -n -1 filename.txt | tail -n +2

Command Breakdown

  • head -n -1 - Print all lines except the last one (negative counts are a GNU coreutils extension; BSD/macOS head lacks them)
  • | - Pipe output to next command
  • tail -n +2 - Print from line 2 to the end
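A quick sanity check of the pipeline with a generated sample, using seq as a numbered test input:

```shell
# seq prints lines 1 through 5; the pipeline should keep only 2..4
seq 5 | head -n -1 | tail -n +2
# Output:
# 2
# 3
# 4
```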

Alternative Head/Tail Approach

# Remove first and last line
tail -n +2 filename.txt | head -n -1

Using Awk

Awk Line Number Filtering

# Remove first and last line (buffer one line so the last is never printed)
awk 'NR>2 {print prev} {prev=$0}' filename.txt

Command Breakdown

  • {prev=$0} - Buffers the current line so it can be printed on the next pass
  • NR>2 {print prev} - Starting at line 3, print the previously buffered line; line 1 is therefore never printed, and the final line stays in the buffer
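A caveat worth verifying: NF in awk is the field count of the current line, not the total number of lines, so an NR<NF test does not reliably drop the last line. A buffered variant that prints each line one step late does, and seq makes it easy to check:

```shell
# Print lines 2..N-1: buffer each line and start flushing from line 3,
# so line 1 is skipped and the last line is never flushed
seq 5 | awk 'NR>2 {print prev} {prev=$0}'
# Output:
# 2
# 3
# 4
```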

Awk with Line Count

# Remove first and last line (two passes: the first pass counts total lines)
awk 'NR==FNR {total=NR; next} FNR>1 && FNR<total' filename.txt filename.txt

Practical Examples

Log File Processing

Remove Log Headers and Footers

# Remove first and last lines from log file
sed '1d; $d' /var/log/application.log

# Remove first 3 lines (headers) and last 2 lines (footers)
# (sed has no "count back from the end" address, so head trims the tail)
sed '1,3d' /var/log/application.log | head -n -2

Process Multiple Log Files

# Process all log files in directory
for file in /var/log/*.log; do
    sed '1d; $d' "$file" > "${file}.processed"
done

CSV Data Cleaning

Remove CSV Headers and Footers

# Remove header and footer from CSV
sed '1d; $d' data.csv > cleaned_data.csv

# Remove first 2 lines (headers) and last line (summary)
sed '1,2d; $d' data.csv > cleaned_data.csv

Process CSV with Awk

# Remove first and last line, process CSV
awk 'NR>2 {print prev} {prev=$0}' data.csv > cleaned_data.csv

Data Extraction

Extract Data from Reports

# Remove report headers (first 5 lines) and footers (last 4 lines)
sed '1,5d' report.txt | head -n -4 > data_only.txt

# Extract only data lines (skip first 10 and last 5; two-pass awk)
awk 'NR==FNR {total=NR; next} FNR>10 && FNR<=total-5' report.txt report.txt > data_only.txt

Advanced Text Manipulation

Multiple Line Removal

Remove First N and Last M Lines

# Remove first 3 and last 2 lines (sed trims the head, GNU head trims the tail)
sed '1,3d' filename.txt | head -n -2

# Remove first 2 and last 3 lines
sed '1,2d' filename.txt | head -n -3

Remove Lines Based on Content

# Remove lines containing specific text
sed '/^Header/d; /^Footer/d' filename.txt

# Remove empty lines and first/last
sed '/^$/d; 1d; $d' filename.txt

Conditional Line Removal

Remove Lines Based on Patterns

# Remove first and last line, plus lines containing "ERROR"
sed '1d; $d; /ERROR/d' filename.txt

# Remove first and last line, keep only lines with numbers
sed '1d; $d' filename.txt | grep '[0-9]'

Remove Lines Based on Line Numbers

# Remove first 5 and last 3 lines
sed '1,5d' filename.txt | head -n -3

# Remove every 10th line plus first and last (first~step is a GNU sed extension)
sed '1d; $d; 10~10d' filename.txt

Comprehensive Text Processing Script

Advanced File Cleaner

#!/bin/bash
# file_cleaner.sh

# Configuration
INPUT_FILE=""
OUTPUT_FILE=""
REMOVE_FIRST=1
REMOVE_LAST=1
REMOVE_EMPTY=false
REMOVE_PATTERN=""
BACKUP=true

# Function to show usage
show_usage() {
    echo "Usage: $0 [OPTIONS] INPUT_FILE"
    echo "Options:"
    echo "  -o OUTPUT_FILE    Output file (default: stdout)"
    echo "  -f FIRST_LINES    Number of first lines to remove (default: 1)"
    echo "  -l LAST_LINES     Number of last lines to remove (default: 1)"
    echo "  -e                Remove empty lines"
    echo "  -p PATTERN        Remove lines matching pattern"
    echo "  -n                No backup"
    echo "  -h                Show this help"
}

# Function to process file
process_file() {
    local input="$1"
    local output="$2"
    local first="$3"
    local last="$4"
    local empty="$5"
    local pattern="$6"
    
    # Build the sed command. Note: a literal last-line address must be
    # written \$d inside double quotes (an unescaped $d would expand as a
    # shell variable), and sed cannot address "N lines before the end",
    # so trailing lines are trimmed with GNU head -n -N instead.
    local sed_cmd=""

    # Remove first lines
    if [ "$first" -gt 0 ]; then
        sed_cmd="1,${first}d"
    fi

    # Remove empty lines
    if [ "$empty" = true ]; then
        sed_cmd="${sed_cmd:+${sed_cmd}; }/^\$/d"
    fi

    # Remove lines matching pattern
    if [ -n "$pattern" ]; then
        sed_cmd="${sed_cmd:+${sed_cmd}; }/${pattern}/d"
    fi

    # head -n -N drops the last N lines (-0 passes the file through);
    # an empty sed script also passes its input through unchanged
    if [ -n "$output" ]; then
        head -n -"$last" "$input" | sed "$sed_cmd" > "$output"
    else
        head -n -"$last" "$input" | sed "$sed_cmd"
    fi
}

# Function to create backup
create_backup() {
    local file="$1"
    local backup="${file}.backup.$(date +%Y%m%d_%H%M%S)"
    
    cp "$file" "$backup"
    echo "Backup created: $backup"
}

# Parse command line arguments
while getopts "o:f:l:ep:nh" opt; do
    case $opt in
        o)
            OUTPUT_FILE="$OPTARG"
            ;;
        f)
            REMOVE_FIRST="$OPTARG"
            ;;
        l)
            REMOVE_LAST="$OPTARG"
            ;;
        e)
            REMOVE_EMPTY=true
            ;;
        p)
            REMOVE_PATTERN="$OPTARG"
            ;;
        n)
            BACKUP=false
            ;;
        h)
            show_usage
            exit 0
            ;;
        \?)
            echo "Invalid option: -$OPTARG" >&2
            show_usage
            exit 1
            ;;
    esac
done

shift $((OPTIND-1))

# Check for input file
if [ $# -eq 0 ]; then
    echo "Error: Input file required" >&2
    show_usage
    exit 1
fi

INPUT_FILE="$1"

# Check if input file exists
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: Input file '$INPUT_FILE' not found" >&2
    exit 1
fi

# Create backup if requested
if [ "$BACKUP" = true ] && [ -n "$OUTPUT_FILE" ]; then
    create_backup "$INPUT_FILE"
fi

# Process file
echo "Processing file: $INPUT_FILE"
echo "Removing first $REMOVE_FIRST line(s) and last $REMOVE_LAST line(s)"

if [ "$REMOVE_EMPTY" = true ]; then
    echo "Removing empty lines"
fi

if [ -n "$REMOVE_PATTERN" ]; then
    echo "Removing lines matching pattern: $REMOVE_PATTERN"
fi

process_file "$INPUT_FILE" "$OUTPUT_FILE" "$REMOVE_FIRST" "$REMOVE_LAST" "$REMOVE_EMPTY" "$REMOVE_PATTERN"

if [ -n "$OUTPUT_FILE" ]; then
    echo "Output written to: $OUTPUT_FILE"
else
    echo "Output written to stdout"
fi

Performance Considerations

Large File Processing

Streaming Processing

# Process large files without loading into memory
sed '1d; $d' large_file.txt > processed_file.txt

# Use head and tail for very large files
head -n -1 large_file.txt | tail -n +2 > processed_file.txt

Memory-Efficient Processing

All of the approaches above already stream the file line by line rather than loading it into memory; even head -n -1 only needs to buffer a single line. Avoid splitting the file into chunks and trimming each chunk separately: that removes the first and last line of every chunk, not just the ends of the original file.

Performance Comparison

Benchmark Different Methods

# Time different approaches
time sed '1d; $d' large_file.txt > /dev/null
time head -n -1 large_file.txt | tail -n +2 > /dev/null
time awk 'NR>2 {print prev} {prev=$0}' large_file.txt > /dev/null

Best Practices

1. Always Backup Important Files

# Create backup before modification
cp important_file.txt important_file.txt.backup
sed '1d; $d' important_file.txt > temp_file.txt
mv temp_file.txt important_file.txt

2. Test Commands on Sample Data

# Test on small sample first
head -10 large_file.txt > sample.txt
sed '1d; $d' sample.txt

3. Use Appropriate Tools

# For simple line removal: sed
sed '1d; $d' file.txt

# For complex filtering: awk (two passes, so the last line is known)
awk 'NR==FNR {total=NR; next} FNR>1 && FNR<total && /pattern/' file.txt file.txt

# For large files: head/tail
head -n -1 file.txt | tail -n +2

4. Validate Results

# Check line counts before and after
wc -l original_file.txt
wc -l processed_file.txt

# Verify first and last lines
head -1 processed_file.txt
tail -1 processed_file.txt
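The checks above can be bundled into a small self-contained sanity test; the sample paths and contents below are illustrative:

```shell
# Trim a sample file, then assert the processed copy lost exactly 2 lines
printf 'head\na\nb\ntail\n' > /tmp/orig_demo.txt
sed '1d; $d' /tmp/orig_demo.txt > /tmp/proc_demo.txt

orig=$(wc -l < /tmp/orig_demo.txt)
proc=$(wc -l < /tmp/proc_demo.txt)
if [ "$proc" -eq $((orig - 2)) ]; then
    echo "line count OK"
else
    echo "unexpected line count: $proc" >&2
fi
```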

Common Pitfalls and Solutions

1. Empty Files

Problem: Commands can fail or misbehave on empty files. Solution: Check the file size before processing.

# Process only if the file is non-empty (-s tests for size > 0)
if [ -s "$file" ]; then
    sed '1d; $d' "$file"
else
    echo "File is empty"
fi

2. Single Line Files

Problem: Removing the first and last line from a file with two or fewer lines leaves empty output. Solution: Check the line count first.

# Check line count
line_count=$(wc -l < "$file")
if [ "$line_count" -gt 2 ]; then
    sed '1d; $d' "$file"
else
    echo "File has too few lines"
fi

3. Binary Files

Problem: Text processing tools may corrupt binary files. Solution: Check the file type before processing.

# Check if file is text
if file "$file" | grep -q "text"; then
    sed '1d; $d' "$file"
else
    echo "File is not a text file"
fi

Conclusion

Removing first and last lines from files is a common text manipulation task that can be accomplished using various tools. Each method has its advantages:

  • sed - Most flexible and commonly used
  • head and tail - Good for large files and streaming
  • awk - Best for complex filtering and processing

Key takeaways:

  • Choose the right tool - Different tools for different scenarios
  • Always backup important files - Prevent data loss
  • Test on sample data - Verify commands before processing large files
  • Consider performance - Use streaming tools for large files
  • Validate results - Check output to ensure correctness