Today I learned two Unix tools that make investigating and managing processes faster and safer: lsof, for finding which processes hold files open, and kill with named signals.

Using lsof to Scan Processes by Path

The lsof (list open files) command can identify processes that are using files in a specific directory path, which is invaluable for debugging and system maintenance.

Basic lsof Path Scanning:

Find Processes Using a Directory:

# See all processes accessing files anywhere under /var/log (+D recurses; slow on large trees)
lsof +D /var/log

# Faster: check only the top level of the directory (+d doesn't recurse)
lsof +d /var/log

# Find processes with files open in current directory
lsof +d .

# Find processes using any files under /home/user
lsof +D /home/user

Practical Use Cases:

# Before unmounting a filesystem
lsof +D /mnt/external-drive

# Debug why a directory can't be deleted
lsof +D /tmp/app-cache

# Find processes preventing package updates
lsof +D /usr/lib/myapp

# Identify processes holding log files open
lsof +D /var/log/myapp
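
For checks like these inside a script, lsof's -t flag prints bare PIDs instead of the full table. The following is a minimal sketch, assuming a hypothetical helper named pids_using_dir (the name and the return codes are illustrative, not part of lsof):

```shell
#!/bin/sh
# pids_using_dir DIR: print the PIDs of processes with files open under DIR.
# Hypothetical helper; it relies only on lsof's -t (terse) and +D options.
pids_using_dir() {
    scan_dir=$1
    if [ ! -d "$scan_dir" ]; then
        echo "pids_using_dir: not a directory: $scan_dir" >&2
        return 2
    fi
    if ! command -v lsof >/dev/null 2>&1; then
        echo "pids_using_dir: lsof is not installed" >&2
        return 127
    fi
    # -t prints one PID per line; warnings about unreadable entries are dropped.
    lsof -t +D "$scan_dir" 2>/dev/null
    # lsof exits non-zero when it finds nothing; treat that as "no PIDs".
    return 0
}

# Example: refuse to unmount while anything still uses the mount point.
# pids=$(pids_using_dir /mnt/external-drive)
# if [ -z "$pids" ]; then echo "safe to unmount"; else echo "busy: $pids"; fi
```

Empty output means nothing was found, so the caller only has to test whether the result is an empty string.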

Advanced lsof Usage:

Combine with Other Filters:

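# Files under a path opened by a specific user
# (-a ANDs the conditions; without it lsof ORs them)
lsof -a +D /home/user -u user

# Network connections of processes that have files open in a path
# (two steps: collect the PIDs, then list their network files)
lsof -a -i -p "$(lsof -t +D /opt/myapp | paste -sd, -)"

# Find deleted files that are still open
# (+D matches files it finds by walking the tree, so unlinked files are missed; +L1 lists open files with a link count of 0)
lsof +L1 | grep /var/lib/myapp

# Re-scan every 2 seconds (polling, not a live trace)
lsof +D /var/log -r 2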

Output Interpretation:

# lsof output columns explained
$ lsof +d /tmp
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
chrome   1234 user   15u   REG    8,1    12345  67890 /tmp/temp_file
  • COMMAND: Process name
  • PID: Process ID
  • USER: Process owner
  • FD: File descriptor (15u = file descriptor 15, read/write)
  • TYPE: File type (REG = regular file, DIR = directory)
  • DEVICE: Device identifier
  • SIZE/OFF: File size or offset
  • NODE: Inode number
  • NAME: Full path to file
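
The FD column packs several things into one field: a descriptor number followed by a mode letter, or a special name such as cwd. As a rough illustration, the sketch below decodes the most common values; describe_fd is a hypothetical helper and deliberately ignores rarer codes and lock flags:

```shell
#!/bin/sh
# describe_fd FD_FIELD: explain one value from lsof's FD column.
# Hypothetical helper; covers the common cases only.
describe_fd() {
    case $1 in
        cwd) echo "current working directory" ;;
        rtd) echo "root directory" ;;
        txt) echo "program text (code and data)" ;;
        mem) echo "memory-mapped file" ;;
        [0-9]*)
            fd_num=${1%%[!0-9]*}     # leading digits: the descriptor number
            fd_rest=${1#"$fd_num"}   # what follows: mode letter, maybe a lock flag
            case $fd_rest in
                r*) fd_mode="read only" ;;
                w*) fd_mode="write only" ;;
                u*) fd_mode="read/write" ;;
                *)  fd_mode="unknown mode" ;;
            esac
            echo "file descriptor $fd_num, $fd_mode"
            ;;
        *) echo "other: $1" ;;
    esac
}

describe_fd 15u   # prints: file descriptor 15, read/write
describe_fd 1w    # prints: file descriptor 1, write only
describe_fd cwd   # prints: current working directory
```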

Troubleshooting Scenarios:

“Device or resource busy” Errors:

# Can't unmount filesystem
$ umount /mnt/data
umount: /mnt/data: device is busy.

# Find the culprit
lsof +D /mnt/data
# Shows processes still accessing files on the mounted filesystem

# Alternative approach
fuser -vm /mnt/data  # -m: every process using the mounted filesystem (psmisc fuser on Linux)

Disk Space Issues:

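# Find processes holding large deleted files open, biggest first
# (+L1 lists open files with a link count of 0; column 7 is SIZE/OFF)
lsof +L1 | grep /var/log | sort -k7 -nr

# Find processes with files open for writing in a specific directory
# (lsof has no write-only filter and -w only suppresses warnings, so filter the FD column for w or u)
lsof +D /var/log | awk 'NR == 1 || $4 ~ /^[0-9]+[wu]/'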
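
Deleted files matter for disk space because removing a file's name does not release its data while some process still holds it open. The sketch below reproduces that situation in a single shell, using a temporary file from mktemp and descriptor 3 (both arbitrary choices for the demonstration):

```shell
#!/bin/sh
# Shows that unlinking a file does not free its data while a descriptor
# is still open. Nothing here needs lsof.
tmpfile=$(mktemp)
echo "still here" > "$tmpfile"

exec 3< "$tmpfile"       # hold the file open on descriptor 3
rm -f "$tmpfile"         # remove the name; the inode stays allocated

if [ -e "$tmpfile" ]; then path_state="present"; else path_state="gone"; fi
IFS= read -r content <&3 # the data is still readable through the descriptor
exec 3<&-                # closing the last descriptor releases the space

echo "path: $path_state, content via fd 3: $content"
# prints: path: gone, content via fd 3: still here
```

Between the rm and the final exec, this shell is exactly the kind of process the deleted-file searches are meant to find. Restarting or signaling such a process so it closes the descriptor is what actually returns the space.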

Advanced kill Command with Verbose Signals

The kill command accepts human-readable signal names, making process management safer and more self-documenting.

Verbose Signal Usage:

Common Readable Signals:

# Graceful termination (allows cleanup)
kill -TERM 1234
kill -SIGTERM 1234

# Force termination (immediate, no cleanup)
kill -KILL 1234
kill -SIGKILL 1234

# Stop/pause process (can be resumed)
kill -STOP 1234   # Cannot be caught or ignored
kill -TSTP 1234   # Terminal stop (Ctrl+Z equivalent); a process may catch or ignore it

# Resume stopped process
kill -CONT 1234
kill -SIGCONT 1234

# Reload configuration (common in daemons)
kill -HUP 1234
kill -SIGHUP 1234
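
The difference between TERM and KILL is easiest to see from the receiving side: a process can install a handler for TERM, but never for KILL. Here is a small sketch, assuming a child subshell whose "cleanup" is just writing to a marker file (the marker and variable names are illustrative):

```shell
#!/bin/sh
# A child that cleans up on TERM. The marker file stands in for real cleanup
# work such as flushing buffers or removing lock files.
marker=$(mktemp)

(
    trap 'echo "cleaned up" > "$marker"; exit 0' TERM
    # Sleep in short steps so the trap runs soon after the signal arrives.
    while :; do sleep 1; done
) &
child=$!

sleep 1                  # give the child time to install its trap
kill -TERM "$child"      # ask politely; the handler gets to run
wait "$child"
status=$?

result=$(cat "$marker")
rm -f "$marker"
echo "child exit status: $status, marker: $result"
# prints: child exit status: 0, marker: cleaned up
```

Had the script sent KILL instead, the handler would never run and the marker file would stay empty, which is why TERM is the better first choice.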

Advanced Signal Management:

# Check if process is running (signal 0 sends nothing; it only checks that the PID exists and that you may signal it)
if kill -0 1234 2>/dev/null; then
    echo "Process 1234 is running"
else
    echo "Process 1234 is not running"
fi
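
One caveat with this check: kill -0 fails both when the PID does not exist and when it belongs to another user you are not allowed to signal. A possible workaround, sketched here with a hypothetical process_exists helper, is to fall back to ps -p before concluding the process is gone:

```shell
#!/bin/sh
# process_exists PID: succeed if PID names a live process, even one we are
# not allowed to signal. Hypothetical helper name.
process_exists() {
    kill -0 "$1" 2>/dev/null && return 0
    # kill -0 also fails on "permission denied", so ask ps before giving up.
    # "-o pid=" prints just the PID with no header, or nothing if it is gone.
    [ -n "$(ps -p "$1" -o pid= 2>/dev/null)" ]
}

sleep 30 &
pid=$!
if process_exists "$pid"; then before="running"; else before="not running"; fi

kill -TERM "$pid"
wait "$pid" 2>/dev/null || true   # reap the child so its PID is really gone
if process_exists "$pid"; then after="running"; else after="not running"; fi

echo "before: $before, after: $after"
# prints: before: running, after: not running
```

The wait matters: a terminated child that has not been reaped yet is a zombie, and kill -0 still reports it as present.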

# Graceful stop with forced fallback (start the service again afterwards to complete a restart)
graceful_restart() {
    local pid=$1 i

    echo "Sending TERM signal to process $pid"
    kill -TERM "$pid" || return 1

    # Wait up to 10 seconds for graceful shutdown
    for i in {1..10}; do
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "Process terminated gracefully"
            return 0
        fi
        sleep 1
    done

    echo "Process didn't terminate gracefully, forcing..."
    kill -KILL "$pid"
}
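
To see the fallback path in action, it helps to run the TERM-then-KILL pattern against a process that refuses to exit. The sketch below assumes a hypothetical stop_with_timeout function (the same idea with the timeout as a parameter) and a test child that ignores TERM:

```shell
#!/bin/sh
# stop_with_timeout PID [SECONDS]: send TERM, poll with kill -0, then fall
# back to KILL. Hypothetical name; prints which path was taken.
stop_with_timeout() {
    target=$1
    limit=${2:-10}
    waited=0
    kill -TERM "$target" 2>/dev/null || return 1   # nothing to stop
    while [ "$waited" -lt "$limit" ]; do
        kill -0 "$target" 2>/dev/null || { echo "terminated"; return 0; }
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$target" 2>/dev/null
    echo "killed"
}

# A stubborn child that ignores TERM, to exercise the fallback path.
( trap '' TERM; while :; do sleep 1; done ) &
stubborn=$!
sleep 1                                  # let the child install its trap
outcome=$(stop_with_timeout "$stubborn" 2)
wait "$stubborn" 2>/dev/null || true     # reap the child after the KILL
echo "outcome: $outcome"
# prints: outcome: killed
```

A short timeout keeps the demonstration quick; for a real service the right value depends on how long its shutdown work usually takes.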

Combining lsof and kill:

Advanced Process Management:

# Signal all processes using files in a directory
# (-t prints bare PIDs; -r stops xargs from running kill when there are none)
lsof -t +D /path/to/app | xargs -r kill -TERM

# More precise version with error handling
kill_processes_in_path() {
    local path=$1
    local signal=${2:-TERM}
    
    echo "Finding processes using files in $path"
    local pids=$(lsof +D "$path" 2>/dev/null | awk 'NR>1 {print $2}' | sort -u)
    
    if [ -z "$pids" ]; then
        echo "No processes found using files in $path"
        return 0
    fi
    
    echo "Found processes: $pids"
    echo "Sending $signal signal..."
    
    for pid in $pids; do
        if kill -0 "$pid" 2>/dev/null; then
            echo "Killing process $pid"
            kill -"$signal" "$pid"
        fi
    done
}

# Usage
kill_processes_in_path /opt/myapp TERM
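
One risk with signaling everything lsof reports is that the list can include your own shell, for example when its working directory is inside the path being scanned. A defensive sketch, assuming a hypothetical filter_safe_pids helper placed between lsof and kill:

```shell
#!/bin/sh
# filter_safe_pids: read PIDs on stdin, one per line, and print only those
# that are reasonable to signal from this script. Hypothetical helper.
filter_safe_pids() {
    while IFS= read -r candidate; do
        case $candidate in
            ''|*[!0-9]*) continue ;;              # skip blanks and non-numeric input
        esac
        [ "$candidate" -eq 1 ] && continue        # never signal init
        [ "$candidate" -eq "$$" ] && continue     # nor this shell
        [ "$candidate" -eq "$PPID" ] && continue  # nor the process that launched it
        echo "$candidate"
    done
}

# Demonstration with made-up input; 99999999 is just a placeholder PID.
printf '%s\n' 1 "$$" "$PPID" abc 99999999 | filter_safe_pids
# prints: 99999999

# Possible use in a pipeline:
# lsof -t +D /opt/myapp | filter_safe_pids | xargs -r kill -TERM
```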

Service Management Patterns:

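# Safe service stop (the first half of a restart; start the service again afterwards)
restart_service() {
    local service_path=$1

    # Find main process (-o picks the oldest match so we get a single PID)
    local main_pid=$(pgrep -o -f "$service_path/bin/main")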
    
    if [ -n "$main_pid" ]; then
        echo "Stopping service (PID: $main_pid)"
        kill -TERM "$main_pid"
        
        # Wait and verify
        sleep 5
        if kill -0 "$main_pid" 2>/dev/null; then
            echo "Service didn't stop gracefully, forcing..."
            kill -KILL "$main_pid"
        fi
    fi
    
    # Clean up any remaining processes
    lsof -t +D "$service_path" 2>/dev/null | while read -r pid; do
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            echo "Cleaning up remaining process: $pid"
            kill -TERM "$pid"
        fi
    done
}

Together, lsof and kill with named signals make it easier to see what processes are doing and to stop them safely.