How to diff directories over ssh

Efficiently Diff Directories Over SSH

Learn how to compare the contents of two directories on a remote server, or between local and remote, using SSH and common Linux utilities.

Comparing directories is a common task for developers and system administrators. Whether you're verifying deployments, checking for configuration drift, or synchronizing files, it helps to know how to diff directories over SSH. This article covers three methods: streaming tar archives for a full diff -r, running rsync in dry-run mode, and comparing ls -lR listings.

Understanding the Challenge: Remote File Comparison

Directly comparing directories on a remote server or between a local machine and a remote server presents a challenge because diff typically operates on local files. To overcome this, we leverage SSH to execute commands on the remote host or to securely transfer file listings for comparison. The primary goal is to identify differences in file names, sizes, modification times, and content without necessarily downloading all files.

flowchart TD
    A[Local Machine] -->|SSH Connection| B[Remote Server]
    B --> C{Directory A}
    B --> D{Directory B}
    C -- Compare --> E[Differences]
    D -- Compare --> E
    E -->|Report| A

Conceptual flow of comparing remote directories via SSH.

Method 1: Using diff with SSH and tar

One way to compare two directories on a remote server is to stream a tar archive of each directory over SSH, extract both into local temporary directories, and run diff -r on the extracted copies. Because every file is transferred, this method costs the most bandwidth and disk space, but it is the one to use when you need a detailed, line-by-line content comparison.

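# Create the local staging directories first; tar -C fails if they do not exist
mkdir -p /tmp/dir1_local /tmp/dir2_local
# Archive relative to each directory (-C <dir> .) so both trees extract with the same layout
ssh user@remote_host 'tar -cf - -C /path/to/dir1 .' | tar -xf - -C /tmp/dir1_local
ssh user@remote_host 'tar -cf - -C /path/to/dir2 .' | tar -xf - -C /tmp/dir2_local
diff -r /tmp/dir1_local /tmp/dir2_local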

Comparing two remote directories by streaming tar archives locally.
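The streaming approach above can be wrapped in a small function that stages the files in a throwaway directory and cleans up afterwards. This is a minimal sketch rather than a hardened tool: remote_dir_diff is a hypothetical name, and it assumes that tar is available on both hosts and that the remote paths contain no single quotes.

```shell
# Hypothetical helper: diff two directories that live on the same remote host.
# Usage: remote_dir_diff user@remote_host /path/to/dir1 /path/to/dir2
remote_dir_diff() {
    local host=$1 dir1=$2 dir2=$3 work status
    work=$(mktemp -d) || return 1
    mkdir -p "$work/a" "$work/b"

    # Archive relative to each directory so both copies share the same layout.
    ssh "$host" "tar -cf - -C '$dir1' ." | tar -xf - -C "$work/a"
    ssh "$host" "tar -cf - -C '$dir2' ." | tar -xf - -C "$work/b"

    diff -r "$work/a" "$work/b"
    status=$?            # 0 = identical, 1 = differences found, 2 = diff error

    rm -rf "$work"       # remove the staged copies
    return "$status"
}
```

The function returns the exit status of diff, so it can be used directly in a script condition such as if remote_dir_diff user@remote_host /path/to/dir1 /path/to/dir2; then echo "in sync"; fi.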

Method 2: Leveraging rsync for Efficient Comparison

rsync is a powerful utility for synchronizing files and directories, but it also has excellent capabilities for comparing them. By using the --dry-run (-n) and --itemize-changes (-i) flags, rsync can show you exactly what would change without actually performing any transfers. This is often the most efficient method for checking differences between a local and remote directory.

rsync -avn --delete /path/to/local/dir/ user@remote_host:/path/to/remote/dir/

Using rsync in dry-run mode to compare local and remote directories.

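The output of rsync -avn lists the files that would be transferred because they are missing on the destination or differ from the source. By default rsync decides that a file differs by comparing size and modification time only; add -c (--checksum) to compare file contents instead, at the cost of reading every file on both sides. The --delete flag makes the dry run also list files that exist on the destination but not on the source, reported as deletions. To see extras in the other direction, swap the source and destination. The -i (--itemize-changes) flag shows why each file is considered different (e.g., size, modification time, permissions).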

rsync -avni --delete /path/to/local/dir/ user@remote_host:/path/to/remote/dir/

Detailed rsync dry-run output with itemized changes.
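The itemized output is compact, so a small filter can help when scanning long reports. The sketch below assumes the 11-character itemize string printed by rsync 3.x (for example <f.st...... or >f+++++++++) followed by a space and the path, plus *deleting lines when --delete is used. classify_itemized is a hypothetical name, and the three labels are this example's own simplification of rsync's codes.

```shell
# Hypothetical filter: reduce `rsync -ni` output to "<label><TAB><path>" lines.
#   extra   = exists only on the destination (would be deleted)
#   new     = exists only on the source (would be created)
#   attrs   = not transferred, but attributes such as mtime or permissions differ
#   changed = would be transferred because size, mtime, or checksum differs
classify_itemized() {
    awk '
        /^\*deleting / { print "extra\t" substr($0, 13); next }
        length($1) == 11 && $1 ~ /^[<>ch.][fdLDS]/ {
            path = substr($0, 13)              # path starts after the 11-char code and a space
            if (substr($1, 3) == "+++++++++")  print "new\t" path
            else if (substr($1, 1, 1) == ".")  print "attrs\t" path
            else                               print "changed\t" path
        }
    '
}
```

A possible invocation is rsync -ani --delete /path/to/local/dir/ user@remote_host:/path/to/remote/dir/ | classify_itemized. Header and summary lines printed by -v do not match either pattern and are dropped.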

Method 3: Comparing Directory Listings

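For a quick check of file names and basic attributes (like size or modification time), you can compare the output of ls -lR from both directories. This method is less resource-intensive than content-based diffs but won't tell you about content differences within files. Because ls -l also prints owner, group, and timestamps, any mismatch in those fields shows up as a difference even when the file contents are identical.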

# cd first so the per-directory headers printed by ls -R are relative and comparable
diff <(cd /path/to/local/dir && ls -lR) <(ssh user@remote_host 'cd /path/to/remote/dir && ls -lR')

Comparing ls -lR output between local and remote directories.

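This command uses process substitution (<()) to feed the output of ls -lR from the local directory and the remote directory (executed via SSH) directly into the diff command, which then compares the two listings line by line. Process substitution is a feature of bash, zsh, and ksh; it is not available in a plain POSIX sh, so run the command from one of those shells.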
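When a listing comparison is too coarse but transferring every file is too expensive, a middle ground is to compare checksum manifests: each side hashes its own files and only the hash lists cross the network. The sketch below is one way to do that, under stated assumptions: compare_by_checksum is a hypothetical name, sha256sum (GNU coreutils) is assumed to be installed on both hosts, and the remote path is assumed to contain no single quotes. It uses temporary files instead of process substitution so that it also runs in shells without that feature.

```shell
# Hypothetical helper: compare file contents by SHA-256 without copying the files.
# Usage: compare_by_checksum /path/to/local/dir user@remote_host /path/to/remote/dir
compare_by_checksum() {
    local local_dir=$1 host=$2 remote_dir=$3 tmp status
    tmp=$(mktemp -d) || return 1

    # Hash relative paths on each side; sort locally with a fixed locale
    # so both manifests are ordered the same way.
    (cd "$local_dir" && find . -type f -exec sha256sum {} +) \
        | LC_ALL=C sort -k 2 > "$tmp/local"
    ssh "$host" "cd '$remote_dir' && find . -type f -exec sha256sum {} +" \
        | LC_ALL=C sort -k 2 > "$tmp/remote"

    diff "$tmp/local" "$tmp/remote"
    status=$?            # 0 = same files with same contents, 1 = differences

    rm -rf "$tmp"
    return "$status"
}
```

Lines prefixed with < in the output come from the local manifest and lines prefixed with > from the remote one; a path that appears on both sides with different hashes has changed content.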