Comparing two files in Linux terminal
Mastering File Comparison in the Linux Terminal

Learn how to effectively compare files and directories in the Linux terminal using powerful command-line utilities like diff, cmp, and comm.
Comparing files is a fundamental task for developers, system administrators, and anyone working with text-based data. Whether you're tracking code changes, verifying configuration files, or simply trying to understand discrepancies between two versions of a document, the Linux terminal offers several robust tools to help. This article will guide you through the most common and powerful utilities for file comparison, explaining their nuances and best use cases.
The diff Command: Your Go-To for Line-by-Line Differences
The diff command is arguably the most widely used utility for comparing files. It works by analyzing two files line by line and reporting the differences. It's particularly useful for source code and configuration files, as it can show you exactly which lines have been added, deleted, or changed. The output format is designed to be easily readable and can even be used to generate patch files.
diff file1.txt file2.txt
Basic usage of the diff command to compare two files.
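To make the default output concrete, here is a minimal sketch. The file names old.conf and new.conf and their contents are invented for illustration; only the diff invocation itself comes from the usage above. It also captures the exit status, which diff sets to 0 when the files match, 1 when they differ, and 2 when something went wrong.

```shell
# Hypothetical sample files, created in a throwaway directory.
workdir=$(mktemp -d)
printf 'host=localhost\nport=8080\nmode=dev\n' > "$workdir/old.conf"
printf 'host=localhost\nport=9090\nmode=dev\n' > "$workdir/new.conf"

# Capture both the report and the exit status.
# 0 = no differences, 1 = differences found, 2 = trouble (e.g. missing file).
diff_status=0
diff_output=$(diff "$workdir/old.conf" "$workdir/new.conf") || diff_status=$?

printf '%s\n' "$diff_output"
# 2c2            line 2 of the first file was changed into line 2 of the second
# < port=8080    '<' marks a line from the first file
# ---
# > port=9090    '>' marks a line from the second file

echo "exit status: $diff_status"   # exit status: 1

rm -rf "$workdir"
```

The change command (2c2 here) uses a for added, d for deleted, and c for changed lines, with line numbers from the first file on the left and from the second file on the right.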
The default output of diff can sometimes be verbose. Here are some common options to refine its output:
diff -u file1.txt file2.txt # Unified format (contextual differences)
diff -r dir1 dir2 # Recursively compare directories
diff -q dir1 dir2 # Quick comparison, only reports if files differ
diff -y file1.txt file2.txt # Side-by-side comparison (requires enough terminal width)
Useful diff options for different comparison scenarios.
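The -r and -q options can be combined to get a one-line-per-file summary of two directory trees. The sketch below assumes GNU diff and builds two small hypothetical directories (the file names are made up for the example) so the report is easy to follow.

```shell
# Hypothetical directory trees, created in a throwaway location.
workdir=$(mktemp -d)
mkdir "$workdir/dir1" "$workdir/dir2"
printf 'same\n'      > "$workdir/dir1/shared.txt"
printf 'same\n'      > "$workdir/dir2/shared.txt"
printf 'version 1\n' > "$workdir/dir1/app.conf"
printf 'version 2\n' > "$workdir/dir2/app.conf"
printf 'extra\n'     > "$workdir/dir1/notes.txt"

# Run from inside workdir so the report shows short relative paths.
# diff exits with status 1 when differences exist, hence the `|| true`.
dir_report=$(cd "$workdir" && diff -rq dir1 dir2) || true

printf '%s\n' "$dir_report"
# Files dir1/app.conf and dir2/app.conf differ
# Only in dir1: notes.txt

rm -rf "$workdir"
```

Identical files such as shared.txt are not listed, which keeps the summary short for large trees.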
For a side-by-side view, use sdiff or diff -y with a wide terminal. Many modern text editors and IDEs also integrate diff functionality with graphical interfaces.
Choosing a tool:
- Line-by-line differences: use diff (output: added, deleted, or changed lines)
- Byte-by-byte differences: use cmp (output: the first differing byte)
- Common or unique lines: use comm (output: lines unique to file 1, lines unique to file 2, and lines common to both)
Decision flow for choosing the right file comparison tool.
The cmp Command: Byte-by-Byte Precision
While diff focuses on line-based differences, the cmp (compare) command performs a byte-by-byte comparison. It's ideal when you need to know whether two files are exactly identical, or when you're dealing with binary files where line-based comparison is irrelevant. cmp reports the byte offset and line number of the first difference, and prints nothing if the files are identical.
cmp file1.bin file2.bin
cmp -s file1.txt file2.txt # Suppress output, just set exit status
Using cmp for byte-by-byte comparison, including suppressing output.
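Because cmp -s prints nothing and communicates only through its exit status (0 for identical, 1 for different, 2 for an error), it fits naturally into shell conditionals. The following is a minimal sketch; the helper name compare_files and the sample files are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: translate the exit status of cmp -s into a word.
compare_files() {
    cmp_status=0
    cmp -s "$1" "$2" || cmp_status=$?
    case $cmp_status in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error" ;;      # e.g. one of the files does not exist
    esac
}

# Sample files in a throwaway directory.
workdir=$(mktemp -d)
printf 'alpha\nbeta\n'  > "$workdir/original.txt"
cp "$workdir/original.txt" "$workdir/copy.txt"
printf 'alpha\ngamma\n' > "$workdir/edited.txt"

compare_files "$workdir/original.txt" "$workdir/copy.txt"     # prints: identical
compare_files "$workdir/original.txt" "$workdir/edited.txt"   # prints: different
compare_files "$workdir/original.txt" "$workdir/missing.txt"  # prints: error

rm -rf "$workdir"
```

In a real script the same pattern is often reduced to a plain if cmp -s a b; then ...; fi when only the identical/not-identical distinction matters.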
The exit status of cmp is crucial for scripting: 0 if files are identical, 1 if they differ, and 2 if an error occurred. This makes it excellent for conditional checks in shell scripts.
The comm Command: Finding Common and Unique Lines
The comm (common) command is designed to compare two sorted files and output lines that are unique to each file, as well as lines that are common to both. It's particularly useful for set operations, like finding elements present in one list but not another, or identifying shared entries. Remember, comm requires its input files to be sorted for accurate results.
sort file1.txt > sorted_file1.txt
sort file2.txt > sorted_file2.txt
comm sorted_file1.txt sorted_file2.txt
Preparing files for comm and basic usage.
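By default, the output of comm has three tab-separated columns:
- Column 1: lines unique to sorted_file1.txt
- Column 2: lines unique to sorted_file2.txt
- Column 3: lines common to both files
The options -1, -2, and -3 suppress the corresponding column, so combining two of them leaves only the third: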
comm -12 sorted_file1.txt sorted_file2.txt # Show only common lines
comm -23 sorted_file1.txt sorted_file2.txt # Show only lines unique to file1
comm -13 sorted_file1.txt sorted_file2.txt # Show only lines unique to file2
Filtering comm output to show specific columns.
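As a worked example of these set operations, the sketch below compares two small hypothetical member lists (the file names and contents are invented). It sorts and compares under LC_ALL=C, on the assumption that using one fixed locale for both sort and comm is the simplest way to guarantee they agree on the ordering.

```shell
# Hypothetical, unsorted input lists in a throwaway directory.
workdir=$(mktemp -d)
printf 'carol\nalice\nbob\n' > "$workdir/team_a.txt"
printf 'dave\nbob\ncarol\n'  > "$workdir/team_b.txt"

# Sort both files with the same collation that comm will use.
LC_ALL=C sort "$workdir/team_a.txt" > "$workdir/team_a.sorted"
LC_ALL=C sort "$workdir/team_b.txt" > "$workdir/team_b.sorted"

# Intersection and the two set differences.
both=$(LC_ALL=C comm -12 "$workdir/team_a.sorted" "$workdir/team_b.sorted")
only_a=$(LC_ALL=C comm -23 "$workdir/team_a.sorted" "$workdir/team_b.sorted")
only_b=$(LC_ALL=C comm -13 "$workdir/team_a.sorted" "$workdir/team_b.sorted")

echo "In both teams:"; printf '%s\n' "$both"   # bob, carol
echo "Only in team A: $only_a"                 # Only in team A: alice
echo "Only in team B: $only_b"                 # Only in team B: dave

rm -rf "$workdir"
```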
Always make sure both input files are sorted before passing them to comm. If they are not, the results will be incorrect and misleading.