diff'ing diffs with diff?
Diff'ing Diffs with Diff: Advanced Patch Analysis

Explore advanced techniques for comparing two patch files using the diff utility, understanding the nuances of patch formats, and interpreting the results for effective code review and version control.
In the world of version control and collaborative development, diff is an indispensable tool for understanding changes between files. But what happens when you need to compare two sets of changes? That is, how do you 'diff a diff'? This article delves into the practical applications and methods for comparing two patch files (the output of diff) using the diff utility itself. This technique is invaluable for reviewing proposed changes against existing patches, understanding divergent development paths, or verifying patch integrity.
Understanding Patch Files
Before we can compare two diffs, it's crucial to understand the structure of a patch file. A patch file, typically generated by diff -u (unified format), describes changes between two versions of a file or directory. It begins with two header lines giving the original and new file paths and their timestamps, followed by one or more hunks of changes. Each hunk starts with @@ -old_start,old_lines +new_start,new_lines @@ and contains lines prefixed with a space (context), - (removed), or + (added).
--- a/file.txt 2023-10-26 10:00:00.000000000 -0700
+++ b/file.txt 2023-10-26 10:05:00.000000000 -0700
@@ -1,4 +1,4 @@
 This is line 1.
-This is line 2 (old).
+This is line 2 (new).
 This is line 3.
 This is line 4.
Example of a unified diff format patch hunk.
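To make the structure concrete, the following sketch generates a small unified diff and picks out its parts with grep. The file names old.txt, new.txt, and example.patch are placeholders chosen for illustration, and the script works in a temporary directory.

```shell
# Work in a throwaway directory so nothing in the current tree is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two versions of the same four-line file; only line 2 differs.
printf 'This is line 1.\nThis is line 2 (old).\nThis is line 3.\nThis is line 4.\n' > old.txt
printf 'This is line 1.\nThis is line 2 (new).\nThis is line 3.\nThis is line 4.\n' > new.txt

# diff exits with status 1 when the files differ, so guard it with "|| true".
diff -u old.txt new.txt > example.patch || true

# Hunk headers are the lines that start with "@@".
hunk_headers=$(grep '^@@' example.patch)
echo "$hunk_headers"

# Count removed and added lines, skipping the two file-header lines
# ("--- ..." and "+++ ...") at the top of the patch.
removed=$(tail -n +3 example.patch | grep -c '^-')
added=$(tail -n +3 example.patch | grep -c '^+')
echo "removed=$removed added=$added"
```

Skipping the first two lines before counting avoids mistaking the file headers for removed or added content, which matters because they also begin with - and +.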
The Core Concept: Diffing the Text Output
The simplest and most direct way to 'diff a diff' is to treat the patch files themselves as plain text files and run diff on them. When you compare patch1.diff and patch2.diff, the diff utility will highlight the differences between the text content of these two files. This means it will show you where the lines describing changes (the +, -, and context lines) differ between the two patches.
flowchart TD
    A[Original Codebase] --> B{Generate Patch 1}
    B --> C[patch1.diff]
    A --> D{Generate Patch 2}
    D --> E[patch2.diff]
    C --> F{diff patch1.diff patch2.diff}
    E --> F
    F --> G[Comparison Result]
Workflow for diffing two patch files.
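# Create two example patch files.
# The <(...) process substitution requires bash, ksh, or zsh; printf is used
# instead of echo -e because echo's handling of \n varies between shells.
# patch1.diff changes 'old' to 'new'
diff -u <(printf 'line1\nline2 (old)\nline3\n') <(printf 'line1\nline2 (new)\nline3\n') > patch1.diff
# patch2.diff changes 'old' to 'newer'
diff -u <(printf 'line1\nline2 (old)\nline3\n') <(printf 'line1\nline2 (newer)\nline3\n') > patch2.diff
# Now, diff the diffs (diff exits with status 1 when its inputs differ)
diff -u patch1.diff patch2.diff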
Command-line example of generating and diffing two patch files.
The output of diff -u patch1.diff patch2.diff will show you exactly which lines in the patch files themselves are different. For instance, if patch1.diff changed line2 (old) to line2 (new) and patch2.diff changed line2 (old) to line2 (newer), the output of the 'diff of diffs' would highlight the difference between +line2 (new) and +line2 (newer).
Generate both patches with the same diff options (e.g., always -u for unified format) to minimize spurious differences caused by formatting variations rather than actual content changes.
Interpreting the Output
The output of diffing two patches can sometimes be verbose, especially if there are many differences. Focus on the lines prefixed with + and - within the 'diff of diffs' output. These lines indicate where the content of the changes differs between your two patch files. For example:
--- patch1.diff 2023-10-26 10:10:00.000000000 -0700
+++ patch2.diff 2023-10-26 10:11:00.000000000 -0700
@@ -3,6 +3,6 @@
 @@ -1,4 +1,4 @@
  This is line 1.
 -This is line 2 (old).
-+This is line 2 (new).
++This is line 2 (newer).
  This is line 3.
  This is line 4.
Within the hunk, the first character of each line comes from the outer diff and the second from the patch being compared. The outer diff shows that patch1.diff contained +This is line 2 (new). (marked with - by the outer diff), while patch2.diff contained +This is line 2 (newer). (marked with + by the outer diff). The only difference between the two patches is therefore the text of the added line.
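When the outer diff is long, it can help to print only the lines that the outer diff itself marks as changed. The sketch below assumes two hand-written patches that mirror the example above (with timestamps omitted for brevity) and filters the outer diff with sed and grep. The variable name changed is illustrative.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hand-written patches that differ only in the text of the added line.
cat > patch1.diff <<'EOF'
--- a/file.txt
+++ b/file.txt
@@ -1,4 +1,4 @@
 This is line 1.
-This is line 2 (old).
+This is line 2 (new).
 This is line 3.
 This is line 4.
EOF
sed 's/(new)/(newer)/' patch1.diff > patch2.diff

# diff exits with status 1 when its inputs differ, hence the "|| true".
# sed drops the outer diff's two file-header lines, then grep keeps only
# the lines the outer diff marks as removed (-) or added (+).
changed=$({ diff -u patch1.diff patch2.diff || true; } | sed '1,2d' | grep '^[-+]')
echo "$changed"
```

Each surviving line starts with two markers: the outer diff's marker followed by the marker from the patch it came from.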
Patch file headers (--- a/file.txt and +++ b/file.txt) and hunk headers (@@ -x,y +a,b @@) might also differ due to timestamps or line number changes, even if the actual code changes are identical. If they are not relevant to your comparison, you might need to filter them out or use diff options like --ignore-matching-lines.
Advanced Use Cases and Considerations
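Header noise is one consideration that comes up quickly in practice. As one possible way to filter it out, the sketch below normalizes both patches with sed before comparing them: it strips everything after the path on the file-header lines and blanks the line numbers in hunk headers. The normalize function and the sample patches are hypothetical, and the regular expressions are a simplification: a removed line whose content itself begins with "-- " would be misread as a file header.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical patches that carry the same change but were generated at
# different times and against different positions in the file.
cat > patch1.diff <<'EOF'
--- a/file.txt 2023-10-26 10:00:00.000000000 -0700
+++ b/file.txt 2023-10-26 10:05:00.000000000 -0700
@@ -1,3 +1,3 @@
 line1
-line2 (old)
+line2 (new)
 line3
EOF
cat > patch2.diff <<'EOF'
--- a/file.txt 2023-11-02 09:30:00.000000000 -0700
+++ b/file.txt 2023-11-02 09:31:00.000000000 -0700
@@ -11,3 +11,3 @@
 line1
-line2 (old)
+line2 (new)
 line3
EOF

# Keep only the path on "---"/"+++" lines and blank out hunk line numbers.
normalize() {
  sed -E \
    -e 's/^(---|\+\+\+) ([^[:space:]]+).*/\1 \2/' \
    -e 's/^@@ [^@]* @@/@@ @@/' \
    "$1"
}
normalize patch1.diff > patch1.norm
normalize patch2.diff > patch2.norm

# diff exits with status 0 only when the normalized patches are identical.
if diff -u patch1.norm patch2.norm > outer.diff; then
  result="same change"
else
  result="different change"
fi
echo "$result"
```

Normalizing into separate files leaves the original patches untouched, so they can still be applied afterwards.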
While direct comparison is powerful, sometimes you need more nuanced analysis. For instance, to compare the effect of two patches rather than their literal text, you can apply each patch to its own copy of the original files and compare the results. This is more complex and often involves temporary directories and careful file management.
1. Prepare Original Files
Ensure you have the original, unpatched versions of the files that both patches are intended to modify.
2. Apply First Patch
Apply patch1.diff to a copy of your original files. This creates a 'version A' of the modified files.
3. Apply Second Patch
Apply patch2.diff to another copy of your original files. This creates a 'version B' of the modified files.
4. Compare Modified Files
Use diff -u versionA/file.txt versionB/file.txt to see the differences between the results of applying each patch. This shows how the net effects of the two patches differ.
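The four steps above can be sketched as a short script. It assumes the patch utility is installed and that both patches use a/ and b/ path prefixes, which is why -p1 is passed. The directory names original, versionA, and versionB and the sample patches are placeholders for illustration.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Step 1: the original, unpatched file.
mkdir original
printf 'line1\nline2 (old)\nline3\n' > original/file.txt

# Two hand-written sample patches against that file.
cat > patch1.diff <<'EOF'
--- a/file.txt
+++ b/file.txt
@@ -1,3 +1,3 @@
 line1
-line2 (old)
+line2 (new)
 line3
EOF
sed 's/(new)/(newer)/' patch1.diff > patch2.diff

# Steps 2 and 3: apply each patch to its own copy of the original.
# -p1 strips the leading a/ or b/ path component, -d selects the directory
# to patch, and -s suppresses progress messages.
cp -R original versionA
cp -R original versionB
patch -s -p1 -d versionA < patch1.diff
patch -s -p1 -d versionB < patch2.diff

# Step 4: compare the patched results. diff exits with status 1 when the
# files differ, hence the "|| true"; sed drops the two file-header lines
# and grep keeps only the changed lines.
effect=$({ diff -u versionA/file.txt versionB/file.txt || true; } | sed '1,2d' | grep '^[-+]')
echo "$effect"
```

Because each patch is applied to a separate copy, the original directory stays untouched and the comparison can be repeated with other patches.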