cmp Command: Tutorial & Examples
Compare two files byte by byte and locate the first difference
The cmp command is a fundamental Linux utility used to compare two files at the byte level to determine if they are identical or where they first differ. It is particularly useful for verifying file integrity, comparing binary files, and automating checks in server and virtualization environments. This article provides a detailed overview of cmp, including its functionality, parameters, practical examples, potential issues, and best practices for effective usage.
What cmp Does
The cmp command compares two files byte by byte and reports the position of the first difference found. Unlike line-oriented tools such as diff, cmp focuses on binary-level differences, making it suitable for any file type, including executables, images, archives, or any arbitrary data file.
If the files are identical, cmp produces no output and returns an exit status of 0. When differences exist, it outputs the byte offset and line number of the first mismatch and returns an exit status of 1. This behavior allows cmp to be easily integrated into automated workflows and scripts for verifying file consistency without verbose output.
Why cmp Is Important
In server administration, virtualization, and data management, ensuring file integrity is critical. cmp provides a reliable way to detect file corruption, unsynchronized backups, or unauthorized changes by comparing files precisely at the byte level.
Its ability to handle binary files directly distinguishes it from text-based comparison tools. Moreover, the silent mode (-s) makes it efficient for scripting scenarios where only the comparison result matters, minimizing noise in logs or output.
How cmp Works
cmp opens both files and reads them sequentially, comparing each corresponding byte until it encounters a difference or reaches the end of either file. It keeps track of:
- Byte offsets: the position of the current byte, counted from 1 at the beginning of the files.
- Line numbers: starting at 1 and incremented on each newline character (\n) encountered.
When a difference is found, cmp reports the byte position and line number of the first mismatch. If one file is shorter but otherwise identical to the beginning of the longer file, cmp reports that it reached end of file on the shorter file (an "EOF on" message written to standard error) and exits with status 1.
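The two counters and the end-of-file case can be seen with a few throwaway files. This is a minimal sketch, assuming a POSIX shell with mktemp available; the file names are invented for illustration.

```shell
#!/bin/sh
# Throwaway files in a temporary directory; the names are illustrative only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'alpha\nbeta\ngamma\n' > "$tmp/a.txt"
printf 'alpha\nbeta\ngamXa\n' > "$tmp/b.txt"
printf 'alpha\nbeta\n'        > "$tmp/short.txt"

# Lines 1 and 2 occupy 6 + 5 = 11 bytes, and the mismatch is the 4th
# byte of line 3, so cmp reports byte 15, line 3.
cmp "$tmp/a.txt" "$tmp/b.txt" || echo "exit status: $?"

# short.txt is a prefix of a.txt: cmp writes an "EOF on" message to
# standard error and exits with status 1.
cmp "$tmp/a.txt" "$tmp/short.txt" || echo "exit status: $?"
```

The `|| echo` form prints the status only when cmp exits non-zero, which keeps the sketch usable in scripts that run under `set -e`.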
The exit status codes are as follows:
- 0: Files are identical.
- 1: Files differ.
- 2: An error occurred (e.g., file not found or permission denied).
This precise reporting and exit status make cmp useful in conditional scripting.
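A script that needs to treat "different" and "could not compare" separately can branch on all three statuses. The following is a sketch only; compare_files is a hypothetical helper name, not part of cmp.

```shell
#!/bin/sh
# compare_files FILE1 FILE2
# Hypothetical helper: prints a verdict and returns cmp's exit status.
compare_files() {
    status=0
    cmp -s "$1" "$2" || status=$?   # capture the status without tripping set -e
    case $status in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error: could not compare $1 and $2" >&2 ;;
    esac
    return "$status"
}
```

Usage might look like `compare_files /etc/hosts /backup/hosts || echo "needs attention"`, where both paths are placeholders.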
Common Parameters of cmp
The most frequently used options are:
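- -l: List all differing byte positions (in decimal) and display the differing byte values in octal for both files.
- -s: Silent mode; suppresses normal output. Only the exit status indicates whether files match.
- -i SKIP: Skip the first SKIP bytes in both files before starting the comparison.
- -n LIMIT: Compare at most LIMIT bytes.
- --help: Display help and usage information.
- --version: Show version information of the cmp command.
Only -l and -s are specified by POSIX; the other options listed here are available in GNU cmp and may be missing or behave differently in other implementations.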
These options enhance flexibility, enabling partial comparisons, quiet checks, or detailed difference reports.
Basic Usage Examples
Compare two binary files, file1.bin and file2.bin:
cmp file1.bin file2.bin
If the files are identical, there is no output, and the exit status is:
echo $?
0
If they differ, the output gives the first differing byte and its line number (some implementations print char instead of byte):
file1.bin file2.bin differ: byte 15, line 1
Check the exit status in a script to detect differences:
# A non-zero status means the files differ (1) or cmp hit an error (2)
cmp file1.bin file2.bin
if [ $? -ne 0 ]; then
    echo "Files differ"
else
    echo "Files are identical"
fi
List all byte differences with their octal values:
cmp -l file1.bin file2.bin
Sample output:
15 141 142
20 170 171
This indicates that at byte 15, the first file has octal value 141, and the second has 142, and similarly at byte 20.
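Octal values are awkward to match against a hex dump. As a hedged sketch, the output of cmp -l can be converted to hexadecimal in a POSIX shell; cmp_hex is a hypothetical helper name, and the conversion relies on printf treating a leading 0 as an octal prefix.

```shell
#!/bin/sh
# cmp_hex FILE1 FILE2
# Hypothetical helper: re-prints "cmp -l" output with byte values in hex.
cmp_hex() {
    cmp -l "$1" "$2" | while read -r offset first second; do
        # "0$first" is parsed by printf as an octal constant.
        printf '%d 0x%02x 0x%02x\n' "$offset" "0$first" "0$second"
    done
}
```

For the sample output above, this would print `15 0x61 0x62` and `20 0x78 0x79`. Because cmp sits on the left of a pipeline, its exit status is not the function's return value; run cmp -s separately if the status matters.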
Example of handling a missing file or permission error:
cmp missingfile.bin file2.bin
cmp: missingfile.bin: No such file or directory
echo $?
2
Advanced Usage Examples
Skip the first 100 bytes in both files before comparing:
cmp -i 100 file1.bin file2.bin
Compare only the first 256 bytes:
cmp -n 256 file1.bin file2.bin
Use silent mode in scripts to check if files match without producing output:
if cmp -s file1.bin file2.bin; then
    echo "Files match"
else
    echo "Files differ"
fi
Compare two large log files ignoring initial metadata (e.g., timestamp headers):
cmp -i 1024 /var/log/app1.log /var/log/app2.log
This is useful when headers differ but main content should be identical.
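When the two headers have different lengths, GNU cmp also accepts a pair of skip values in the form SKIP1:SKIP2. The sketch below assumes GNU cmp; the file names and header contents are made up for the demonstration.

```shell
#!/bin/sh
# Assumes GNU cmp, which accepts -i SKIP1:SKIP2 (one offset per file).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'HDR1payload'    > "$tmp/a.bin"   # 4-byte header
printf 'HEADER2payload' > "$tmp/b.bin"   # 7-byte header

# Skip 4 bytes of the first file and 7 bytes of the second.
if cmp -s -i 4:7 "$tmp/a.bin" "$tmp/b.bin"; then
    echo "Payloads match"
else
    echo "Payloads differ"
fi
```

With these two files the script prints "Payloads match", since everything after the headers is identical.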
Example script snippet to automate backup verification:
BACKUP=/backup/config.bak
ORIGINAL=/etc/config
if cmp -s "$BACKUP" "$ORIGINAL"; then
    echo "Backup verified"
else
    echo "Backup differs from original!"
fi
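The same check extends to a whole directory of files. This is one possible sketch rather than a complete backup verifier: verify_dir is a hypothetical helper, it looks only at regular files directly inside the source directory, and the shell glob it uses skips hidden files.

```shell
#!/bin/sh
# verify_dir SOURCE_DIR BACKUP_DIR
# Hypothetical helper: compares each regular file in SOURCE_DIR with the
# file of the same name in BACKUP_DIR. Returns 1 if any pair fails.
verify_dir() {
    src=$1
    dst=$2
    failed=0
    for file in "$src"/*; do
        [ -f "$file" ] || continue      # skip subdirectories and empty globs
        name=${file##*/}                # strip the directory part
        # A missing or unreadable backup also counts as a failure here.
        if ! cmp -s "$file" "$dst/$name" 2>/dev/null; then
            echo "MISMATCH: $name"
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `verify_dir /etc/myapp /backup/myapp` (placeholder paths) prints one line per file that differs or is missing and returns a non-zero status if there were any.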
Performance Considerations
Comparing very large files byte by byte can be time-consuming. To optimize:
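- Use the -n option to limit comparison to a relevant subset of bytes.
- Use checksum tools like md5sum or sha256sum when a digest is already on record or the files live on different hosts. For a single local pair, hashing reads both files in full, whereas cmp stops at the first difference, so a checksum pre-check does not save time there.
- For extremely large files, consider sampling or specialized tools designed for performance.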
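Where a digest is recorded ahead of time, a later check needs to read only one file. The following sketch assumes GNU coreutils sha256sum; the image file and its contents are stand-ins created just for the example.

```shell
#!/bin/sh
# Assumes GNU coreutils sha256sum; file names are illustrative.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'disk image contents\n' > "$tmp/image.bin"

# Record the digest once, for example at backup time.
( cd "$tmp" && sha256sum image.bin > image.bin.sha256 )

# Later: re-hash the file and compare against the recorded digest.
# --status suppresses output, so only the exit status is used.
if ( cd "$tmp" && sha256sum -c --status image.bin.sha256 ); then
    echo "Digest matches"
else
    echo "Digest mismatch"
fi
```

A digest only says whether the content changed. To find where two copies diverge, cmp is still the tool to run afterwards.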
Security Considerations
- Ensure you have appropriate read permissions on both files; otherwise, cmp will return an error.
- Avoid comparing sensitive files in environments where output or logs might be exposed.
- Use silent mode (-s) in automated scripts to prevent potentially sensitive data from appearing in logs.
- Be aware that cmp compares raw bytes: copies stored in different encodings, or encrypted separately, will typically be reported as different even when the underlying content is the same.
Potential Problems and Troubleshooting
- No output despite expected differences: If cmp produces no message, verify the exit status. A 0 exit code means the files are identical; a 1 means they differ (with -s, differences are reported only through the exit status).
- Permission errors: If you get "Permission denied," check file permissions and run as an appropriate user.
- File not found errors: Ensure the specified file paths are correct.
- Confusing output for text files: cmp reports byte offsets and line numbers, which may be less intuitive than line-based differences. Use diff for text comparison.
- Binary files with embedded null bytes: cmp handles these correctly, but output may be confusing if interpreted as text.
- Text encoding differences: Different encodings (UTF-8 vs UTF-16) will cause cmp to report differences even if content appears similar.
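Line endings are a common instance of the same effect. The sketch below builds two tiny files that differ only in LF versus CRLF, then compares them again with carriage returns stripped. The file names are invented, and reading standard input through the "-" operand is standard cmp behavior.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'key=value\n'   > "$tmp/unix.txt"   # LF line ending
printf 'key=value\r\n' > "$tmp/dos.txt"    # CRLF line ending

# Byte 10 is octal 12 (LF) in unix.txt and octal 15 (CR) in dos.txt;
# cmp then reports end of file on the shorter unix.txt.
cmp -l "$tmp/unix.txt" "$tmp/dos.txt" || true

# Strip carriage returns and compare again; "-" makes cmp read stdin.
if tr -d '\r' < "$tmp/dos.txt" | cmp -s "$tmp/unix.txt" -; then
    echo "Same apart from line endings"
fi
```

Stripping every carriage return is a blunt normalization that suits plain text only; it would corrupt the comparison of binary data.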
Tips and Best Practices
- Use cmp -s in scripts for efficient, quiet checks relying on exit codes.
- Combine cmp with hash utilities like md5sum or sha256sum when the files live on different hosts or a recorded digest is available.
- When comparing files with headers or metadata, use -i to skip irrelevant sections.
- Use -l to get a detailed list of all differing bytes when debugging.
- Remember that cmp compares bytes literally; differences in text encoding or line endings will show as mismatches.
- Check exit codes carefully to distinguish between identical files, differences, and errors.
- Use cmp in automation scripts to verify backups, deployments, or configuration consistency.
Real-World Use Cases
- Backup Verification: Confirm that copied files are identical to originals after backups.
- Configuration Drift Detection: Detect unauthorized or accidental changes in server config files.
- Binary Patch Validation: Ensure patches modify only intended bytes in compiled binaries.
- Automated Testing: Compare program output files or logs against expected results.
- Virtual Machine and Container Image Integrity: Verify disk images or container layers for corruption or unexpected changes.