cmp Command: Tutorial & Examples
Compare two files byte by byte and locate the first difference
The cmp command is a fundamental Linux utility used to compare two files at the byte level to determine if they are identical or where they first differ. It
is particularly useful for verifying file integrity, comparing binary files, and automating checks in server and virtualization environments. This article
provides a detailed overview of cmp, including its functionality, parameters, practical examples, potential issues, and best practices for effective usage.
What cmp Does
The cmp command compares two files byte by byte and reports the position of the first difference found. Unlike line-oriented tools such as
diff, cmp focuses on binary-level differences, making it suitable for any file type, including executables, images, archives, or any
arbitrary data file.
If the files are identical, cmp produces no output and returns an exit status of 0. When differences exist, it outputs the byte offset and line number of the
first mismatch and returns a non-zero exit status. This behavior allows cmp to be easily integrated into automated workflows and scripts for verifying file
consistency without verbose output.
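The behavior described above can be tried on throwaway files. This is a minimal sketch; the temporary file names and variable names are illustrative assumptions, not part of cmp.

```shell
# Create three small files in a temporary directory (names are illustrative).
tmpdir=$(mktemp -d)
printf 'hello world\n' > "$tmpdir/a.txt"
printf 'hello world\n' > "$tmpdir/b.txt"
printf 'hello_world\n' > "$tmpdir/c.txt"

# Identical files: cmp prints nothing and exits with status 0.
same_status=0
cmp "$tmpdir/a.txt" "$tmpdir/b.txt" || same_status=$?

# Differing files: cmp prints one line describing the first mismatch
# (here the sixth byte, on line 1) and exits with status 1.
diff_status=0
diff_report=$(cmp "$tmpdir/a.txt" "$tmpdir/c.txt") || diff_status=$?

echo "identical files -> exit status $same_status"
echo "differing files -> exit status $diff_status"
echo "report: $diff_report"
rm -rf "$tmpdir"
```

Capturing the status with `|| status=$?` keeps the snippet usable in scripts that run under `set -e`.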
Why cmp Is Important
In server administration, virtualization, and data management, ensuring file integrity is critical. cmp provides a reliable way to detect file corruption,
unsynchronized backups, or unauthorized changes by comparing files precisely at the byte level.
Its ability to handle binary files directly distinguishes it from text-based comparison tools. Moreover, the silent mode (-s) makes it efficient for scripting
scenarios where only the comparison result matters, minimizing noise in logs or output.
How cmp Works
cmp opens both files and reads them sequentially, comparing each corresponding byte until it encounters a difference or reaches the end of both files. It
keeps track of:
- Byte offsets: the position of each byte, counted from 1 at the beginning of the files.
- Line numbers: counted from 1 and incremented on each newline character (\n) encountered.
When a difference is found, cmp reports the byte position and line number of the first mismatch. If one file is shorter but otherwise identical to the
beginning of the longer file, cmp instead reports that it reached end-of-file on the shorter file (a message such as "cmp: EOF on file1.bin") and still returns exit status 1.
The exit status codes are as follows:
- 0: Files are identical.
- 1: Files differ.
- 2: An error occurred (e.g., file not found or permission denied).
This precise reporting and exit status make cmp useful in conditional scripting.
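The three exit statuses can be told apart with a case statement. This is a minimal sketch; the function name compare_status is hypothetical and chosen only for illustration.

```shell
# compare_status is a hypothetical helper: it maps cmp's exit status to a word.
compare_status() {
    status=0
    cmp -s -- "$1" "$2" || status=$?
    case "$status" in
        0) echo "identical" ;;
        1) echo "different" ;;
        *) echo "error" ;;
    esac
}

tmpdir=$(mktemp -d)
printf 'abc\n' > "$tmpdir/short"
printf 'abc\ndef\n' > "$tmpdir/long"

compare_status "$tmpdir/short" "$tmpdir/short"     # prints: identical
compare_status "$tmpdir/short" "$tmpdir/long"      # prints: different (short is a prefix of long)
compare_status "$tmpdir/short" "$tmpdir/no-such"   # prints: error (second file does not exist)
rm -rf "$tmpdir"
```

Treating every status above 1 as an error keeps the helper correct even on systems where the error status is not exactly 2.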
Common Parameters of cmp
The most frequently used options are:
- -l: List all differing byte positions and display their values in octal for both files.
- -s: Silent mode; suppresses all output. Only the exit status indicates whether files match.
- -i SKIP: Skip the first SKIP bytes in both files before starting the comparison.
- -n LIMIT: Compare at most LIMIT bytes.
- --help: Display help and usage information.
- --version: Show version information of the cmp command.
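These options enhance flexibility, enabling partial comparisons, quiet checks, or detailed difference reports. POSIX specifies only -l and -s; -i and -n are extensions found in GNU cmp and some other implementations, so check the local man page before relying on them in portable scripts.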
Basic Usage Examples
Compare two binary files file1.bin and file2.bin:
cmp file1.bin file2.bin
If the files are identical, there is no output, and the exit status is:
echo $?
0
If they differ, output displays the first differing byte and line number:
file1.bin file2.bin differ: byte 15, line 1
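Check the exit status in a script to tell identical files, differences, and errors apart:
cmp file1.bin file2.bin
status=$?
if [ "$status" -eq 0 ]; then
    echo "Files are identical"
elif [ "$status" -eq 1 ]; then
    echo "Files differ"
else
    # Status 2 (or higher) means cmp could not compare the files at all.
    echo "cmp could not compare the files" >&2
fi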
List all byte differences with their octal values:
cmp -l file1.bin file2.bin
Sample output:
15 141 142
20 170 171
This indicates that at byte 15, the first file has octal value 141, and the second has 142, and similarly at byte 20.
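Octal values are not always convenient to read. As a small sketch, the two octal columns of cmp -l can be re-rendered in hexadecimal with printf; the file names and the hex_report variable are illustrative assumptions.

```shell
tmpdir=$(mktemp -d)
printf 'abcx' > "$tmpdir/one"
printf 'bbcy' > "$tmpdir/two"

# Each cmp -l line holds: byte number (decimal), octal value in the first
# file, octal value in the second file.
# A leading 0 makes printf parse each value as octal, so %02x prints it in hex.
hex_report=$(
    cmp -l "$tmpdir/one" "$tmpdir/two" |
    while read -r offset first second; do
        printf '%s 0x%02x 0x%02x\n' "$offset" "0$first" "0$second"
    done
)
echo "$hex_report"
rm -rf "$tmpdir"
```

For these two files the report lists byte 1 (0x61 versus 0x62) and byte 4 (0x78 versus 0x79).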
Example of handling a missing file or permission error:
cmp missingfile.bin file2.bin
cmp: missingfile.bin: No such file or directory
echo $?
2
Advanced Usage Examples
Skip the first 100 bytes in both files before comparing:
cmp -i 100 file1.bin file2.bin
Compare only the first 256 bytes:
cmp -n 256 file1.bin file2.bin
Use silent mode in scripts to check if files match without producing output:
if cmp -s file1.bin file2.bin; then
    echo "Files match"
else
    echo "Files differ"
fi
Compare two large log files ignoring initial metadata (e.g., timestamp headers):
cmp -i 1024 /var/log/app1.log /var/log/app2.log
This is useful when fixed-size headers differ but the main content should be identical. The skip count is in bytes, so both headers must have the same known length.
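The -i and -n options can be combined to compare only a window in the middle of two files. This is a minimal sketch that assumes a cmp supporting both options (GNU cmp does); the helper name same_region and the sample data are hypothetical.

```shell
# same_region is a hypothetical helper: it succeeds when LENGTH bytes,
# starting after the first OFFSET bytes, are identical in both files.
# Usage: same_region OFFSET LENGTH FILE1 FILE2
same_region() {
    cmp -s -i "$1" -n "$2" -- "$3" "$4"
}

tmpdir=$(mktemp -d)
printf 'HEADER-A:payload:trailer-1' > "$tmpdir/a"
printf 'HEADER-B:payload:trailer-2' > "$tmpdir/b"

# Skip the 9-byte header and compare only the 8 bytes of "payload:".
if same_region 9 8 "$tmpdir/a" "$tmpdir/b"; then
    echo "payload matches"
else
    echo "payload differs"
fi
rm -rf "$tmpdir"
```

Because the window excludes both the differing header and the differing trailer, this run reports a match.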
Example script snippet to automate backup verification:
BACKUP=/backup/config.bak
ORIGINAL=/etc/config
if cmp -s "$BACKUP" "$ORIGINAL"; then
    echo "Backup verified"
else
    echo "Backup differs from original!"
fi
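The same idea extends to a whole directory of files. The following is a sketch under stated assumptions: the helper name verify_tree and the directory layout are hypothetical, it only looks at regular files directly inside the directory (no recursion), and it reports missing or unreadable backups separately from real differences.

```shell
# verify_tree is a hypothetical helper: compare every regular file directly
# inside ORIGINAL_DIR with the file of the same name in BACKUP_DIR.
# It returns 0 only when every file has an identical backup.
verify_tree() {
    original_dir=$1
    backup_dir=$2
    failures=0
    for original in "$original_dir"/*; do
        [ -f "$original" ] || continue
        name=${original##*/}
        status=0
        cmp -s -- "$original" "$backup_dir/$name" || status=$?
        case "$status" in
            0) ;;
            1) echo "DIFFERS: $name"; failures=$((failures + 1)) ;;
            *) echo "ERROR: $name (missing or unreadable)"; failures=$((failures + 1)) ;;
        esac
    done
    [ "$failures" -eq 0 ]
}

# Demonstration on throwaway directories.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/etc" "$tmpdir/backup"
printf 'port=80\n' > "$tmpdir/etc/web.conf"
printf 'port=80\n' > "$tmpdir/backup/web.conf"
printf 'debug=0\n' > "$tmpdir/etc/app.conf"
printf 'debug=1\n' > "$tmpdir/backup/app.conf"

if verify_tree "$tmpdir/etc" "$tmpdir/backup"; then
    echo "Backup verified"
else
    echo "Backup needs attention"
fi
rm -rf "$tmpdir"
```

Separating status 1 from higher statuses avoids reporting a missing backup file as a mere content difference.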
Performance Considerations
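Comparing very large files byte by byte can be time-consuming, especially when the files are identical or differ only near the end, because cmp has to read both files up to the first difference. To optimize:
- Use the -n option to limit comparison to a relevant subset of bytes.
- Use checksum tools like md5sum or sha256sum when a checksum of one file is already stored or the files live on different machines. Hashing two local files reads both in full, so it is not faster than running cmp on them directly.
- For extremely large files, consider sampling or specialized tools designed for performance.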
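One cheap pre-check is a size comparison, since files of different sizes cannot be identical. The sketch below is illustrative only: the helper name quick_same is hypothetical, it assumes wc -c can report the size of a regular file cheaply on the system in use, and some cmp implementations may already apply a similar shortcut in silent mode.

```shell
# quick_same is a hypothetical helper: files with different sizes are
# reported as different without a byte-by-byte pass; otherwise cmp -s decides.
# Returns 0 (identical), 1 (different), or 2 (a file could not be read).
quick_same() {
    size_a=$(wc -c < "$1") || return 2
    size_b=$(wc -c < "$2") || return 2
    [ "$size_a" -eq "$size_b" ] || return 1
    cmp -s -- "$1" "$2"
}

tmpdir=$(mktemp -d)
printf 'abc' > "$tmpdir/a"
printf 'abcd' > "$tmpdir/b"

if quick_same "$tmpdir/a" "$tmpdir/b"; then
    echo "Files match"
else
    echo "Files differ"
fi
rm -rf "$tmpdir"
```

Here the sizes are 3 and 4 bytes, so the helper reports a difference before cmp is ever started.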
Security Considerations
- Ensure you have appropriate read permissions on both files; otherwise, cmp will return an error.
- Avoid comparing sensitive files in environments where output or logs might be exposed.
- Use silent mode (-s) in automated scripts to keep file names, offsets, and (with -l) byte values out of logs.
- Be aware that files with the same logical content can still differ byte for byte, for example when they use different encodings or were encrypted separately.
Potential Problems and Troubleshooting
- No output despite differences: If running cmp without options produces no message, verify the exit status. A 0 exit code means files are identical; a 1 means files differ.
- Permission errors: If you get "Permission denied," check file permissions and run as an appropriate user.
- File not found errors: Ensure the specified file paths are correct.
- Confusing output for text files: cmp reports byte offsets and line numbers, which may be less intuitive than line-based differences. Use diff for text comparison.
- Binary files with embedded null bytes: cmp handles these correctly, but output may be confusing if interpreted as text.
- Text encoding differences: Different encodings (UTF-8 vs UTF-16) will cause cmp to report differences even if content appears similar.
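Line endings are a common source of such byte-level mismatches. As a sketch, one file can be normalized on the fly and passed to cmp on standard input, which cmp reads when a file operand is "-". The file names are illustrative, and tr -d '\r' removes every carriage return, not only those that precede a newline.

```shell
tmpdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$tmpdir/dos.txt"
printf 'one\ntwo\n' > "$tmpdir/unix.txt"

# Raw comparison: the carriage returns make the files differ.
if cmp -s "$tmpdir/dos.txt" "$tmpdir/unix.txt"; then
    raw=same
else
    raw=different
fi

# Strip carriage returns from the first file and compare the result,
# read from standard input, with the second file.
if tr -d '\r' < "$tmpdir/dos.txt" | cmp -s - "$tmpdir/unix.txt"; then
    normalized=same
else
    normalized=different
fi

echo "raw: $raw, normalized: $normalized"
rm -rf "$tmpdir"
```

This prints "raw: different, normalized: same", which confirms that only the line endings differ.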
Tips and Best Practices
- Use cmp -s in scripts for efficient, quiet checks relying on exit codes.
- Combine cmp with hash utilities like md5sum or sha256sum when a stored checksum or a remote copy is involved; for two local files, cmp alone is usually the faster check.
- When comparing files with headers or metadata, use -i to skip irrelevant sections.
- Use -l to get a detailed list of all differing bytes when debugging.
- Remember that cmp compares bytes literally; differences in text encoding or line endings will show as mismatches.
- Check exit codes carefully to distinguish between identical files, differences, and errors.
- Use cmp in automation scripts to verify backups, deployments, or configuration consistency.
Real-World Use Cases
- Backup Verification: Confirm that copied files are identical to originals after backups.
- Configuration Drift Detection: Detect unauthorized or accidental changes in server config files.
- Binary Patch Validation: Ensure patches modify only intended bytes in compiled binaries.
- Automated Testing: Compare program output files or logs against expected results.
- Virtual Machine and Container Image Integrity: Verify disk images or container layers for corruption or unexpected changes.