Lesson 3: Binwalk Extraction
Reading Binwalk Output
binwalk scans every byte offset in a file against a database of known file signatures. Run it plain first to understand the structure before extracting:
binwalk firmware.bin
Sample output from a typical consumer router firmware:
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
0 0x0 TRX firmware header, little endian, image size: 7667712 bytes, ...
28 0x1C LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, ...
1163380 0x11C074 Squashfs filesystem, little endian, version 4.0, compression: lzma, ...
What each field means:
- DECIMAL / HEXADECIMAL: Byte offset from the start of the file. 0x11C074 means the SquashFS image starts at byte 1,163,380.
- DESCRIPTION: The detected format and key parameters. Note "little endian": this determines which unsquashfs options to use.
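The two offset columns are the same number in different bases, and later commands such as dd expect the decimal form. A minimal sketch of converting between them with the shell's printf; the helper names hex_to_dec and dec_to_hex are illustrative, not part of binwalk:

```shell
# Convert between the DECIMAL and HEXADECIMAL columns binwalk prints.
# hex_to_dec / dec_to_hex are hypothetical helper names.
hex_to_dec() { printf '%d\n' "$1"; }     # expects a 0x-prefixed value
dec_to_hex() { printf '0x%X\n' "$1"; }

hex_to_dec 0x11C074   # prints 1163380
dec_to_hex 1163380    # prints 0x11C074
hex_to_dec 0x1C       # prints 28
```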
If you see overlapping offsets or the same format detected twice at nearby offsets, you likely have a nested compression situation. Note each offset before extracting.
Recursive Extraction
binwalk -e firmware.bin
This creates a directory named _firmware.bin.extracted/ in the current working directory. Inside:
_firmware.bin.extracted/
├── 1C                 # Raw LZMA data extracted from offset 0x1C
├── 1C.7z              # 7zip archive of the LZMA stream (binwalk artifact)
├── 11C074.squashfs    # Raw SquashFS image extracted from offset 0x11C074
└── squashfs-root/     # Automatically unsquashed filesystem
    ├── bin/
    ├── etc/
    ├── lib/
    ├── usr/
    └── www/
The squashfs-root/ directory is a complete embedded Linux root filesystem. This is what you analyze.
Recursive mode (-M or --matryoshka): Tells binwalk to recursively scan and extract any archives found inside extracted files. Useful for nested compression, but can create very deep directory trees. Do not confuse it with -r (--rm), which deletes carved files after extraction:
binwalk -eM firmware.bin # -M = Matryoshka / recursive
Use -M when the plain -e output still contains compressed blobs rather than a filesystem.
What Can Go Wrong
Wrong Endianness Detection
Binwalk may extract a SquashFS with the wrong endianness, causing unsquashfs to fail. If you see:
FATAL ERROR: Magic mismatch: read 68737173, expected 73717368 (little endian)
The image is big-endian. Extract manually:
# Get the raw SquashFS blob first
dd if=firmware.bin bs=1 skip=1163380 of=rootfs.squashfs
# Try both endiannesses
unsquashfs -d output_le rootfs.squashfs
# -be is a sasquatch option; stock unsquashfs has no flag to force byte order
sasquatch -be -d output_be rootfs.squashfs
If you are unsure of the exact offset, use:
dd if=firmware.bin bs=1 skip=<DECIMAL_OFFSET> | file -
This pipes from the offset and lets file identify the format.
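The byte order can also be read straight from the four magic bytes at the reported offset: a little-endian SquashFS image starts with the ASCII bytes hsqs (68 73 71 73), a big-endian one with sqsh (73 71 73 68). The following is a minimal sketch under that assumption; squashfs_byte_order is a hypothetical helper, and the demo runs on a synthetic file rather than real firmware:

```shell
# Report SquashFS byte order from the 4 magic bytes at a 0-based offset.
# squashfs_byte_order is a hypothetical helper name.
squashfs_byte_order() {
    file=$1 offset=$2
    # tail -c +N is 1-based, so add 1 to the 0-based binwalk offset
    magic=$(tail -c +"$((offset + 1))" "$file" | head -c 4 | od -An -tx1 | tr -d ' \n')
    case $magic in
        68737173) echo "little endian" ;;   # "hsqs"
        73717368) echo "big endian" ;;      # "sqsh"
        *)        echo "no SquashFS magic at offset $offset" ;;
    esac
}

# Demo: 16 bytes of padding followed by a big-endian magic
demo=$(mktemp)
{ head -c 16 /dev/zero; printf 'sqsh'; } > "$demo"
squashfs_byte_order "$demo" 16   # prints: big endian
squashfs_byte_order "$demo" 0    # prints: no SquashFS magic at offset 0
rm -f "$demo"
```

On real firmware the call would look like squashfs_byte_order firmware.bin 1163380, using the decimal offset from the binwalk scan.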
Encrypted or Proprietary Compression Sections
Binwalk reports something at an offset but extraction produces garbage or nothing:
# Manually extract the blob and test
dd if=firmware.bin bs=1 skip=<offset> count=<size> of=blob.bin
file blob.bin
binwalk -E blob.bin # check entropy
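If entropy is flat at ~1.0, that section is most likely encrypted. Compressed data also scores high, but binwalk would normally recognize and unpack it. Skip the section and continue with other detected offsets.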
Proprietary SquashFS Modifications
Some vendors patch SquashFS with custom compression or modified magic bytes. unsquashfs from the distro package manager may refuse the image. In that case:
# sasquatch is a patched unsquashfs that handles vendor variants
# Install from: https://github.com/devttys0/sasquatch
sasquatch -d output rootfs.squashfs
firmware-mod-kit and FACT (Firmware Analysis and Comparison Tool) also bundle patched extraction tools for vendor variants.
JFFS2 Filesystems
If binwalk reports JFFS2 filesystem instead of SquashFS:
# jefferson is the standard JFFS2 extractor
# Install: pip install jefferson
jefferson rootfs.jffs2 -d output/
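# Alternative: mount via kernel (requires root)
# total_size and erase_size are in KiB; erase_size should match the image's erase block size
modprobe mtdram total_size=65536 erase_size=256
modprobe mtdblock
dd if=rootfs.jffs2 of=/dev/mtdblock0
mkdir -p /mnt/jffs2
mount -t jffs2 /dev/mtdblock0 /mnt/jffs2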
UBI / UBIFS Filesystems
# ubireader_extract_images from the ubi_reader package
ubireader_extract_images -o output/ ubi_image.bin
# Or for UBIFS directly
ubireader_extract_files -o output/ ubifs_image.bin
Entropy Analysis
Run entropy analysis to understand the firmware layout before extraction. It is faster than blind trial and error:
binwalk -E firmware.bin
This plots an entropy graph; add -J (--save) to write it to a PNG file. Interpret sections:
| Entropy pattern | Meaning |
|---|---|
| ~0.0-0.4, relatively flat | Code or structured data (U-Boot, device tree) |
| ~0.7-1.0, varying | Compressed data (kernel, SquashFS), expected |
| ~1.0, completely flat | Encrypted data, no extraction possible without key |
| Alternating low/high blocks | Multiple sections of different types: good, you can target each |
A mixed entropy graph tells you extraction will be productive. A uniformly high flat graph tells you to stop and find the key first.
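When plotting is unavailable, compressibility gives a rough stand-in for the same signal: data that gzip cannot shrink is already compressed or encrypted. This is a different technique from binwalk's entropy calculation, shown only as a sketch; compress_ratio is a hypothetical helper and the two demo files are synthetic:

```shell
# compress_ratio is a hypothetical helper: gzip'd size as a percentage
# of the original size. Near 100 means "already compressed or encrypted".
compress_ratio() {
    orig=$(( $(wc -c < "$1") ))
    [ "$orig" -gt 0 ] || { echo "empty file" >&2; return 1; }
    packed=$(( $(gzip -c "$1" | wc -c) ))
    echo $(( packed * 100 / orig ))
}

low=$(mktemp); high=$(mktemp)
head -c 65536 /dev/zero    > "$low"    # stands in for padding
head -c 65536 /dev/urandom > "$high"   # stands in for encrypted data
compress_ratio "$low"     # small number: highly compressible
compress_ratio "$high"    # about 100: gzip cannot shrink it
rm -f "$low" "$high"
```

To apply it to one region of a firmware image, carve the region with dd first and pass the resulting blob to compress_ratio.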
When Binwalk Finds Nothing
If binwalk firmware.bin returns no results for a file of non-trivial size:
# Check if the file is actually what you think it is
wc -c firmware.bin # Confirm non-zero size
xxd firmware.bin | head # Look at raw hex: is it ASCII? Is it all zeros?
# Try different binwalk signature databases
binwalk --magic=/usr/share/magic firmware.bin
# Scan for strings as a fallback
strings -n 8 firmware.bin | head -40
# If you see readable strings, it is not encrypted; binwalk just missed the format
# Common fallback: look for ELF headers, gzip magic (1f 8b), or zip local file headers
# -g 1 prints one byte per group; the default 2-byte grouping would hide "1f 8b"
xxd -g 1 firmware.bin | grep "1f 8b" # gzip
xxd -g 1 firmware.bin | grep "50 4b" # zip/jar
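Grepping a hex dump shows that a signature exists but not its exact byte offset, and it misses patterns that straddle two dump lines. A minimal sketch that searches the raw bytes instead and prints decimal offsets ready for dd; it assumes GNU grep built with PCRE support (-P), and find_magic is a hypothetical helper:

```shell
# find_magic PATTERN FILE: print the decimal byte offset of every match.
# Hypothetical helper; assumes GNU grep with -P (PCRE) available.
find_magic() {
    LC_ALL=C grep -obUaP "$1" "$2" | LC_ALL=C cut -d: -f1
}

# Demo on a synthetic file: gzip magic at offset 4, zip header at offset 11
demo=$(mktemp)
printf 'AAAA\037\213\010BBBBPK\003\004' > "$demo"
find_magic '\x1f\x8b\x08' "$demo"   # prints 4
find_magic 'PK\x03\x04' "$demo"     # prints 11
rm -f "$demo"
```

Including the third gzip byte (08, the deflate method) and the full four-byte zip header cuts down on false positives compared with two-byte patterns.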
Post-Extraction Sanity Check
After extraction, confirm you have a real filesystem:
ls _firmware.bin.extracted/squashfs-root/
# Expected: bin dev etc lib proc sbin sys tmp usr var www
# Confirm it is an ARM or MIPS root, not an x86 one
file _firmware.bin.extracted/squashfs-root/bin/busybox
# Expected: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV)...
# or: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV)...
If bin/busybox is missing, you may have extracted a config partition rather than the rootfs. Go back to the binwalk output and target the correct offset.
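The checks above can be bundled into one small function when several candidate directories need triage. This is a sketch only: looks_like_rootfs is a hypothetical helper, and the required directory list (bin, etc, lib) is an assumption that some firmware layouts will not satisfy:

```shell
# looks_like_rootfs DIR: succeed if DIR resembles an embedded Linux rootfs.
# Hypothetical helper; the directory list is an assumption.
looks_like_rootfs() {
    root=$1
    for d in bin etc lib; do
        [ -d "$root/$d" ] || { echo "missing $d/"; return 1; }
    done
    # busybox may be a regular file or a symlink whose target only
    # resolves on the device, so accept a dangling link as well
    if [ ! -e "$root/bin/busybox" ] && [ ! -L "$root/bin/busybox" ]; then
        echo "missing bin/busybox"
        return 1
    fi
    echo "ok"
}

# Demo on a synthetic directory tree
demo=$(mktemp -d)
mkdir -p "$demo/bin" "$demo/etc" "$demo/lib"
looks_like_rootfs "$demo"      # prints: missing bin/busybox
: > "$demo/bin/busybox"
looks_like_rootfs "$demo"      # prints: ok
rm -rf "$demo"
```

On a real extraction the call would be looks_like_rootfs _firmware.bin.extracted/squashfs-root, followed by the file check on bin/busybox to confirm the architecture.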