Lesson 1: Firmware Formats
What Firmware Actually Is
Firmware is the software stored in a device's non-volatile flash memory. On an embedded Linux device (a home router, an IP camera, a smart thermostat), it usually ships as a single binary image that is written to flash and booted from there. That image typically contains several distinct regions concatenated together:
[bootloader] [kernel] [root filesystem] [config partition]
The bootloader (U-Boot is the most common on embedded Linux) initializes the hardware and hands off to the kernel. The kernel mounts the root filesystem. The config partition holds user settings that persist across reboots. All of this can live in the same .bin file you download from the vendor's support page, although many update images carry only the kernel and root filesystem and leave the bootloader and config partition on the device untouched.
Understanding this layout is the first step: you are not looking at a single executable, you are looking at a filesystem image — possibly several — packed into one file.
Common Formats and Containers
Raw Binary
No container, no header. The regions are concatenated and the bootloader knows the offsets by convention or a device-specific partition table. Common on older devices. binwalk will detect the inner formats even without a container header.
TRX (Broadcom)
Used by Broadcom-based routers (many consumer-grade devices). Starts with the ASCII magic HDR0. Contains a kernel and a rootfs partition. binwalk recognizes it natively.
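To make the TRX layout concrete, here is a minimal Python sketch that unpacks the fixed-size header. It assumes the commonly documented version 1 layout (28 bytes, little-endian: magic, total length, CRC32, flags, version, three partition offsets). Vendors do vary this, so treat the field layout as an assumption to verify against your image. The names `TrxHeader` and `parse_trx_header` are illustrative and not part of any library.

```python
import struct
from typing import NamedTuple, Tuple

TRX_MAGIC = b"HDR0"
# Assumed v1 layout: magic, length, crc32, flags, version, 3 offsets (28 bytes).
TRX_HEADER = struct.Struct("<4sIIHH3I")


class TrxHeader(NamedTuple):
    length: int               # total image length, header included
    crc32: int                # checksum stored in the header (not verified here)
    flags: int
    version: int
    offsets: Tuple[int, ...]  # partition offsets from the start of the header


def parse_trx_header(data: bytes) -> TrxHeader:
    """Unpack a TRX v1 header from the start of `data`."""
    if len(data) < TRX_HEADER.size:
        raise ValueError("input is too short to hold a TRX header")
    magic, length, crc, flags, version, *offsets = TRX_HEADER.unpack_from(data)
    if magic != TRX_MAGIC:
        raise ValueError("missing HDR0 magic")
    return TrxHeader(length, crc, flags, version, tuple(offsets))
```

An offset of 0 conventionally marks an unused partition slot. With this layout the first partition usually begins at offset 28 (0x1C), immediately after the header.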
UBI / UBIFS
Used on devices with NAND flash, which requires wear leveling. UBI (Unsorted Block Images) is a layer on top of raw NAND. UBIFS is the filesystem that sits on UBI. Extraction is more complex: you need the ubi_reader package, which provides tools such as ubireader_extract_images and ubireader_extract_files. Common on newer routers and set-top boxes.
SquashFS
A compressed, read-only filesystem. The most common rootfs format in embedded Linux firmware. Identified by the magic bytes hsqs (little-endian) or sqsh (big-endian). Versions 3.x and 4.x are both common. Nearly every consumer router uses SquashFS for its root filesystem.
JFFS2 (Journaling Flash File System 2)
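Older read-write filesystem, originally designed for NOR flash. The magic value is 0x1985, which appears on flash as the bytes 85 19 (little-endian) or 19 85 (big-endian). Still present on older devices and sometimes used for the config partition on newer ones.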
CRAMFS
Compressed ROM filesystem. Predecessor to SquashFS, largely replaced but still present on very old devices. Magic 0x28cd3d45.
Proprietary Containers
Some vendors wrap everything in a vendor-specific container before the standard formats begin. Common examples: ASUS uses a custom header before TRX; TP-Link uses a header with an MD5 checksum. binwalk often handles these, but you may need to manually skip the vendor header.
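Manually skipping a vendor header often comes down to finding where the first standard magic begins and discarding everything before it. The Python sketch below shows that idea. The function name `strip_vendor_header` and the default `inner_magic` are illustrative assumptions. A four-byte match can also occur by chance inside unrelated data, so check the result with binwalk before relying on it.

```python
def strip_vendor_header(data: bytes, inner_magic: bytes = b"HDR0") -> bytes:
    """Return `data` starting at the first occurrence of `inner_magic`."""
    offset = data.find(inner_magic)
    if offset < 0:
        raise ValueError("inner magic not found; the image may be encrypted")
    return data[offset:]
```

Passing `inner_magic=b"hsqs"` instead would cut straight to a little-endian SquashFS image, at the cost of discarding the kernel that precedes it.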
Spotting the Format
Never trust the file extension. Firmware is often named firmware_v2.3.1.bin regardless of what it actually contains. Three commands help identify what you have:
# Ask the OS
file firmware.bin
# Scan for known magic bytes and compression signatures
binwalk firmware.bin
# Visualize entropy distribution
binwalk -E firmware.bin
file matches the start of the file against its magic database (the location varies by distribution, for example /usr/share/misc/magic). It identifies the outermost format only.
binwalk scans the entire file for known signatures at every offset, not just the start. A typical router firmware output looks like:
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
0 0x0 TRX firmware header, little endian, ...
28 0x1C LZMA compressed data, ...
1163380 0x11C074 Squashfs filesystem, little endian, version 4.0, ...
This tells you exactly where each component starts.
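The core idea behind this kind of scan can be sketched in a few lines of Python: search the whole file for each known magic and report every offset where one appears. This is a simplified illustration, not how binwalk is implemented. Real tools also validate the header fields that follow each magic to weed out false positives. The `SIGNATURES` table and the `scan_signatures` name are assumptions made for this example, and the two-byte JFFS2 magic is left out because it matches by chance too often.

```python
# Four-byte magics mentioned in this lesson, as they appear on disk.
SIGNATURES = {
    b"HDR0": "TRX header",
    b"hsqs": "SquashFS (hsqs)",
    b"sqsh": "SquashFS (sqsh)",
    b"\x45\x3d\xcd\x28": "CramFS (0x28cd3d45 stored little-endian)",
    b"\x28\xcd\x3d\x45": "CramFS (0x28cd3d45 stored big-endian)",
}


def scan_signatures(data: bytes):
    """Return a sorted list of (offset, label) for every magic found in `data`."""
    hits = []
    for magic, label in SIGNATURES.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, label))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)


def carve(data: bytes, offset: int, length=None) -> bytes:
    """Cut out a region starting at `offset`; to the end of the file if no length."""
    end = len(data) if length is None else offset + length
    return data[offset:end]
```

Once an offset is known, `carve(data, offset)` yields a standalone image that can be written to disk and handed to the matching extraction tool.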
Entropy Visualization
binwalk -E firmware.bin generates an entropy graph. Entropy measures randomness on a 0.0–1.0 scale:
- Low entropy (0.0–0.4): Plaintext, code, structured data. Normal.
- High entropy with variation (0.7–1.0, not flat): Compressed data. Normal — compressed data looks random.
- High entropy, completely flat at ~1.0: Usually encrypted data. The distribution of byte values is uniform, which is what encryption produces. Strongly compressed data can look similar, so confirm by checking for magic bytes.
If the entire firmware shows flat high entropy with no recognizable magic bytes, the firmware is most likely encrypted. The extraction workflow does not apply in that case: you need the key first.
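The numbers behind such a graph are straightforward to reproduce. The Python sketch below computes Shannon entropy per block and divides by 8 (the maximum for byte data) to land on the same 0.0 to 1.0 scale. The 1024-byte block size and the function names are arbitrary choices for illustration, and binwalk's own block size and plotting may differ.

```python
import math
from collections import Counter


def normalized_entropy(block: bytes) -> float:
    """Shannon entropy of `block`, scaled so that 1.0 means 8 bits per byte."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    bits = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return bits / 8.0


def entropy_profile(data: bytes, block_size: int = 1024):
    """Return (offset, entropy) pairs, one per block of `block_size` bytes."""
    return [
        (off, normalized_entropy(data[off:off + block_size]))
        for off in range(0, len(data), block_size)
    ]
```

A profile whose values stay near 1.0 in every block with no dips corresponds to the flat curve described above. Dips toward lower values usually mark headers, padding, or uncompressed code.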
Endianness
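MIPS processors (common in older Broadcom, Atheros, and MediaTek routers) can run in either byte order: Atheros devices are typically big-endian, while Broadcom and MediaTek devices are typically little-endian (mipsel). ARM (common in newer devices) is usually little-endian. x86 embedded systems are little-endian.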
This matters because:
- SquashFS magic bytes differ: hsqs is little-endian, sqsh is big-endian. binwalk detects both.
- strings output may contain double-byte garbage if you run it against big-endian UTF-16 strings; GNU strings can be told the encoding with -e b.
- When manually inspecting headers with a hex editor, multi-byte values appear byte-swapped relative to the other byte order.
- Some unsquashfs versions fail on big-endian images; you may need a different release or a patched build that supports that byte order.
You can determine endianness early by checking the kernel header or ELF headers of any binaries inside the filesystem after extraction.
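For the ELF route, the byte order is recorded in the sixth byte of the file (the EI_DATA field of the ELF identification header): 1 means little-endian and 2 means big-endian. A minimal Python sketch, with `elf_endianness` as an illustrative name:

```python
def elf_endianness(data: bytes) -> str:
    """Return 'little' or 'big' based on the EI_DATA byte of an ELF header."""
    if len(data) < 6 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_data = data[5]  # e_ident[EI_DATA]
    if ei_data == 1:
        return "little"
    if ei_data == 2:
        return "big"
    raise ValueError(f"unexpected EI_DATA value: {ei_data}")
```

In practice you would read the first few bytes of any binary from the extracted root filesystem (busybox is a typical candidate, though the exact path depends on the image) and pass them to this function.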
Encrypted vs. Unencrypted
The simplest test: run binwalk firmware.bin. If you see recognizable magic bytes (kernel signatures, SquashFS magic, compression headers), the firmware is not encrypted. If you see nothing but the file is multi-megabyte, it is likely encrypted.
Secondary confirmation: binwalk -E firmware.bin. A flat entropy curve from byte 0 to the end of the file, hovering near 1.0, is a strong indicator of encryption.
Encrypted firmware requires:
1. Finding the decryption key (sometimes leaked in older firmware versions)
2. Identifying the cipher (often AES-CBC or AES-ECB; check if vendor documentation mentions it)
3. Decrypting before any analysis is possible
Some vendors introduced encryption in newer firmware versions while older versions were plaintext. Always check if an older firmware version is available — it often contains the same code without encryption.