L1 — MQTT Protocol Internals
MQTT (Message Queuing Telemetry Transport) was designed in 1999 by Andy Stanford-Clark of IBM and Arlen Nipper of Arcom for monitoring oil pipelines over satellite links. The constraints were severe: low bandwidth, high latency, unreliable links, devices with kilobytes of RAM. The protocol is still solving that problem today, except now those constrained devices are controlling your home, your factory floor, and your building's HVAC system, and they're connected to the internet.
The Pub/Sub Model
MQTT is a publish/subscribe protocol. This is not the same as HTTP's request/response. There are three roles:
- Publisher — a device that sends data to the broker on a topic
- Broker — the central server that receives all messages and routes them
- Subscriber — a client that tells the broker which topics it wants to receive
The key asymmetry: publishers don't know who is listening, and subscribers don't know who is sending. A temperature sensor publishes 22.4 to home/bedroom/temperature and has no idea whether zero or one hundred clients are subscribed. A subscriber gets the value with no information about which device produced it unless the payload includes that data.
This decoupling is a design feature for scalability. For security, it means a malicious subscriber is completely invisible to the legitimate publishers — you can read everything on a broker without any device knowing you're there.
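To make the decoupling concrete, here is a minimal in-memory sketch of the routing idea. `TinyBroker` is a hypothetical toy, not any real broker's API: it handles exact-match topics only, with no network, QoS, or authentication. The delivery count returned by `publish` exists purely for the demonstration; a real publisher never learns how many subscribers received its message.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[str, bytes], None]


class TinyBroker:
    """Toy in-memory broker: exact-match topics only, no network, no QoS."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The broker records the subscription; publishers are never told.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to every handler on topic and return the count.

        The count is returned only for illustration; MQTT does not report
        it to the publisher.
        """
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = TinyBroker()
dashboard: list = []
eavesdropper: list = []

# Two subscribers on the same topic; the sensor cannot tell them apart
# or even learn that they exist.
broker.subscribe("home/bedroom/temperature", lambda t, p: dashboard.append((t, p)))
broker.subscribe("home/bedroom/temperature", lambda t, p: eavesdropper.append((t, p)))

broker.publish("home/bedroom/temperature", b"22.4")
print(dashboard)
print(eavesdropper)
```

Both lists end up with the same message, and nothing in the payload or the delivery tells either side who is on the other end.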
The Broker
The broker is the router. Every message passes through it. Common production brokers:
| Broker | Language | Typical Deployment |
|---|---|---|
| Mosquitto | C | Embedded systems, Raspberry Pi, small deployments |
| HiveMQ | Java | Enterprise, clustered, commercial |
| EMQX | Erlang | High-throughput, telecom-grade |
| AWS IoT Core | Managed | Cloud-connected device fleets |
| VerneMQ | Erlang | Open-source, scalable |
Mosquitto is by far the most common target you'll encounter on the internet. It ships as the default broker in countless IoT tutorials and gets deployed to production unchanged.
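Before version 2.0, Mosquitto's default configuration, the one shipped by most Linux package managers, allowed anonymous connections on all interfaces with no authentication. This was intentional for development convenience and was documented behavior. Since 2.0, a broker started without a configured listener binds only to the loopback interface, and anonymous access has to be enabled explicitly with allow_anonymous true. Old installations, and new ones configured by copying that one line from a tutorial, are the reason thousands of production brokers are wide open.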
Topics: The Namespace
Topics are UTF-8 strings using / as a hierarchy separator. There is no schema enforcement — any client can publish to any topic string. You'll see patterns like:
home/bedroom/temperature
home/bedroom/humidity
home/alarm/armed
factory/line1/machine3/status
factory/line1/machine3/rpm
building/floor3/hvac/setpoint
device/00:1A:2B:3C:4D:5E/telemetry
The hierarchy is purely organizational. The broker doesn't care — it's just string matching.
Wildcards
Two wildcard characters exist and are subscriber-only (you cannot publish to a wildcard topic):
+ — single-level wildcard
Matches exactly one level in the hierarchy:
home/+/temperature
Matches: home/bedroom/temperature, home/kitchen/temperature
Does NOT match: home/bedroom/sensor/temperature (two levels deep)
# — multi-level wildcard
Matches everything from that point down, including the parent level itself (home/# also matches home). It must be the last character of the filter:
home/#
Matches: home/bedroom/temperature, home/alarm/armed, home/bedroom/sensor/temperature
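The attack implication: subscribing to # alone subscribes to every application topic on the broker. One command, near-total visibility. The one exception is topics that begin with $, such as the broker's own $SYS statistics tree: a filter that starts with a wildcard does not match them, so they need a separate subscription ($SYS/#).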
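The matching rules above fit in a few lines. The function below is a sketch of how a broker might compare a subscription filter against a concrete topic; `topic_matches` is a hypothetical helper name, and it assumes the filter is well-formed apart from the position of #.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics beginning with "$" (for example $SYS/...) are never matched
    # by a filter whose first level is a wildcard.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level. It matches
            # the parent level and everything beneath it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter has more levels than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # A literal level must match exactly; "+" accepts any one level.
            return False

    # No "#" was seen, so both sides must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


for candidate in (
    "home/bedroom/temperature",
    "home/bedroom/sensor/temperature",
    "$SYS/broker/uptime",
):
    print(candidate, topic_matches("home/+/temperature", candidate), topic_matches("#", candidate))
```

Matching is purely level-by-level string comparison, which is why the broker needs no knowledge of what the hierarchy means.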
QoS Levels
MQTT defines three quality-of-service levels for message delivery:
| Level | Name | Guarantee | Mechanism |
|---|---|---|---|
| QoS 0 | Fire and forget | At most once | No acknowledgment |
| QoS 1 | At least once | At least once | PUBACK required, retransmit if lost |
| QoS 2 | Exactly once | Exactly once | Four-packet handshake (PUBLISH, PUBREC, PUBREL, PUBCOMP) |
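Why attackers favor QoS 0: no acknowledgment means the broker keeps no per-message delivery state for you. There are no packet identifiers, no in-flight queue, and nothing to retransmit. Combined with a clean session, nothing about your subscription survives the disconnect. If you're subscribing to a broker to monitor its traffic, QoS 0 subscriptions leave the smallest footprint, but not an invisible one: the broker can still log the connection and the SUBSCRIBE itself if its logging is configured to do so.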
QoS 2 is rarely used in IoT devices — the handshake overhead is expensive for constrained hardware and the exactly-once guarantee only matters for financial or actuator-critical messages. Most sensor telemetry uses QoS 0 or 1.
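On the wire, the QoS level is just two bits in the first byte of a PUBLISH packet, next to the retain flag. The sketch below builds a PUBLISH packet by hand, assuming the MQTT 3.1.1 layout (MQTT 5 adds a properties field that is not handled here). `build_publish` and `encode_remaining_length` are illustrative helper names, not part of any library.

```python
import struct


def encode_remaining_length(length: int) -> bytes:
    """Encode the MQTT variable-length 'remaining length' field."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("remaining length out of range")
    out = bytearray()
    while True:
        byte = length % 128
        length //= 128
        if length > 0:
            byte |= 0x80  # continuation bit: more length bytes follow
        out.append(byte)
        if length == 0:
            return bytes(out)


def build_publish(
    topic: str,
    payload: bytes,
    qos: int = 0,
    retain: bool = False,
    packet_id: int = 1,
    dup: bool = False,
) -> bytes:
    """Build an MQTT 3.1.1 PUBLISH packet (sketch, no input validation beyond QoS)."""
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")

    # Fixed header byte: packet type 3 in the high nibble, then DUP,
    # two QoS bits, and the RETAIN bit.
    first_byte = (3 << 4) | (int(dup) << 3) | (qos << 1) | int(retain)

    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        # Only QoS 1 and 2 carry a packet identifier, because only they
        # have acknowledgments that need to refer back to the message.
        body += struct.pack("!H", packet_id)
    body += payload

    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


print(build_publish("home/bedroom/temperature", b"22.4").hex())
print(build_publish("home/bedroom/temperature", b"22.4", qos=1, packet_id=10).hex())
```

The QoS 0 packet is two bytes shorter than the QoS 1 packet because it has no packet identifier, which is the whole of the "no acknowledgment" mechanism at the byte level.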
Retained Messages
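A retained message is an ordinary PUBLISH with the retain flag set (mosquitto_pub -r). When the flag is set, the broker stores that message as the last known value for its topic and immediately delivers it to any new subscriber, even if the original publisher is long offline. Each topic holds at most one retained message; a newer retained message replaces the older one.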
Publisher (offline)        Broker                      New Subscriber
                      [stores last msg]
                             <------ subscribe to topic ------
                             ------- retained msg ----------->
This is the device equivalent of a sticky note. Legitimate use: a device publishes its current state as retained so any dashboard connecting later gets the current state without waiting.
The attack implication: retained messages accumulate on poorly managed brokers. Topics that haven't had an active publisher in months still serve cached state to new subscribers, including cached credentials, tokens, and configuration payloads that were published once and forgotten. You can read the last known state of every retained topic on a broker just by subscribing.
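The broker-side behavior can be sketched as a dictionary keyed by topic. `RetainedStore` below is a hypothetical toy that models only the retained-message rules and exact-match subscriptions. It also shows one rule not mentioned above: in MQTT, publishing a retained message with an empty payload clears the stored value for that topic. The token value is made up for the example.

```python
from typing import Dict, List, Tuple


class RetainedStore:
    """Toy model of a broker's retained-message table (exact topics only)."""

    def __init__(self) -> None:
        self._retained: Dict[str, bytes] = {}

    def publish(self, topic: str, payload: bytes, retain: bool = False) -> None:
        if not retain:
            # Non-retained messages are routed to current subscribers and
            # forgotten; they do not touch the stored value.
            return
        if payload == b"":
            # A retained message with an empty payload clears the topic.
            self._retained.pop(topic, None)
        else:
            # Only the most recent retained message per topic is kept.
            self._retained[topic] = payload

    def subscribe(self, topic: str) -> List[Tuple[str, bytes]]:
        """Return what a brand-new subscriber receives immediately."""
        if topic in self._retained:
            return [(topic, self._retained[topic])]
        return []


store = RetainedStore()

# Hypothetical device publishes its config once, retained, then goes offline.
store.publish("device/config", b"api_token=EXAMPLE-ONLY", retain=True)

# Months later, a new subscriber still receives it.
print(store.subscribe("device/config"))

# Cleaning up requires an explicit empty retained publish.
store.publish("device/config", b"", retain=True)
print(store.subscribe("device/config"))
```

Nothing expires on its own in this model, which mirrors why forgotten retained payloads linger until someone deliberately clears them.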
Default Ports
| Port | Protocol | Notes |
|---|---|---|
| 1883 | MQTT plaintext | Default, no encryption |
| 8883 | MQTT over TLS | Encrypted transport |
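| 9001 | MQTT over WebSocket | Convention from Mosquitto examples, not a registered port; browser clients, often plaintext |
| 9002 | MQTT over WebSocket TLS | Rare; WebSocket ports vary by broker and deployment |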
Port 1883 is your primary target. Every byte is plaintext: topic names, payloads, and — critically — the MQTT CONNECT packet, which carries the username and password in cleartext.
The Security Model
MQTT's security is entirely optional and layered on top:
- Authentication: Username/password in the CONNECT packet. Optional in the protocol, and not required by default in Mosquitto before 2.0.
- Authorization (ACL): Per-user topic access lists. Not configured by default.
- Transport encryption: TLS on port 8883. Not enabled by default.
- Payload encryption: Application-layer, completely up to the developer. Rarely implemented.
The protocol itself has no mandatory security. A device connecting on port 1883 with no credentials is spec-compliant. The broker accepting it is also spec-compliant. This is not a bug — it was designed for isolated networks where the transport itself provided security (a serial link, a private LAN). The security problem comes from exposing these brokers directly to the internet, which happens constantly.
MQTT Packet Structure (What Matters for Analysis)
The CONNECT packet is the handshake. It contains:
- Protocol name and version
- Client ID (often reveals device type: ESP32_sensor_01, shelly-plug-abc123)
- Username (if auth configured)
- Password (if auth configured)
- Clean session flag
- Keep-alive interval
In Wireshark, the display filter mqtt.msgtype == 1 isolates CONNECT packets. In any unencrypted capture, the username and password can be read straight out of them.
PUBLISH packets contain the topic string and payload. SUBSCRIBE packets contain the topic filter the client registered. All of this is visible in plaintext on port 1883.
Key Takeaways
- Pub/sub means subscribers are invisible to publishers — you can monitor silently
- The # wildcard gives you every application topic on a broker in one command
- Retained messages leak the last known state even when devices are offline
- Port 1883 is plaintext: credentials, topics, and payloads are all readable
- Mosquitto before 2.0 allows anonymous connections by default, and newer versions are often configured to allow them; this is intentional and common
Next lesson: finding these brokers at scale.