Fundamentals of Network Scanning Theory and Nmap Architecture

TCP/IP Stack Behavior: The Foundation of Every Scan

Network scanning is, at its core, an exercise in protocol manipulation. To understand why Nmap behaves as it does, you must first internalize how the TCP/IP stack processes packets at the wire level.

The TCP three-way handshake (SYN → SYN-ACK → ACK) exists to establish reliable, stateful connections. Each packet carries six critical flags: SYN (synchronize), ACK (acknowledge), RST (reset), FIN (finish), PSH (push), and URG (urgent). The state machine governing these transitions is where scanning opportunities emerge.
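As a small illustration of the flags named above, the following sketch models the six classic control bits as a bitmask. The bit values come from the TCP header layout in RFC 793; the names `TCPFlag`, `describe`, and `HANDSHAKE` are made up for this example and are not part of Nmap.

```python
from enum import IntFlag


class TCPFlag(IntFlag):
    """Classic control bits of the TCP header flags field (RFC 793)."""
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    PSH = 0x08
    ACK = 0x10
    URG = 0x20


def describe(flags_byte: int) -> list[str]:
    """Return the names of the control bits set in a flags byte."""
    return [flag.name for flag in TCPFlag if flags_byte & flag]


# The three handshake segments expressed as flag combinations.
HANDSHAKE = [TCPFlag.SYN, TCPFlag.SYN | TCPFlag.ACK, TCPFlag.ACK]

if __name__ == "__main__":
    for step in HANDSHAKE:
        print(f"0x{int(step):02x}", describe(step))
```

Treating the flags as a single byte is what makes the later scan types easy to reason about: each technique is just a different value in this field.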

Consider the canonical state diagram: a closed port answers an unexpected SYN with RST|ACK, while an open port answers with SYN|ACK. This asymmetry is exploitable. A host receiving a SYN|ACK it never solicited replies with RST regardless of port state, so such a probe reveals that the host is up but not whether the port is open. Similarly, RFC 793 requires a closed port to answer an unsolicited segment that carries none of SYN, RST, or ACK (a bare FIN, for example, or FIN|PSH|URG) with RST, while an open port silently drops it. Nmap's FIN, NULL, and Xmas scans, sometimes called inverse scans, exploit this rule. Compliance varies across operating systems, however (Windows stacks, for example, tend to send RST whether the port is open or closed), creating both fingerprinting opportunities and scan reliability trade-offs.
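The rules in this paragraph can be sketched as a tiny model of the target's side of the exchange. This is an idealized RFC 793 stack for segments that belong to no existing connection, not a description of any real operating system; the function name `rfc793_reply` is hypothetical.

```python
from typing import Optional

FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def rfc793_reply(port_open: bool, probe_flags: int) -> Optional[int]:
    """Flags an idealized RFC 793 stack sends back, or None for silence.

    Models only the CLOSED and LISTEN states, i.e. a probe that does not
    match any established connection.
    """
    if probe_flags & RST:
        return None                      # incoming resets are never answered
    if not port_open:
        # CLOSED: every non-RST segment draws a reset. The reply carries
        # ACK only when the probe itself had no ACK bit.
        return RST if probe_flags & ACK else RST | ACK
    # LISTEN: an ACK is checked first and is always rejected.
    if probe_flags & ACK:
        return RST
    if probe_flags & SYN:
        return SYN | ACK                 # handshake continues
    return None                          # FIN, NULL, Xmas: silently dropped


if __name__ == "__main__":
    for name, flags in [("SYN", SYN), ("FIN", FIN), ("Xmas", FIN | PSH | URG),
                        ("SYN|ACK", SYN | ACK)]:
        print(name, "open:", rfc793_reply(True, flags),
              "closed:", rfc793_reply(False, flags))
```

Running it shows the asymmetry directly: SYN and the FIN family produce different replies for open and closed ports, whereas an unsolicited SYN|ACK produces RST in both cases.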

The TCP state machine becomes relevant when examining Nmap's raw packet advantages. In a standard connect() operation, the kernel's TCP stack manages the full handshake, tracks sequence numbers, and maintains socket state in the operating system's tables. When Nmap uses raw sockets (privileged mode), it bypasses this entirely, constructing packets byte-by-byte and interpreting responses itself. This distinction is not merely architectural—it determines what scans you can perform and what information you can extract.

Nmap's Core Architecture: Four Interdependent Engines

Nmap is not a monolithic scanner but a coordinated system of specialized subsystems:

The Scan Engine orchestrates probe generation and response interpretation. It manages host groups (parallelization units), selects probe types based on scan flags, and maintains the state machine for target ports. The engine's efficiency stems from its non-blocking I/O model—it does not wait for responses serially but fires probes according to timing templates and adapts based on network feedback.

The NSE (Nmap Scripting Engine) extends Nmap beyond port state detection. Written in Lua, NSE scripts perform service-specific probing, vulnerability detection, and even exploitation. When a script runs depends on its rule type: prerule scripts run before any target is scanned, hostrule and portrule scripts run against each host or matching port once scan results for it are available, and postrule scripts run after all targets are finished. Because NSE has access to scan results, a portrule script can restrict itself to open ports, while a hostrule script can run regardless of port state.

The Fingerprinting Engine covers two distinct functions that are easy to conflate: OS detection (-O), which is TCP/IP stack fingerprinting against nmap-os-db, and service version detection (-sV). The former sends up to 16 carefully crafted TCP, UDP, and ICMP probes, varying TCP options, window sizes, don't-fragment bits, and other header fields, to elicit implementation-specific quirks. The latter probes open ports with service-specific payloads and matches the responses against nmap-service-probes.

The Timing Subsystem governs the entire orchestration. It implements congestion-avoidance logic (adaptive parallelism based on RTT measurements, packet loss detection, and rate limiting), translating abstract templates (-T0 through -T5) into concrete parameters: min-rtt-timeout, max-retries, max-scan-delay, and parallelism.
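To make the RTT-driven part of this concrete, here is a minimal sketch of a smoothed round-trip estimator in the style of TCP's RFC 6298 retransmission timer. It is an illustration of the general technique, not Nmap's actual code; the class name and the default clamp values are assumptions, with the clamps playing a role loosely analogous to --min-rtt-timeout and --max-rtt-timeout.

```python
from typing import Optional


class RttEstimator:
    """Smoothed RTT tracker in the style of RFC 6298 (illustrative only)."""

    def __init__(self, min_timeout: float = 0.1, max_timeout: float = 10.0,
                 initial_timeout: float = 1.0):
        self.srtt: Optional[float] = None     # smoothed RTT, seconds
        self.rttvar: Optional[float] = None   # RTT variation, seconds
        self.min_timeout = min_timeout        # hypothetical default
        self.max_timeout = max_timeout        # hypothetical default
        self.initial_timeout = initial_timeout

    def update(self, sample: float) -> None:
        """Fold one measured round-trip time into the running estimate."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Variation is updated first, using the previous smoothed value.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample)
            self.srtt = 0.875 * self.srtt + 0.125 * sample

    def timeout(self) -> float:
        """Probe timeout: srtt + 4 * rttvar, clamped to the configured range."""
        if self.srtt is None:
            return self.initial_timeout
        raw = self.srtt + 4 * self.rttvar
        return min(self.max_timeout, max(self.min_timeout, raw))


if __name__ == "__main__":
    est = RttEstimator()
    for rtt in (0.20, 0.20, 0.35, 0.18):
        est.update(rtt)
        print(f"sample={rtt:.2f}s timeout={est.timeout():.3f}s")
```

A stable network drives the variation term down and the timeout toward the measured RTT, while jitter widens it; a timing template can then be thought of as choosing the clamps and retry limits around such an estimate.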

Port State Taxonomy: Precision in Ambiguity

Nmap categorizes port states with deliberate granularity that reflects network reality, not just binary open/closed:

| State | Definition | Diagnostic Implication |
|-------|-----------|------------------------|
| open | Service accepting connections | Confirmed accessible; proceed to service detection |
| closed | Port reachable but no service listening | Host is up and the probe was not filtered on this port |
| filtered | Probe blocked; no response or an ICMP unreachable error (e.g., admin-prohibited) | Firewall or ACL intervening; state indeterminate |
| unfiltered | Port reachable but open/closed undetermined (ACK scan) | Rare state; typically requires follow-up with another scan type |
| open\|filtered | No response to a UDP, IP protocol, FIN, NULL, Xmas, or Maimon probe | Either a firewall dropped the probe or an open port stayed silent |
| closed\|filtered | Cannot tell closed from filtered (IP ID idle scan only) | Rare; follow up with another scan type |

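The open|filtered ambiguity is particularly instructive: when a UDP, FIN, NULL, or Xmas scan yields no response, Nmap cannot distinguish between a filtered port (a firewall dropped the probe) and an open port that silently accepted it. This is why these scans are less definitive than SYN scans, where an open port announces itself with SYN|ACK. UDP scans in particular are also often slower, since every silent port must wait out a timeout and retransmissions before Nmap can label it.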
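The taxonomy can be summarized as a lookup from (scan type, observation) to port state. The sketch below follows the response interpretations given in the Nmap reference guide for these scan types; the observation labels and the names `STATE_TABLE` and `classify` are invented for this example.

```python
from typing import Optional

# Observation labels (hypothetical names for this sketch):
#   None                 no reply after retransmissions
#   "synack", "rst"      TCP reply flags
#   "udp"                any UDP datagram from the probed port
#   "icmp_port_unreach"  ICMP type 3, code 3
#   "icmp_unreach"       other ICMP type 3 errors (e.g. code 13)
_ICMP_FILTERED = {"icmp_port_unreach": "filtered", "icmp_unreach": "filtered"}
_INVERSE = {"rst": "closed", None: "open|filtered", **_ICMP_FILTERED}

STATE_TABLE = {
    "syn": {"synack": "open", "rst": "closed", None: "filtered",
            **_ICMP_FILTERED},
    "fin": _INVERSE,
    "null": _INVERSE,
    "xmas": _INVERSE,
    "ack": {"rst": "unfiltered", None: "filtered", **_ICMP_FILTERED},
    "udp": {"udp": "open", None: "open|filtered",
            "icmp_port_unreach": "closed", "icmp_unreach": "filtered"},
}


def classify(scan: str, observation: Optional[str]) -> str:
    """Map what the scanner saw to a port state for the given scan type."""
    try:
        return STATE_TABLE[scan][observation]
    except KeyError:
        raise ValueError(f"no rule for scan={scan!r}, "
                         f"observation={observation!r}") from None


if __name__ == "__main__":
    for scan in STATE_TABLE:
        print(f"{scan:5s} silence -> {classify(scan, None)}")
```

Note how silence means filtered for SYN and ACK scans but open|filtered for the FIN family and UDP: the same non-event carries different information depending on what a compliant open port would have done.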

Scan Phases: Orchestrated Information Gathering

Nmap executes reconnaissance in discrete, though partially overlapping, phases:

Host Discovery (runs by default before port scanning; -sn stops after this phase): By default, a privileged scan determines target liveness via an ICMP echo request, a TCP SYN to port 443, a TCP ACK to port 80, and an ICMP timestamp request. Skipped entirely with -Pn (treat all targets as online). This phase prunes the target list before any port scanning effort is invested.
Port Scanning: The core probe phase. Techniques range from -sT (connect) through -sS (SYN stealth) to exotic variants (-sF, -sX, -sN, -sA, -sW, -sM). Each exploits specific TCP/IP behaviors.
Service and Version Detection (-sV): Active probing of open ports with protocol-specific payloads to extract banner information and match the responses against the nmap-service-probes database.
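OS Detection (-O): TCP/IP stack fingerprinting. Results are most reliable when Nmap finds at least one open and one closed TCP port to compare; without them it still attempts a match but warns that the result may be unreliable.
NSE Execution: Timing depends on the script's rule type. Prerule scripts run before scanning begins, host and port scripts run after the detection phases for each target, and postrule scripts run once all scanning is complete.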

Phase separation matters operationally. Host discovery can be disabled (-Pn) when you know targets exist but ping is blocked. Service detection can run without OS detection (-sV without -O), reducing both probe volume and the scan's detectable footprint.
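One way to see the phases as independent switches is to assemble the command line programmatically. The helper below is a hypothetical convenience function, not part of Nmap; it uses only the flags already discussed in this section (-Pn, -sV, -O) and merely builds the argument list without running anything.

```python
import shlex


def build_nmap_argv(target: str, *, skip_discovery: bool = False,
                    version_detection: bool = False,
                    os_detection: bool = False) -> list[str]:
    """Assemble an nmap argument list from per-phase toggles."""
    argv = ["nmap"]
    if skip_discovery:
        argv.append("-Pn")   # treat the target as online, skip host discovery
    if version_detection:
        argv.append("-sV")   # probe open ports for service and version
    if os_detection:
        argv.append("-O")    # TCP/IP stack fingerprinting
    argv.append(target)
    return argv


if __name__ == "__main__":
    # Ping is blocked, services are wanted, OS detection is skipped
    # to keep the probe volume down.
    argv = build_nmap_argv("target.example.com",
                           skip_discovery=True, version_detection=True)
    print(shlex.join(argv))
```

The resulting list could be handed to `subprocess.run` if Nmap is installed and you are authorized to scan the target.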

Packet Crafting: Raw Sockets versus Standard Connections

The distinction between privileged and unprivileged scanning is where theory becomes practice.

Standard Connect Scan (-sT, unprivileged):

# Run as normal user - kernel handles TCP stack
nmap -sT target.example.com

At the wire level, the handshake is indistinguishable from that of any ordinary client application:

Client              Server
  | ---- SYN --------> |
  | <--- SYN/ACK ----- |
  | ---- ACK --------> |
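  | ---- RST --------> |  [Nmap closes the socket immediately after the full handshake]

The kernel allocates a full socket and completes the handshake, and Nmap closes the connection as soon as it succeeds, typically with an abortive close that puts a RST on the wire rather than a graceful FIN exchange. Because the handshake completed, the connection appears in connection tables and may be recorded in application logs. Without root privileges, the kernel prevents raw socket creation; Nmap cannot set arbitrary TCP flags or access response packets before stack processing.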
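The connect-scan idea can be sketched with nothing but the standard socket API. This is a minimal illustration of the concept rather than Nmap's implementation: the function name `connect_probe` and the mapping of every other error to "filtered" are simplifying assumptions.

```python
import errno
import socket


def connect_probe(host: str, port: int, timeout: float = 1.0) -> str:
    """Classify a TCP port using an ordinary connect(), no privileges needed.

    The kernel performs the whole handshake; all we see is the result code.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        err = sock.connect_ex((host, port))
    if err == 0:
        return "open"        # handshake completed
    if err == errno.ECONNREFUSED:
        return "closed"      # target answered with RST
    return "filtered"        # timeout or unreachable (simplification)


if __name__ == "__main__":
    # Demonstrate against a throwaway listener on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print("while listening:", connect_probe("127.0.0.1", port))
    listener.close()
    print("after close:   ", connect_probe("127.0.0.1", port))
```

Notice what the API hides: the caller never sees flags, sequence numbers, or ICMP errors, only a success or an error code. That is exactly the information gap raw-socket scanning closes.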

SYN Stealth Scan (-sS, privileged/raw socket):

# Requires root/sudo - Nmap crafts packets directly
sudo nmap -sS target.example.com

Wire-level behavior:

Client              Server
  | ---- SYN --------> |  [Nmap crafts the packet and sends it via a raw socket]
  | <--- SYN/ACK ----- |  [Nmap captures the reply via libpcap]
  | ---- RST --------> |  [Scanner's kernel resets the unexpected SYN/ACK; handshake never completes]

Here Nmap constructs the IP and TCP headers manually, specifying source port, sequence number, TCP options, and flags. Responses are captured via libpcap (bypassing the kernel TCP stack), allowing Nmap to see RST responses that would terminate a normal connection, observe ICMP errors, and detect no-response conditions for filtered state determination.

The raw socket capability enables the full technique spectrum: FIN scans that set only the FIN flag, NULL scans with no flags set, Xmas scans with FIN|PSH|URG lit, ACK scans for firewall rule mapping, and Window scans exploiting TCP window field variations. Each technique relies on specific RFC-mandated (or implementation-specific) stack behaviors that would be impossible to trigger through standard socket APIs.
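To show what constructing a header manually involves, the sketch below packs a 20-byte TCP header for each scan type and computes its checksum over the IPv4 pseudo-header. It only builds bytes and sends nothing, so it needs no privileges. The function names, the example addresses, and the window value of 1024 are arbitrary choices for illustration, not a statement of what Nmap emits.

```python
import socket
import struct

FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

# Flag byte used by each probe type discussed above.
SCAN_FLAGS = {"syn": SYN, "fin": FIN, "null": 0,
              "xmas": FIN | PSH | URG, "ack": ACK}


def checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that the TCP checksum also covers."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_len))


def build_tcp_header(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                     seq: int, flags: int, window: int = 1024) -> bytes:
    """Pack an option-less TCP header with a valid checksum."""
    offset = 5 << 4  # data offset: 5 32-bit words, reserved bits zero
    header = struct.pack("!HHIIBBHHH", src_port, dst_port, seq, 0,
                         offset, flags, window, 0, 0)
    csum = checksum(pseudo_header(src_ip, dst_ip, len(header)) + header)
    # Checksum lives at bytes 16-17; the urgent pointer follows.
    return header[:16] + struct.pack("!H", csum) + header[18:]


if __name__ == "__main__":
    for name, flags in SCAN_FLAGS.items():
        hdr = build_tcp_header("192.0.2.1", "192.0.2.2", 40000, 80, 1, flags)
        print(f"{name:5s} flags=0x{hdr[13]:02x} {hdr.hex()}")
```

The only difference between the five probes is byte 13 of the header (and the checksum that depends on it), which is the sense in which the flag field acts as a programmable interface.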

Understanding this architecture explains why scan selection matters beyond stealth: connect scans work without privileges but complete every handshake and expose only what the socket API reports; raw scans require privileges but offer state granularity, speed advantages (half-open connections), and access to responses the kernel would otherwise abstract away. The "magic" of Nmap is not magic at all. It is meticulous, privilege-enabled protocol implementation that treats TCP/IP specifications as a programmable interface rather than a black-box transport service.