This article is a systems-engineering analysis for educational purposes only. It is not financial advice, trading advice, or a recommendation to buy or sell any instrument.

Market Data Feed Handler Architecture: ITCH and the Order Book (2026)

Last Updated: 2026-06-08

Market data feed handler architecture is one of those subsystems that looks deceptively small on a whiteboard and turns out to be where a trading platform earns or loses its reputation for correctness. A feed handler sits at the edge of the stack, takes a firehose of binary exchange messages off the wire, and turns it into a clean, ordered, in-memory picture of every order book the firm cares about. Get it right and everything downstream — pricing, risk, execution analytics — inherits a trustworthy view of the market. Get it subtly wrong, drop a sequence number here, mishandle a snapshot there, and the errors are silent, intermittent, and maddening to trace. This article is a systems deep-dive into how a modern ITCH feed handler is built. It is engineering, not trading advice: we treat the exchange feed as a data source, the order book as a data structure, and the whole path as a system with nameable failure modes.

What this covers: what a feed handler does in the stack; the receive, decode, and book-build pipeline; ITCH message semantics; order book reconstruction including snapshots and recovery; performance and correctness techniques; the gotchas that bite in production; and a practical checklist.

Background: What a Feed Handler Does

A market data feed handler is the software component that consumes an exchange’s raw data feed and presents it to the rest of the trading system as structured, queryable state. Exchanges do not hand you a tidy order book over HTTP. They broadcast a continuous stream of small binary messages — an order was added, an order was cancelled, a trade printed — and it is the feed handler’s job to receive those messages losslessly, decode them, and apply them in order to reconstruct the book.

Figure 1 — The feed handler sits at the boundary, converting exchange wire messages into in-memory book state that strategy, risk, and analytics consume.

The dominant protocol family for this kind of full-depth equity data is ITCH, NASDAQ’s direct data-feed protocol. ITCH is binary, sequenced, and delivered over UDP multicast. Each message is a fixed-layout struct with a single-byte type code, and the feed carries every order-book event for the venue — additions, executions, cancels, deletes, and replaces — not just top-of-book quotes. The canonical reference is the NASDAQ TotalView-ITCH 5.0 specification, which documents every message type and field layout.

ITCH has a sibling worth naming for context. OUCH is NASDAQ’s order-entry protocol — the path you use to send orders, not receive market data. The two are designed as a matched pair: OUCH in, ITCH out. They are deliberately separate concerns, and a feed handler deals only with the inbound ITCH side.

It helps to contrast ITCH with the other big protocol lineage. FIX (Financial Information eXchange) is a tag-value, session-oriented protocol designed for order routing and is verbose by nature. FAST (FIX Adapted for STreaming) compresses FIX for market data using field encoding and templates, and many non-US venues use FAST-encoded feeds over multicast. The trade-off is roughly this: ITCH-style fixed-binary feeds are simplest and fastest to decode but venue-specific, while FAST is more portable but needs a template engine. The FIX Trading Community maintains the specifications for both FIX and FAST. A feed handler architecture has to commit to one decoding model up front because it shapes the entire hot path.

Architecture

A feed handler decomposes cleanly into three stages: receive, decode, and book build. Each stage has a distinct concern and a distinct failure mode, and keeping them separate is what makes the system testable. Data flows one way — from wire to book — and the design goal throughout is determinism: the same input bytes must always produce the same book state.

Figure 2 — Receive, decode, book build. Gap detection lives in receive; message semantics in decode; price-level state in book build.

Receive: Multicast, Sequencing, and A/B Arbitration

The receive stage owns the network. Exchange feeds arrive as UDP multicast — one-to-many, connectionless, and crucially lossy. UDP gives no delivery guarantee, so the feed handler cannot assume it sees every packet. The protocol compensates with sequence numbers: every message, or every packet, carries a monotonically increasing number. The receive stage tracks the expected next sequence number and compares it against what arrives.

When the arriving sequence number matches the expectation, the message is in order and flows downstream. When it jumps ahead, a gap has occurred — one or more messages were lost — and the handler must trigger recovery before it can safely apply anything past the gap. When it arrives behind the expectation, the message is a duplicate or out-of-order straggler and is discarded. This single piece of bookkeeping, the expected-sequence cursor, is the heart of correctness on the wire.
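The three cases above can be sketched as a small classifier. This is a minimal illustration, not a production design: the names `SequenceCursor` and `Verdict` are hypothetical, and it assumes one sequence number per message starting at 1.

```python
from enum import Enum


class Verdict(Enum):
    IN_ORDER = "in_order"  # apply the message and advance the cursor
    GAP = "gap"            # one or more messages missing; recover first
    STALE = "stale"        # duplicate or late straggler; discard


class SequenceCursor:
    """Tracks the next expected sequence number for one feed session."""

    def __init__(self, first_expected: int = 1) -> None:
        self.expected = first_expected

    def classify(self, seq: int) -> Verdict:
        if seq == self.expected:
            self.expected += 1
            return Verdict.IN_ORDER
        if seq > self.expected:
            # The cursor does not move: nothing past the gap may be applied yet.
            return Verdict.GAP
        return Verdict.STALE


cursor = SequenceCursor(first_expected=1)
verdicts = [cursor.classify(seq) for seq in (1, 2, 2, 5, 3)]
```

Note that the cursor stays put on a gap, so the later arrival of sequence 3 is still accepted as in order.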

Because UDP loss is unavoidable, exchanges publish the same data on two independent multicast groups: the A line and the B line. These redundant feeds carry identical sequenced messages over separate network paths. The feed handler performs A/B line arbitration: it consumes both, keys messages by sequence number, and forwards the first copy of each sequence it sees while suppressing the duplicate. A packet dropped on the A line is very often present on the B line, so arbitration turns two lossy feeds into one near-lossless stream. Only when both lines miss a sequence does the handler fall back to the heavier recovery path.
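A minimal sketch of arbitration, assuming packets from both lines have already been merged into one arrival-ordered stream of `(line, sequence, payload)` tuples. The function name `arbitrate` and the tuple shape are illustrative. Messages seen ahead of the cursor are held until the other line fills the hole; a real handler would bound that wait and fall back to recovery.

```python
from typing import Dict, Iterable, Iterator, Tuple

Packet = Tuple[str, int, bytes]  # (line name, sequence number, payload)


def arbitrate(packets: Iterable[Packet], first_expected: int = 1) -> Iterator[Tuple[int, bytes]]:
    """Yield each sequence number exactly once, in order, from two redundant lines."""
    expected = first_expected
    pending: Dict[int, bytes] = {}  # messages seen ahead of the cursor
    for _line, seq, payload in packets:
        if seq < expected or seq in pending:
            continue  # the other line already delivered this sequence
        pending[seq] = payload
        while expected in pending:
            yield expected, pending.pop(expected)
            expected += 1


arrivals = [
    ("A", 1, b"m1"), ("B", 1, b"m1"),
    ("A", 3, b"m3"),                   # sequence 2 was dropped on the A line
    ("B", 2, b"m2"), ("B", 3, b"m3"),
    ("A", 4, b"m4"), ("B", 4, b"m4"),
]
merged = list(arbitrate(arrivals))
```

Deduplication here is keyed purely on the sequence number, never on payload content.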

Decode: Binary ITCH Message Types

The decode stage turns bytes into typed events. An ITCH message begins with a one-byte type code that selects the struct layout. The decoder reads the type, casts or maps the following bytes into the right fields, and emits a normalized internal event. Because the layouts are fixed and big-endian, decoding is mostly pointer arithmetic and byte-order swaps — no parsing in the textual sense.

The message types that matter for book building are a small, well-defined set:

Add Order: a new resting order enters the book at a price and size, identified by an order reference number.
Order Executed: a resting order trades, in whole or in part; size is reduced by the executed quantity.
Order Cancel: a resting order’s size is reduced by a stated amount without a full removal.
Order Delete: a resting order is fully removed from the book.
Order Replace: a resting order is removed and a new one takes its place in one atomic event. The message names the original reference and carries a new reference, size, and price; side and symbol carry over from the original order.

There are also administrative and trade-print messages — system event, stock directory, trading action, and non-displayable trade messages — that the handler must recognize even when they do not modify the displayed book. The decoder’s job is to map each type to the smallest faithful internal representation, nothing more.

A worked decode makes the shape concrete. Suppose the receive buffer holds an Add Order message. The decoder reads byte zero, sees the type code for Add Order, and from the spec knows the exact field layout that follows: a stock locate, a tracking number, a timestamp, an order reference number, a buy-or-sell indicator, a share quantity, a stock symbol, and a price. Each field has a fixed offset and width. The decoder does not search or tokenize; it reads at known offsets, byte-swaps the integers from network order, and produces one typed event. Prices on ITCH arrive as scaled integers, so the decoder also applies the venue’s fixed price scale rather than touching floating point on the hot path. Keeping integers as integers until the last possible moment avoids both rounding error and the cost of float conversion per message.
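The walk-through above might look like the following sketch. The field widths (2-byte stock locate and tracking number, 6-byte timestamp, 8-byte reference, 4-byte shares, 8-byte symbol, 4-byte price) and the price scale of 10,000 are this sketch's reading of the ITCH 5.0 Add Order layout and should be checked against the specification. The `AddOrder` class and `decode_add_order` function are illustrative names.

```python
import struct
from dataclasses import dataclass

# type, locate, tracking, timestamp, order ref, side, shares, symbol, price
ADD_ORDER = struct.Struct(">cHH6sQcI8sI")  # big-endian, no padding
PRICE_SCALE = 10_000  # assumed: four implied decimal places


@dataclass(frozen=True)
class AddOrder:
    stock_locate: int
    tracking_number: int
    timestamp_ns: int
    order_ref: int
    side: str
    shares: int
    symbol: str
    price_raw: int  # scaled integer; never converted to float here


def decode_add_order(buf, offset: int = 0) -> AddOrder:
    """Read one Add Order at a known offset; no searching or tokenizing."""
    msg_type, locate, tracking, ts, ref, side, shares, symbol, price = (
        ADD_ORDER.unpack_from(buf, offset)
    )
    if msg_type != b"A":
        raise ValueError(f"not an Add Order message: {msg_type!r}")
    return AddOrder(
        stock_locate=locate,
        tracking_number=tracking,
        timestamp_ns=int.from_bytes(ts, "big"),  # 48-bit integer
        order_ref=ref,
        side=side.decode("ascii"),
        shares=shares,
        symbol=symbol.decode("ascii").rstrip(),  # symbols are space-padded
        price_raw=price,
    )


raw = ADD_ORDER.pack(
    b"A", 7, 0, (34_200_000_000_000).to_bytes(6, "big"),
    42, b"B", 100, b"ACME    ", 1_234_500,
)
event = decode_add_order(raw)
whole, fraction = divmod(event.price_raw, PRICE_SCALE)  # for display only
```

The price stays an integer throughout; `divmod` is only used at the edge, where a human-readable value is needed.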

One decode rule prevents a whole class of bugs: never trust a length you have not bounds-checked. A malformed or truncated packet that claims a field extends past the buffer must be rejected, not read. In a fixed-layout protocol this is cheap — compare the declared message length against the bytes actually present before overlaying the struct view. Skipping that check is how a single bad packet turns into an out-of-bounds read in the hottest part of the system.
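A sketch of that check, assuming a hypothetical framing in which each message in a packet is preceded by a two-byte big-endian length. The framing, `split_messages`, and `MalformedPacket` are assumptions made for illustration; the real framing is defined by the venue's transport specification.

```python
import struct
from typing import List

LEN_PREFIX = struct.Struct(">H")  # assumed two-byte big-endian length prefix


class MalformedPacket(ValueError):
    """Raised when a declared length does not fit the bytes actually present."""


def split_messages(payload) -> List[memoryview]:
    """Validate every declared length before exposing any message bytes."""
    view = memoryview(payload)
    messages = []
    offset = 0
    while offset < len(view):
        if len(view) - offset < LEN_PREFIX.size:
            raise MalformedPacket("truncated length prefix")
        (length,) = LEN_PREFIX.unpack_from(view, offset)
        offset += LEN_PREFIX.size
        if length == 0:
            raise MalformedPacket("zero-length message has no type byte")
        if length > len(view) - offset:
            raise MalformedPacket("declared length runs past end of packet")
        messages.append(view[offset:offset + length])  # a slice, not a copy
        offset += length
    return messages


good = [bytes(m) for m in split_messages(b"\x00\x02AB\x00\x01C")]

try:
    split_messages(b"\x00\x05AB")  # claims 5 bytes, only 2 are present
    rejected = False
except MalformedPacket:
    rejected = True
```

Returning a list rather than yielding lazily means a bad packet is rejected as a whole, before any of its messages reach the decoder.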

Book Build: From Events to Price Levels

The book-build stage applies decoded events to maintain the order book. It keeps two core data structures. First, an order map keyed by order reference number, so an execute, cancel, delete, or replace can find the exact resting order it targets. Second, a pair of price-level ladders — bids descending, asks ascending — where each level aggregates total resting size.

Figure 3 — Two structures in lockstep. The order map resolves references to levels; the ladders aggregate size for fast top-of-book reads.

The event handlers are mechanical once the structures are in place. An Add Order inserts into the order map and adds its size to the matching price level, creating the level if it is new. An Order Executed or Order Cancel looks up the reference, subtracts size from its level, and removes the order from the map if it reaches zero. An Order Delete removes the order entirely and decrements its level. An Order Replace is a delete-then-add against a new reference. The invariant that ties everything together: the sum of order sizes pointing at a price level must always equal that level’s aggregate size. Violate it and the book is corrupt.
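The handlers described above can be sketched with plain dictionaries. This is a simplified single-symbol model with illustrative names (`Book`, `Order`, `reduce`); a lookup miss simply raises `KeyError` here, which is one possible policy rather than a recommendation.

```python
from dataclasses import dataclass


@dataclass
class Order:
    side: str   # "B" for bid, "S" for ask
    price: int  # scaled integer price
    size: int


class Book:
    def __init__(self) -> None:
        self.orders = {}                  # order reference -> Order
        self.levels = {"B": {}, "S": {}}  # side -> price -> aggregate size

    def add(self, ref: int, side: str, price: int, size: int) -> None:
        if ref in self.orders:
            raise ValueError(f"duplicate order reference {ref}")
        self.orders[ref] = Order(side, price, size)
        ladder = self.levels[side]
        ladder[price] = ladder.get(price, 0) + size

    def reduce(self, ref: int, qty: int) -> None:
        """Shared path for Order Executed and Order Cancel."""
        order = self.orders[ref]
        if qty > order.size:
            raise ValueError("reduction exceeds resting size")
        order.size -= qty
        self._shrink_level(order.side, order.price, qty)
        if order.size == 0:
            del self.orders[ref]

    def delete(self, ref: int) -> None:
        order = self.orders.pop(ref)
        self._shrink_level(order.side, order.price, order.size)

    def replace(self, old_ref: int, new_ref: int, price: int, size: int) -> None:
        side = self.orders[old_ref].side  # side carries over from the original
        self.delete(old_ref)
        self.add(new_ref, side, price, size)

    def _shrink_level(self, side: str, price: int, qty: int) -> None:
        ladder = self.levels[side]
        ladder[price] -= qty
        if ladder[price] == 0:
            del ladder[price]  # drop emptied levels


book = Book()
book.add(1, "B", 1_000_000, 100)
book.add(2, "B", 1_000_000, 50)
book.add(3, "S", 1_000_500, 200)
book.reduce(1, 40)                    # partial execution
book.delete(2)
book.replace(3, 4, 1_000_400, 150)
```

After this sequence the bid level at 1,000,000 aggregates 60 shares from a single order, and the only ask level is 1,000,400 with 150 shares under reference 4.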

ITCH carries an order reference number on every order-level message precisely so that these handlers can run in expected constant time. An Order Executed message does not restate the price or the side. It names an order reference, and the handler must already know everything about that order from the earlier Add. This is the defining property of a per-order, market-by-order feed: the book is only correct if the order map is complete, which is only true if no Add was missed. A dropped Add followed by an Execute on its reference produces a lookup miss. How the handler treats that miss, whether it silently ignores it or flags the book as suspect, is a design decision worth making explicitly rather than by accident.

It is also worth being precise about what the price-level ladder stores versus what the order map stores. The order map is the source of truth for individual orders; the ladder is a derived aggregate maintained incrementally so that top-of-book and depth queries are fast. You could in principle recompute the ladder by scanning the order map, but doing so per update would be far too slow, so the ladder is kept in sync on every event. That dual representation is the whole reason the size-conservation invariant exists: it is the assertion that the derived view has not drifted from the truth.
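One way to express that assertion is to recompute the ladder from the order map and compare. The sketch below assumes orders are stored as `(side, price, size)` tuples keyed by reference; the function names are illustrative. It is a full scan, so it belongs in test and staging builds, not on the hot path.

```python
def rebuild_levels(orders):
    """Recompute side -> price -> aggregate size by scanning every order."""
    levels = {"B": {}, "S": {}}
    for side, price, size in orders.values():
        levels[side][price] = levels[side].get(price, 0) + size
    return levels


def check_size_conservation(orders, levels) -> None:
    """Raise if the incrementally maintained ladder has drifted from the order map."""
    expected = rebuild_levels(orders)
    if expected != levels:
        raise AssertionError(f"ladder drift: expected {expected}, found {levels}")


orders = {
    1: ("B", 1_000_000, 60),
    2: ("B", 1_000_000, 40),
    3: ("S", 1_000_400, 150),
}
consistent = {"B": {1_000_000: 100}, "S": {1_000_400: 150}}
drifted = {"B": {1_000_000: 90}, "S": {1_000_400: 150}}

check_size_conservation(orders, consistent)  # passes silently

try:
    check_size_conservation(orders, drifted)
    drift_detected = False
except AssertionError:
    drift_detected = True
```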

Order Book Reconstruction in Depth

Reconstruction is the part most people underestimate. The feed gives you incremental updates — deltas against the current book — not a fresh snapshot on every change. To hold a correct book you must apply every delta in sequence, with no gaps, starting from a known-good baseline. There are two ways to establish that baseline.

The first is the start-of-day build. At session open the exchange typically emits a system event marking the start, and the book begins empty. From there the handler applies every Add, Execute, Cancel, Delete, and Replace in sequence order, and the book is correct by construction as long as no message is missed. This is the clean case, and it is why sequence integrity matters so much: a single un-recovered gap corrupts the book for the rest of the session.

The second is snapshot recovery, used when you start late, restart, or suffer a gap you cannot fill from the B line. A snapshot is a full picture of the book as of a stated sequence number. Assuming the snapshot reflects every message up to and including that number (venues differ, so check the specification), the handler loads the snapshot, then resumes applying incremental messages starting from the next sequence number. The delicate part is the join: you must buffer incoming incrementals while the snapshot is fetched and applied, then replay the buffered messages that are newer than the snapshot’s sequence while discarding those at or before it. Off-by-one errors here are a classic source of subtly wrong books.
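The join can be sketched as a pure function over the buffered messages. This assumes the snapshot covers everything up to and including `snapshot_seq`; the function name and the `(sequence, message)` tuple shape are illustrative.

```python
def incrementals_to_replay(snapshot_seq, buffered):
    """Return buffered (seq, msg) pairs to apply after a snapshot, in order.

    Messages at or before snapshot_seq are already reflected in the snapshot
    and are dropped. The remainder must start at snapshot_seq + 1 and be
    contiguous, otherwise the join itself has a gap.
    """
    by_seq = {}
    for seq, msg in buffered:
        if seq > snapshot_seq:
            by_seq.setdefault(seq, msg)  # keep the first copy of each sequence

    replay = [(seq, by_seq[seq]) for seq in sorted(by_seq)]

    expected = snapshot_seq + 1
    for seq, _msg in replay:
        if seq != expected:
            raise ValueError(f"gap in join: expected {expected}, found {seq}")
        expected += 1
    return replay


buffered = [(99, "old"), (100, "in snapshot"), (101, "a"), (102, "b"), (101, "a")]
replay = incrementals_to_replay(100, buffered)

try:
    incrementals_to_replay(100, [(103, "c")])  # 101 and 102 were never buffered
    join_gap = False
except ValueError:
    join_gap = True
```

The strict `seq > snapshot_seq` comparison is where the off-by-one lives: using `>=` would apply the snapshot's own last message twice.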

Figure 4 — One Add Order event traversing receive, decode, and book-build, with the sequence check gating whether it is applied or buffered for recovery.

Recovery itself takes two shapes depending on the venue. Some exchanges run a retransmission service: the handler requests the missing sequence range over a separate channel and the exchange resends those exact messages, after which normal flow resumes. Others publish a periodic snapshot or refresh feed on its own multicast group, and the handler simply waits for the next snapshot, rebuilds, and rejoins the incremental stream. Production handlers usually implement both, preferring fast retransmission for small gaps and full snapshot rebuild for large ones or for cold starts. Either way, while recovery is in flight the affected book must be marked not trustworthy so downstream consumers do not act on a half-built picture.
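The choice between the two paths can be sketched as a small policy function. The threshold `MAX_RETRANSMIT_GAP` is a hypothetical tuning value, not a figure from any venue, and the names are illustrative.

```python
from enum import Enum


class Recovery(Enum):
    RETRANSMIT = "retransmit"  # request the missing range from the venue
    SNAPSHOT = "snapshot"      # rebuild the book from a full snapshot


MAX_RETRANSMIT_GAP = 1_000  # hypothetical threshold; tune per venue


def choose_recovery(expected_seq, received_seq, cold_start, max_gap=MAX_RETRANSMIT_GAP):
    """Pick a recovery path from the gap size and whether a baseline exists."""
    if cold_start:
        return Recovery.SNAPSHOT  # no baseline to patch, so rebuild
    missing = received_seq - expected_seq
    if missing <= 0:
        raise ValueError("no gap: received sequence is not ahead of expected")
    return Recovery.RETRANSMIT if missing <= max_gap else Recovery.SNAPSHOT


small_gap = choose_recovery(expected_seq=10, received_seq=13, cold_start=False)
large_gap = choose_recovery(expected_seq=10, received_seq=5_000, cold_start=False)
late_start = choose_recovery(expected_seq=1, received_seq=1, cold_start=True)
```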

A subtlety worth stating: maintaining price levels efficiently means the ladder data structure must support fast best-bid/best-ask reads, fast insertion of new levels, and fast removal of emptied ones. The order map must support fast lookup by reference. Reconstruction correctness is a function of both structures staying consistent under every event type, which is why the size-conservation invariant above is worth asserting in test builds.
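One simple ladder that meets those three requirements keeps a sorted list of prices next to a price-to-size map. This is a sketch with illustrative names; production handlers often prefer contiguous arrays indexed by tick, which this does not attempt.

```python
import bisect


class Ladder:
    """One side of the book: sorted unique prices plus aggregate size per price."""

    def __init__(self, is_bid: bool) -> None:
        self.is_bid = is_bid
        self.prices = []  # ascending, unique
        self.sizes = {}   # price -> aggregate resting size

    def add(self, price: int, size: int) -> None:
        if price not in self.sizes:
            bisect.insort(self.prices, price)  # create the level in sorted position
            self.sizes[price] = 0
        self.sizes[price] += size

    def remove(self, price: int, size: int) -> None:
        remaining = self.sizes[price] - size
        if remaining < 0:
            raise ValueError("level size would go negative")
        if remaining == 0:
            del self.sizes[price]
            self.prices.pop(bisect.bisect_left(self.prices, price))
        else:
            self.sizes[price] = remaining

    def best(self):
        """Best bid is the highest price; best ask is the lowest."""
        if not self.prices:
            return None
        price = self.prices[-1] if self.is_bid else self.prices[0]
        return price, self.sizes[price]


bids = Ladder(is_bid=True)
bids.add(100, 10)
bids.add(101, 5)
bids.add(99, 7)
top_before = bids.best()
bids.remove(101, 5)  # the top level empties and is dropped
top_after = bids.best()
```

Best-price reads are constant time; inserting or removing a level costs a binary search plus a list shift.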

Performance and Correctness

A feed handler runs hot, processing millions of messages in a busy session, so the engineering leans heavily on doing less work per message and avoiding anything that stalls the CPU. The effects described below are typical or illustrative directions, not benchmarks; actual figures depend entirely on hardware, venue, and load.

Zero-copy parsing is the first lever. Because ITCH messages are fixed-layout, the decoder can interpret bytes in the receive buffer in place — overlaying a struct view rather than copying fields into a new object. This avoids per-message allocation, which is typically the largest source of avoidable latency and jitter. Allocation on the hot path is the enemy; steady-state, a good handler allocates nothing.
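The in-place idea can be illustrated with a view object that reads fields straight out of a reused receive buffer. The offsets (11, 20, 32) follow this sketch's reading of the Add Order layout and should be verified against the specification; `AddOrderView` is an illustrative name. Python still creates integer objects on each read, so this shows the shape of the technique rather than its performance.

```python
import struct

_U64 = struct.Struct(">Q")
_U32 = struct.Struct(">I")

# Assumed byte offsets within an Add Order message.
REF_OFFSET, SHARES_OFFSET, PRICE_OFFSET = 11, 20, 32
ADD_ORDER_LEN = 36


class AddOrderView:
    """Reads fields in place from the receive buffer; copies no message bytes."""

    __slots__ = ("_buf", "_base")

    def __init__(self, buf, base: int = 0) -> None:
        if len(buf) - base < ADD_ORDER_LEN:
            raise ValueError("buffer too short for an Add Order message")
        self._buf = buf
        self._base = base

    @property
    def order_ref(self) -> int:
        return _U64.unpack_from(self._buf, self._base + REF_OFFSET)[0]

    @property
    def shares(self) -> int:
        return _U32.unpack_from(self._buf, self._base + SHARES_OFFSET)[0]

    @property
    def price_raw(self) -> int:
        return _U32.unpack_from(self._buf, self._base + PRICE_OFFSET)[0]


rx = bytearray(64)  # preallocated buffer, reused across packets
struct.pack_into(
    ">cHH6sQcI8sI", rx, 0,
    b"A", 1, 0, bytes(6), 42, b"B", 100, b"ACME    ", 1_234_500,
)
view = AddOrderView(memoryview(rx))
first_read = (view.order_ref, view.shares, view.price_raw)

_U32.pack_into(rx, SHARES_OFFSET, 75)  # simulate new bytes landing in the buffer
second_shares = view.shares
```

The second read returns the new value because the view never copied the message; it is only an offset into the same buffer.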

Cache-friendly structures matter because random memory access is far slower than sequential. Price-level ladders held in contiguous arrays, order maps sized to avoid rehashing, and hot fields packed together all reduce cache misses. The general direction is that a book update touching memory already in cache is dramatically cheaper than one that misses to main memory.

Busy-polling trades CPU for latency determinism. Instead of blocking on a socket and paying wake-up cost, a latency-sensitive handler spins on a dedicated, pinned core, continuously polling for new packets. This burns a core but removes scheduler-induced jitter — the variance that hurts more than raw average latency.

For the lowest-latency tier, kernel-bypass networking moves packet handling out of the OS stack entirely. Frameworks like DPDK and vendor stacks such as Solarflare’s Onload deliver packets directly to user space, skipping kernel copies and context switches. The typical effect is lower and more deterministic latency, which for a feed handler is the real prize: predictable tail behavior beats a good average. Determinism, meaning same input, same output, and same timing envelope, is what makes the system both fast and testable.

Branch behaviour is the next lever once allocation and copies are gone. The decode path is a switch on a message type, and an unpredictable branch there can stall the pipeline. Practitioners often order the switch by message frequency, since Add, Execute, Cancel, and Delete dominate the volume and the administrative types are rare. Some go further and use jump tables keyed directly on the type byte so the dispatch is a single indirect call rather than a chain of comparisons. The general principle is that the most common message must take the shortest path through the code: when the overwhelming majority of messages are book updates, the cost of the rare path barely registers in overall latency.
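A jump table keyed on the type byte can be sketched as a 256-entry list of handlers. The type codes used here (`A`, `E`, `X`, `D`, `U`) are this sketch's reading of the ITCH 5.0 specification, and the counting handlers are stand-ins for real book-build calls.

```python
counts = {}


def counting_handler(name):
    """Build a stand-in handler that only counts the messages it receives."""
    def handler(msg) -> None:
        counts[name] = counts.get(name, 0) + 1
    return handler


# Every possible type byte gets an entry, so dispatch never needs a range check.
TABLE = [counting_handler("other")] * 256
for code, name in (
    (b"A", "add"),
    (b"E", "execute"),
    (b"X", "cancel"),
    (b"D", "delete"),
    (b"U", "replace"),
):
    TABLE[code[0]] = counting_handler(name)


def dispatch(msg) -> None:
    """One index and one indirect call, regardless of message type."""
    TABLE[msg[0]](msg)


for message in (b"A...", b"A...", b"E...", b"D...", b"S..."):
    dispatch(message)
```

Administrative types such as the system event fall through to the shared default entry instead of lengthening a comparison chain.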

A final correctness-flavoured performance note: determinism is not just a latency goal, it is a testability goal. If the same captured session always replays to the same final book, byte for byte, then you can regression-test the entire handler by diffing reconstructed books against a golden output. That property is worth protecting jealously. Anything non-deterministic on the hot path — a hash map with randomized iteration order that leaks into output, a timestamp read where a sequence number belongs, an uninitialized field — breaks replay testing, and a feed handler you cannot replay-test is a feed handler you cannot trust under change.
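A replay test needs a canonical fingerprint of the final book. The sketch below hashes price levels in sorted order so that dictionary insertion order cannot leak into the result. The `book_digest` name and the `side:price:size` encoding are assumptions of this example.

```python
import hashlib


def book_digest(levels) -> str:
    """Hash side -> price -> size in a fixed order, independent of insertion order."""
    h = hashlib.sha256()
    for side in ("B", "S"):
        for price in sorted(levels[side]):
            h.update(f"{side}:{price}:{levels[side][price]}\n".encode("ascii"))
    return h.hexdigest()


# Two replays of the same session that happened to create levels in different orders.
run_one = {"B": {1_000_000: 60, 999_900: 25}, "S": {1_000_400: 150}}
run_two = {"B": {999_900: 25, 1_000_000: 60}, "S": {1_000_400: 150}}

# A replay that diverged by a single share.
diverged = {"B": {1_000_000: 61, 999_900: 25}, "S": {1_000_400: 150}}

same = book_digest(run_one) == book_digest(run_two)
different = book_digest(run_one) != book_digest(diverged)
```

A regression suite can then store one digest per captured session and fail on any mismatch.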

Trade-offs, Gotchas, and What Goes Wrong

The failure modes of a feed handler are specific and worth naming, because most production incidents are one of a short list.

Figure 5 — Each failure mode maps to a defined recovery action. The dangerous case is applying past an un-recovered gap.

Gaps and drops are the headline risk. A lost sequence that is not recovered means every subsequent delta is applied to a wrong baseline, silently corrupting the book. The defense is strict sequence checking plus A/B arbitration plus recovery — and refusing to apply anything past a detected gap until it is filled.

Out-of-order and duplicate packets are normal on UDP and harmless if handled: the sequence cursor discards duplicates and stragglers and only advances on the expected next number. The bug is forgetting to dedupe across the A and B lines and double-applying an event.

Snapshot synchronization is the off-by-one minefield described earlier. Replaying one incremental too many, or one too few, against a snapshot yields a book that is wrong by exactly one event — hard to spot and easy to ship.

Clock and timestamping issues arrive when you assume the exchange timestamp, the capture timestamp, and your local clock are interchangeable. They are not. Latency measurement and event ordering across venues depend on knowing which clock you are reading.

Normalization across venues, symbology above all, bites multi-exchange systems. Different venues use different reference-number semantics, tick rules, and symbols, so a handler that normalizes into one internal model must do so without losing venue-specific truth.

Practical Recommendations

Treat the feed handler as a correctness system first and a performance system second. A fast handler that occasionally ships a corrupt book is worse than a slightly slower one that never does. Build the sequence-integrity machinery before you optimize the hot path, and make book corruption loud, not silent.

Assert the size-conservation invariant in test and staging builds: per-order sizes at a level must sum to the level aggregate.
Mark books not-trustworthy during recovery and have downstream consumers honor that flag.
Implement both retransmission and snapshot rebuild, choosing by gap size and cold-start state.
Dedupe across A/B lines by sequence number, never by content.
Decode zero-copy and allocate nothing on the steady-state hot path.
Pin and isolate the receive core if you busy-poll; measure tail latency, not just the average.
Record three timestamps per message — exchange, capture, local — and never conflate them.
Replay-test against captured sessions so reconstruction is deterministic and regression-checked.

FAQ

What is the difference between ITCH and FIX?
ITCH is a binary, sequenced market-data feed delivered over UDP multicast, carrying full-depth order-book events for a venue. FIX is a session-oriented, tag-value protocol used mainly for order routing. ITCH is faster to decode but venue-specific; FIX is portable and human-readable but verbose. A feed handler consumes ITCH-style feeds; FIX is more common on the order-entry path.

Why is market data sent over UDP instead of TCP?
UDP multicast lets the exchange broadcast one stream to many subscribers efficiently, without per-consumer connection state. The cost is that UDP can drop packets. Feeds compensate with sequence numbers, redundant A/B lines, and recovery services, which together give near-lossless delivery without TCP’s per-receiver overhead and head-of-line blocking.

What are A and B lines in a market data feed?
They are two independent multicast streams carrying identical sequenced messages over separate network paths. The feed handler consumes both and keeps the first copy of each sequence number it sees. Because a packet dropped on one line is usually present on the other, arbitrating across A and B turns two lossy feeds into one near-lossless stream; recovery covers the rare sequence that both lines miss.

How does a feed handler recover from a dropped message?
First it detects the gap via a sequence-number jump and stops applying further updates to the affected book. Then it either requests a retransmission of the missing range or waits for a periodic snapshot, rebuilds the book from that baseline, and replays buffered incrementals from the correct sequence forward before marking the book trustworthy again.

What is order book reconstruction?
It is the process of maintaining an in-memory order book by applying every incremental message — add, execute, cancel, delete, replace — in sequence to a known-good baseline. The baseline comes either from a start-of-day empty book or from a snapshot. Correctness depends on missing no messages and applying them in exact order.

Is a feed handler the same as a matching engine?
No. A matching engine lives inside the exchange and decides which orders trade. A feed handler lives at the subscriber and reconstructs a read-only view of the book the exchange publishes. The feed handler never matches or executes; it only rebuilds and exposes state.
