Anatomy of a Limit Order Book: Building Price-Time Priority in C++
A limit order book (LOB) is the central data structure of every modern electronic exchange. It is, at its heart, a sorted ledger of intentions: buyers posting the most they’ll pay (bids) and sellers posting the least they’ll accept (asks). The matching engine’s only job is to answer one question, millions of times a second, with deterministic fairness: when a new order arrives, which resting order does it trade against first?
Two sides, many price levels
The book has two sides. The bid side is sorted in descending price order — the highest bid sits at the top because it is the most aggressive buyer. The ask side is sorted ascending — the lowest ask is the most aggressive seller. The gap between the best bid and best ask is the spread, and the midpoint between them is the price most people quote as “the price.”
Each price level is not a single order. It is a queue of orders resting at that exact price, and the order of that queue is where the interesting engineering lives.
Why price-time priority
Exchanges overwhelmingly use price-time priority (also called FIFO). The rule is simple:
- Better price always wins. A bid at 100.50 fills before a bid at 100.49.
- At the same price, the order that arrived first fills first.
That second rule is what makes a LOB fair and also what makes it hard. You cannot just keep a set of orders per price — you need a queue, and you need to be able to cancel an order from the middle of that queue in O(1) when a trader pulls their quote.
Data structure choices
The naive approach is std::map<Price, std::list<Order>>. The map keeps price levels sorted; the list gives you FIFO per level. It works, and it’s a reasonable first implementation. But std::map is a red-black tree: every best-bid/best-ask lookup chases pointers across cache lines, and tree rebalancing on insert is unpredictable.
For latency-sensitive matching, the better design is a flat array of price levels indexed by ticks away from a reference price. Price 100.50 with a 0.01 tick becomes index (10050). Best bid/ask lookups become an array scan from a cached pointer — branch-predictable, cache-friendly, and O(1) amortised. Each level holds an intrusive linked list of orders so cancels are O(1), and a separate hash map from orderId → node* lets you find any order instantly.
The matching algorithm
When a marketable limit buy arrives, the engine walks the ask side from the best price up, filling against each resting order in queue order, decrementing volume, removing fully-filled orders, and emitting trade events — until either the incoming order is exhausted or the price is no longer marketable. Whatever remains rests on the book as a new bid.
Why cache locality is the whole game
At HFT timescales, a single L2 cache miss (~10ns) can dwarf the actual matching logic. This is why the array-of-levels design beats the tree: the hot path — read best level, match, update — touches contiguous memory. Zero allocation on the critical path, intrusive lists instead of std::list’s node allocations, and you keep the engine single-threaded so there are no locks to contend. Determinism is a feature: the same input sequence must always produce the same trades, which is also what makes the engine testable.
The LOB simulation on the homepage of this site is a toy reflection of exactly this structure — six levels a side, a walking mid-price, and depth bars proportional to resting volume.