Order-flow information

Reviewed 4 June 2026. As of 2026: a real edge still exists for those who can run it well.

Quote around a better fair value inferred from order flow, and manage adverse selection by reading toxicity. This is where the modern market-making edge actually sits: in the microprice, OFI and VPIN.

The idea

[Figure DG-FLOWINFO, annotated diagram: The quote is a belief. Order flow walks the bid and ask to value.]

What this shows. The maker holds a belief about value and updates it on every order: the ask is the value given the next trade is a buy, the bid the value given a sell, and a buyer is on average slightly informed, so the ask sits above the prior and the bid below. Three buys in a row are evidence the asset is worth more, so the posterior priced into the ask climbs from 0.65 to about 0.92 and both quotes ratchet toward the hidden 101. That convergence is price discovery: information held by a few traders is impounded into the public price through their orders.

What is information-based market making?

Information-based market making is liquidity provision under the assumption that order flow carries information. Instead of quoting symmetrically around a static mid, the maker treats each incoming order as evidence about where value is heading, updates its fair-value estimate, and sets a spread that compensates for the risk that the counterparty knows more than it does.

If you stand ready to buy and sell, the people who choose to trade with you are not a random sample. The trader who lifts your offer just before good news is, on average, better informed than the one rebalancing a pension fund. You cannot tell which is which at fill time, only afterwards, when the price moves. So you must price as if some fraction of your fills are toxic, because they are.

This is the second of the two great market-making lenses. Market Making I (the companion guide and Avellaneda–Stoikov) is the inventory story: you quote a spread, manage the position you accumulate, and survive price risk on your book. Market Making II, this family, is the information story: you quote a spread, manage the risk that the flow knows more than you, and survive adverse selection. Both are always present; a real desk runs both models at once.

The defining move is Bayesian: the maker holds a belief about fair value and updates it on every observed order. A buy nudges the belief up, a sell nudges it down. The bid and ask are not fixed offsets from a mid; they are conditional expectations of value given that the next order is a sell, or a buy: exactly the Glosten–Milgrom construction below. The diagram above annotates it: the quote is a belief, updated by the flow.

Why do spreads exist? (Glosten–Milgrom 1985)

Spreads exist because some traders are informed. If a market maker quoted a single price, the informed would always trade against it and the maker would lose on every informed fill. The bid-ask spread is the maker's protection: it sets the ask at the expected value conditional on a buy and the bid at the expected value conditional on a sell, which places the ask above and the bid below the unconditional expectation, so the gain from uninformed flow funds the loss to informed flow.

One line before the model: the ask is what the asset is worth given that someone wants to buy it from you; the bid is what it is worth given that someone wants to sell it to you, and a buyer is, on average, slightly informed. Glosten–Milgrom (1985) is the canonical sequential-trade model. Traders arrive one at a time; a fraction are informed (they know the asset's true value $v$, high or low) and the rest are uninformed (they buy or sell for liquidity reasons, at random). The competitive, risk-neutral, zero-profit maker sets the ask and bid to the two posterior means that flank the prior.

The ask is the value given the next order is a buy; the bid is the value given a sell. Because a buy is more likely to be informed, the ask sits above the prior and the bid below, and that gap is the spread, with no inventory or fee component at all.

a = \mathbb{E}\!\left[v \mid \text{buy}\right], \qquad b = \mathbb{E}\!\left[v \mid \text{sell}\right], \qquad a > \mathbb{E}[v] > b

That is the deep result: a spread exists even when holding the asset is free. The spread widens with the informed fraction (more toxic flow, bigger protective spread) and with the size of the value innovation (bigger possible surprise, more to lose); it narrows as the maker becomes more confident about value. This is the bridge to PIN: the informed fraction is PIN, the probability of informed trading. Crucially, the quotes update toward the true value as flow arrives: a run of buys drags both bid and ask up, because each buy is weak evidence the asset is worth more. This is price discovery: information held by a few traders is impounded into the public price through their orders, and the market maker is the mechanism that does the impounding.

The Glosten–Milgrom update (optional)

Let value $v \in \{v_H, v_L\}$ with prior $P(v_H) = \theta$. A fraction $\mu$ of traders are informed: they buy iff $v = v_H$ and sell iff $v = v_L$; the uninformed buy or sell with probability $\tfrac12$ each. The conditional buy probabilities are then:

P(\text{buy} \mid v_H) = \mu + \tfrac{1-\mu}{2}, \qquad P(\text{buy} \mid v_L) = \tfrac{1-\mu}{2}

The ask is the posterior mean of $v$ given a buy, with the posterior on $v_H$ from Bayes' rule:

a = \mathbb{E}\!\left[v \mid \text{buy}\right] = v_H\,P(v_H \mid \text{buy}) + v_L\,P(v_L \mid \text{buy}), \qquad P(v_H \mid \text{buy}) = \frac{\theta\,P(\text{buy}\mid v_H)}{P(\text{buy})}

Symmetrically the bid is the posterior mean given a sell. The half-spread is increasing in the informed fraction $\mu$ (i.e. PIN) and in the value gap $(v_H - v_L)$. After each trade the prior $\theta$ is replaced by the posterior. Under the maker's own information the resulting transaction prices are a martingale (each quote is already the best forecast of the next), while conditional on the realised $v$ the informed orders tilt the flow so that quotes drift toward it: the formal statement that prices converge on value as the flow reveals it.
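The update can be sketched in a few lines of Python. This is a minimal illustration of the two-point model above, not a production quoting routine; the names `Quotes` and `gm_quotes` are hypothetical, and the function assumes every order is either a buy or a sell, so sell probabilities are the complements of the buy probabilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quotes:
    ask: float
    bid: float
    post_buy: float   # P(v_H | buy)
    post_sell: float  # P(v_H | sell)


def gm_quotes(theta: float, mu: float, v_h: float, v_l: float) -> Quotes:
    """Zero-profit Glosten-Milgrom quotes for a two-point value (sketch)."""
    # Conditional probability of a buy in each state.
    p_buy_h = mu + (1.0 - mu) / 2.0
    p_buy_l = (1.0 - mu) / 2.0
    # Assumption: every order is a buy or a sell, so sells are complements.
    p_sell_h = 1.0 - p_buy_h
    p_sell_l = 1.0 - p_buy_l

    # Bayes' rule for the posterior on v_H after each order direction.
    post_buy = theta * p_buy_h / (theta * p_buy_h + (1.0 - theta) * p_buy_l)
    post_sell = theta * p_sell_h / (theta * p_sell_h + (1.0 - theta) * p_sell_l)

    # Quotes are the posterior means of v.
    ask = v_h * post_buy + v_l * (1.0 - post_buy)
    bid = v_h * post_sell + v_l * (1.0 - post_sell)
    return Quotes(ask, bid, post_buy, post_sell)


if __name__ == "__main__":
    q = gm_quotes(theta=0.5, mu=0.30, v_h=101.0, v_l=99.0)
    print(f"ask={q.ask:.2f} bid={q.bid:.2f} spread={q.ask - q.bid:.2f}")
```

Feeding the returned `post_buy` or `post_sell` back in as the next `theta` reproduces the sequential update described above.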

How order flow moves the price (Kyle 1985)

Kyle's model gives the quantitative version of the same idea: price moves linearly in net order flow. A single informed trader (the "insider") spreads their trades to hide among noise; the market maker, unable to separate them, moves the price by $\lambda$ per unit of signed flow. That $\lambda$, Kyle's lambda, is the price of information and the inverse of market depth.

One line: the more the price jumps per unit of net buying, the easier it is to read information off the flow, and the thinner, more dangerous, the market is.

The price update is lambda times the net signed order flow. Lambda is small when the book is deep and informed flow is camouflaged by noise, large when the market is thin; its inverse is a clean definition of market depth.

\Delta p \;=\; \lambda \cdot (\text{net signed order flow}), \qquad \text{depth} = \frac{1}{\lambda}

Where Glosten–Milgrom is sequential and discrete (one trade at a time, a bid-ask spread), Kyle is batched and continuous (a single auction, a linear price-impact coefficient). They are two faces of one phenomenon: informed order flow is impounded into price, and the rate at which it is impounded (the spread in GM, the $\lambda$ in Kyle) is the market maker's defence and the informed trader's cost. This $\lambda$ is the same object that reappears across the atlas as the permanent, informational component of market impact: the part of your own footprint that sticks because the market reads your trading as information. Market making and execution are the two sides of Kyle's coin: the maker charges $\lambda$ to cover the informed flow that would otherwise pick it off; the executor pays $\lambda$ as permanent impact.

Kyle's λ (optional)

In the one-period Kyle (1985) model, an informed trader knows $v \sim N(p_0, \Sigma_0)$ and submits demand $x = \beta(v - p_0)$; noise traders submit $u \sim N(0, \sigma_u^{2})$; the market maker sees only the total flow $y = x + u$ and, competitively, sets a price linear in it:

p = p_0 + \lambda\,y, \qquad y = x + u

Solving the fixed point (the maker's pricing rule must be consistent with the informed trader's optimal $\beta$) gives:

\lambda = \frac{1}{2}\sqrt{\frac{\Sigma_0}{\sigma_u^{2}}}, \qquad \beta = \sqrt{\frac{\sigma_u^{2}}{\Sigma_0}}

Depth $1/\lambda$ rises with noise-trader volume $\sigma_u$ (more cover for the informed) and falls with value uncertainty $\Sigma_0$. The informed trader chooses $\beta$ to maximise expected profit $\mathbb{E}[x(v - p)]$, trading off aggression against the price they move; this trade-off is the foundation of optimal-execution-style camouflage.
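The fixed point can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the two consistency checks use the standard one-period reasoning (the maker's slope is the regression coefficient $\text{Cov}(v, y)/\text{Var}(y)$, and the informed trader's best response to a linear rule is $\beta = 1/(2\lambda)$).

```python
import math


def kyle_equilibrium(sigma0_sq: float, sigma_u: float) -> tuple[float, float]:
    """Return (lambda, beta) for the one-period Kyle model."""
    lam = 0.5 * math.sqrt(sigma0_sq) / sigma_u
    beta = sigma_u / math.sqrt(sigma0_sq)
    return lam, beta


def projection_slope(beta: float, sigma0_sq: float, sigma_u: float) -> float:
    """Maker's side: slope of E[v - p0 | y] in y when x = beta * (v - p0).

    Cov(v, y) = beta * Sigma0 and Var(y) = beta^2 * Sigma0 + sigma_u^2.
    """
    return beta * sigma0_sq / (beta**2 * sigma0_sq + sigma_u**2)


def best_response_beta(lam: float) -> float:
    """Informed side: maximising x * (v - p0) - lam * x^2 gives x = (v - p0) / (2 lam)."""
    return 1.0 / (2.0 * lam)


if __name__ == "__main__":
    # Illustrative inputs only: value variance 4.0, noise-flow std dev 100.
    lam, beta = kyle_equilibrium(sigma0_sq=4.0, sigma_u=100.0)
    print(f"lambda={lam:.4f} beta={beta:.1f} depth={1.0 / lam:.1f}")
    print("maker consistent:", math.isclose(projection_slope(beta, 4.0, 100.0), lam))
    print("insider consistent:", math.isclose(best_response_beta(lam), beta))
```

Both checks holding at once is what "solving the fixed point" means: neither side wants to change its rule given the other's.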

The modern edge: better fair value, better toxicity

Because adverse selection is permanent and unavoidable, the durable edge in 2026 is estimation. A maker that fair-values more accurately (the microprice) and reads toxicity faster (order-flow imbalance, VPIN-style features) is picked off less and can quote tighter than rivals, winning more benign flow while dodging more toxic flow. Modelling the information, not eliminating it, is the game.

The family maps to four concept pages, each a sharper tool for the same job.

Adverse selection is the risk itself: the Glosten–Milgrom mechanics, how informed flow sets your minimum survivable spread, and the difference between toxic and benign flow (hosts IX-ADVSEL).

Order-flow imbalance (OFI) is the cleanest short-horizon signal: net pressure at the top of the book predicts the next move roughly linearly (Cont–Kukanov–Stoikov 2014), the empirical Kyle's $\lambda$ you can actually compute (hosts IX-OFI).

PIN & VPIN is toxicity measurement: the probability of informed trading (Easley–O'Hara) and its volume-clocked descendant VPIN (Easley–López de Prado–O'Hara 2012), with an honest account of how contested VPIN's predictiveness is (hosts IX-VPIN).

The microprice is the flagship fair-value estimator: Stoikov's imbalance-adjusted fair value, better than mid or weighted-mid, and the price a modern maker actually quotes around (hosts IX-MICROPRICE).
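To make the two simplest objects concrete, here is a hedged sketch of a top-of-book OFI increment and the weighted-mid baseline. The `TopOfBook` record and the function names are hypothetical; the increment follows the best-quote event definition as it is usually stated for Cont–Kukanov–Stoikov, and `weighted_mid` is only the simpler baseline the text compares against, not Stoikov's microprice, which needs a fitted imbalance adjustment.

```python
from typing import NamedTuple, Sequence


class TopOfBook(NamedTuple):
    bid: float
    bid_size: float
    ask: float
    ask_size: float


def ofi_increment(prev: TopOfBook, cur: TopOfBook) -> float:
    """Signed contribution of one best-quote update to order-flow imbalance."""
    e = 0.0
    # Bid side: a higher bid, or more size at an unchanged bid, is buying pressure.
    if cur.bid >= prev.bid:
        e += cur.bid_size
    if cur.bid <= prev.bid:
        e -= prev.bid_size
    # Ask side: a lower ask, or more size at an unchanged ask, is selling pressure.
    if cur.ask <= prev.ask:
        e -= cur.ask_size
    if cur.ask >= prev.ask:
        e += prev.ask_size
    return e


def ofi(snapshots: Sequence[TopOfBook]) -> float:
    """Sum of increments over consecutive snapshots."""
    return sum(ofi_increment(a, b) for a, b in zip(snapshots, snapshots[1:]))


def weighted_mid(book: TopOfBook) -> float:
    """Imbalance-weighted mid: leans toward the ask when the bid queue is larger."""
    imbalance = book.bid_size / (book.bid_size + book.ask_size)
    return imbalance * book.ask + (1.0 - imbalance) * book.bid


if __name__ == "__main__":
    # Synthetic snapshots for illustration only.
    books = [
        TopOfBook(100.0, 10, 101.0, 10),
        TopOfBook(100.0, 15, 101.0, 10),  # size added at the bid
        TopOfBook(100.5, 5, 101.0, 10),   # bid improves
        TopOfBook(100.5, 5, 101.0, 4),    # size removed at the ask
    ]
    print("OFI:", ofi(books))
    print("weighted mid:", weighted_mid(TopOfBook(100.0, 30, 101.0, 10)))
```

Regressing short-horizon price changes on this OFI series is one way to obtain an empirical impact coefficient per instrument; the slope is data-dependent and should be re-estimated rather than assumed.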

The honest 2026 line: the crude version of each of these is partly commoditised. Anyone can compute OFI from a feed; VPIN is publishable; the microprice formula is in a paper. The surviving edge is in the quality of the estimate (clean data, low latency, a well-fitted imbalance adjustment, conditioning on regime) and in the venues where the field is younger (crypto, prediction markets). What AI changes: richer features (book state, cross-asset flow, machine-readable news) feed a learned fair value and toxicity model that beats a flat textbook coefficient (see what AI changes).

Inventory vs information: how the two stories combine

A real market maker faces both risks at once. Inventory risk (Market Making I) is price risk on the position you accumulate; information risk (Market Making II) is the risk that the flow building that position is toxic. The first is solved by skewing quotes around a reservation price; the second by widening and re-centring around a better fair value. Both adjust the same two numbers: where you quote, and how wide.

The Avellaneda–Stoikov reservation price (A–S) leans your quotes away from inventory; the microprice leans your centre toward where the flow says value is. In a production maker your quote centre is something like microprice, then skewed for inventory, and your half-spread is the A–S inventory term plus an adverse-selection term: the two pillars literally add up in the quoting equation.

Quote centre = a better fair value, leaned for inventory; half-spread = the inventory-risk charge plus the adverse-selection charge. The two pillars add up.

\text{centre} = \underbrace{\text{microprice}}_{\text{MM II}} - \underbrace{q\,\gamma\,\sigma^{2}(T-t)}_{\text{MM I skew}}, \qquad \delta = \underbrace{\tfrac12\gamma\sigma^{2}(T-t)}_{\text{inventory}} + \underbrace{\text{AdvSel}}_{\text{information}}
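As a minimal sketch, the quoting equation above translates directly into code. The names `Quote` and `combined_quote` are hypothetical, the inputs are placeholders rather than calibrated values, and the adverse-selection charge is passed in as a number because the equation leaves its estimator open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    centre: float
    half_spread: float

    @property
    def bid(self) -> float:
        return self.centre - self.half_spread

    @property
    def ask(self) -> float:
        return self.centre + self.half_spread


def combined_quote(
    microprice: float,
    inventory: float,   # q: signed position, positive when long
    gamma: float,       # risk aversion
    sigma: float,       # volatility over the horizon's time unit
    horizon: float,     # T - t
    adv_sel: float,     # adverse-selection charge, estimated elsewhere
) -> Quote:
    """Centre = microprice minus inventory skew; half-spread = inventory + information."""
    risk = gamma * sigma**2 * horizon
    centre = microprice - inventory * risk
    half_spread = 0.5 * risk + adv_sel
    return Quote(centre, half_spread)


if __name__ == "__main__":
    # Illustrative numbers only: long 3 units, so the centre leans down.
    q = combined_quote(microprice=100.02, inventory=3.0, gamma=0.1,
                       sigma=0.2, horizon=1.0, adv_sel=0.03)
    print(f"centre={q.centre:.3f} bid={q.bid:.3f} ask={q.ask:.3f}")
```

A long position lowers the centre, which makes the ask more likely to be lifted and the bid less likely to be hit, and the adverse-selection term widens both sides equally.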

The spread you finally quote has to clear three things at once: the inventory cost of holding the position, the adverse-selection cost from this family, and your transaction costs and fees. Drop any one and you are mispricing risk. The page where these meet most explicitly is spread vs adverse selection in the Market Making I guides.

Worked example

A schematic information-based quote on a synthetic instrument, illustrative as of 2026. An asset is worth either $v_H = 101$ or $v_L = 99$, equally likely, so the unconditional value is $\mathbb{E}[v] = 100$. A fraction $\mu = 30\%$ of orders are informed (buy iff $v = 101$, sell iff $v = 99$); the other 70% are uninformed, buying or selling 50/50.

Probability a buy comes from each state. $P(\text{buy} \mid v_H) = 0.30 + 0.70\cdot 0.5 = 0.65$ and $P(\text{buy} \mid v_L) = 0.70\cdot 0.5 = 0.35$. So an observed buy is nearly twice as likely under the high value as under the low (a likelihood ratio of about 1.86). By Bayes, with the equal priors of 0.5 cancelling, $P(v_H \mid \text{buy}) = 0.65 / (0.65 + 0.35) = 0.65$.

The ask and bid are the two posterior means; their gap is a 0.60 spread that exists with no inventory and no fees, purely to recoup adverse selection.

a = 0.65\cdot 101 + 0.35\cdot 99 = 100.30, \quad b = 99.70, \quad a - b = 0.60

Raise the informed fraction to $\mu = 60\%$ and the same arithmetic gives a spread of 1.20: the spread doubles because the flow got twice as toxic. This is the relationship IX-ADVSEL lets the reader drive on the adverse-selection page.
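A short derivation, using the notation of the optional Glosten–Milgrom update and assuming the symmetric prior $\theta = \tfrac12$ of this example, shows why the spread scales one-for-one with the informed fraction. The prior cancels in Bayes' rule, leaving $P(v_H \mid \text{buy}) = \tfrac{1+\mu}{2}$ and $P(v_H \mid \text{sell}) = \tfrac{1-\mu}{2}$, so:

a - b = (v_H - v_L)\left[\tfrac{1+\mu}{2} - \tfrac{1-\mu}{2}\right] = \mu\,(v_H - v_L)

With $v_H - v_L = 2$ this gives 0.60 at $\mu = 0.30$ and 1.20 at $\mu = 0.60$. The exact proportionality is specific to the symmetric prior; for $\theta \neq \tfrac12$ the spread is still increasing in $\mu$ but no longer linear in it.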

Price discovery in action. Suppose the true value is $v_H = 101$ and you observe three buys in a row. Each buy is evidence for $v_H$, so your posterior on $v_H$ climbs and both quotes ratchet upward toward 101.

Three buys in a row drag the posterior priced into the ask (the probability of the high value given one more buy) from 0.65 to about 0.92; after the third buy the ask is near 100.84 and the bid near 100.55, and the market has largely discovered the value from the flow.

P(v_H \mid \text{buys}): \; 0.65 \to 0.78 \to 0.86 \to 0.92

After the third buy your ask is about 100.84 and your bid about 100.55 (a sell would simply undo one buy's worth of evidence), so the informed trader's edge per trade has shrunk from 0.70 to about 0.16. That convergence is the whole social function of a market maker.
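The sequential update can be traced in a few lines of Python. This is a minimal sketch under the worked example's assumptions (two-point value, informed fraction 30%, prior 0.5); the function names are hypothetical. It applies Bayes' rule exactly at each step, so figures rounded by hand may differ from its output in the second decimal place.

```python
def likelihoods(mu: float) -> tuple[float, float]:
    """Return (P(buy | v_H), P(buy | v_L)) for informed fraction mu."""
    return mu + (1.0 - mu) / 2.0, (1.0 - mu) / 2.0


def posterior(theta: float, mu: float, is_buy: bool) -> float:
    """Posterior P(v_H) after one order, starting from prior theta."""
    p_h, p_l = likelihoods(mu)
    if not is_buy:
        # Assumption: every order is a buy or a sell, so sells are complements.
        p_h, p_l = 1.0 - p_h, 1.0 - p_l
    return theta * p_h / (theta * p_h + (1.0 - theta) * p_l)


def quote_path(n_buys: int, theta0: float, mu: float, v_h: float, v_l: float):
    """Rows of (belief, bid, ask): before any trade, then after each buy."""
    rows = []
    theta = theta0
    for _ in range(n_buys + 1):
        # Quotes are posterior means given the direction of the next order.
        ask = v_l + posterior(theta, mu, True) * (v_h - v_l)
        bid = v_l + posterior(theta, mu, False) * (v_h - v_l)
        rows.append((theta, bid, ask))
        theta = posterior(theta, mu, True)  # the next order is another buy
    return rows


if __name__ == "__main__":
    for belief, bid, ask in quote_path(3, 0.5, 0.30, 101.0, 99.0):
        print(f"belief={belief:.3f} bid={bid:.2f} ask={ask:.2f}")
```

Each row shows the belief the maker holds before the next order and the quotes it posts on that belief, so the last row is the quote standing after the third buy.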

The desk's P&L decomposition. Over many such trades the maker collects the half-spread on every fill (+) and pays the value-move on every informed fill (−). At the competitive spread these net to roughly zero (the zero-profit condition); a real maker quotes a touch wider than break-even and keeps the difference. Its edge comes from how closely its estimate of the informed fraction tracks the true one: estimate toxicity better than the field and you can quote tighter, win more benign flow, and still cover the toxic flow you do take.
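The decomposition can be written out as expected P&L per arriving order. This is a sketch under loud simplifications, all of them assumptions rather than claims about a real desk: a symmetric prior with quotes centred on the unconditional value, every arriving order fills whatever the spread, and P&L is marked against the true value. The function name is hypothetical.

```python
def pnl_per_order(mu: float, v_h: float, v_l: float, half_spread: float) -> dict[str, float]:
    """Expected maker P&L per arriving order, split by who is on the other side.

    mu is the TRUE informed fraction; half_spread is whatever the maker quotes.
    """
    half_gap = (v_h - v_l) / 2.0
    # Uninformed orders are uncorrelated with value: the maker keeps the half-spread.
    uninformed = (1.0 - mu) * half_spread
    # Informed orders are always on the right side: the maker loses the value
    # move, less the half-spread it charged.
    informed = -mu * (half_gap - half_spread)
    return {"uninformed": uninformed, "informed": informed, "net": uninformed + informed}


if __name__ == "__main__":
    cases = {
        "break-even (h = 0.30)": 0.30,
        "a touch wider (h = 0.35)": 0.35,
        "toxicity underestimated (h = 0.25)": 0.25,
    }
    for label, h in cases.items():
        pnl = pnl_per_order(mu=0.30, v_h=101.0, v_l=99.0, half_spread=h)
        print(f"{label}: uninformed={pnl['uninformed']:+.3f} "
              f"informed={pnl['informed']:+.3f} net={pnl['net']:+.3f}")
```

Under these assumptions the net reduces to the quoted half-spread minus the true informed fraction times half the value gap, so a maker that believes the flow is 25% informed when it is really 30% informed quotes too tight and loses on average. The sketch ignores that wider quotes attract fewer fills, which is what disciplines the spread in practice.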

The numbers are synthetic and illustrative; informed fractions, value gaps and spreads must be estimated per instrument and dated (as of 2026). The live widgets (IX-ADVSEL, IX-OFI, IX-VPIN, IX-MICROPRICE) on the concept pages let the reader drive each of these relationships. Educational only, not investment advice; no P&L is promised.
