Systems & building

Colocation & FPGA

≈commoditised

Reviewed 4 June 2026. As of 2026: widely known and implemented; the edge is in execution, not the idea.

Rent a rack next to the matching engine and put your logic in silicon. The latency budget (every nanosecond from wire to order) decides whether you win the race. In 2026 this is a capital-intensive arms race won by a few.

See it move

[Interactive: latency budget builder (toggle the stack). Example stack: wire / network 3 µs, NIC + kernel 2 µs, feed handler 4 µs, strategy logic 6 µs, order out 3 µs. Total tick-to-trade: 18.0 µs (competitive tier: mid-tier). Toggles: colocation, FPGA.]

What to notice. On a retail path the wire alone is hundreds of microseconds, so you've lost the race before your logic runs. Colocation collapses the network term; an FPGA collapses the compute term. Each is a capital decision, and together they define who can play the speed game.

What is colocation, and what does a cross-connect actually buy you?

Colocation means renting rack space for your server inside the exchange's own datacentre, beside the matching engine. A cross-connect is the physical cable from your rack to the exchange's network. Together they delete the propagation delay of distance (the milliseconds light takes to travel to a remote location), leaving you only the in-building microseconds. It is the single biggest, simplest latency win.

The intuition is the speed of light itself. In fibre, light covers roughly a kilometre every 5 µs, so a server 100 km from the exchange carries about 0.5 ms (500 µs) of one-way delay, or about 1 ms round trip, that you simply cannot beat: light is light, and no software trick competes with deleting the distance. Move into the building and that ~0.5 ms collapses toward zero. You can feel this in the budget builder above: drag the "distance to venue" slider from cloud (hundreds of km) to colo (≈0) and watch the propagation segment dominate the bar, then vanish.

Propagation delay is distance divided by the speed of light in the medium. In fibre that is about 5 µs per kilometre one way, so distance is a fixed, unbeatable tax that colocation removes outright.

t_{\text{prop}} = \frac{d}{v}, \qquad v_{\text{fibre}} \approx 2\times10^{8}\,\text{m/s} \;\Rightarrow\; t_{\text{prop}} \approx 5\,\mu\text{s per km}
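As a quick check on the formula, here is a minimal Python sketch of the propagation term. The function name and constant are illustrative choices, and the fibre speed is the rounded figure quoted above, not a measured value for any particular cable.

```python
# Approximate speed of light in glass fibre, as quoted in the text.
V_FIBRE_M_PER_S = 2.0e8


def propagation_delay_us(distance_km: float,
                         speed_m_per_s: float = V_FIBRE_M_PER_S) -> float:
    """One-way propagation delay in microseconds: t = d / v."""
    distance_m = distance_km * 1_000.0
    return distance_m / speed_m_per_s * 1e6


# One kilometre of fibre, then the 100 km example from the text (one way).
print(f"1 km:   {propagation_delay_us(1.0):.0f} us")    # 1 km:   5 us
print(f"100 km: {propagation_delay_us(100.0):.0f} us")  # 100 km: 500 us
```

Passing a different `speed_m_per_s` (for example about 3.0e8 for air) gives the same calculation for another medium.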

Exchanges sell colocation as a regulated, equal-access product, and often go further, equalising cable lengths so every colo customer's cross-connect is the same physical distance to the matching engine, meaning being one rack closer confers no edge. The level playing field is the product; you pay to be on it. What it costs is a recurring rack/power/cross-connect fee plus the feed and order-entry connectivity charges: material fixed cost, the kind that drives the thin-margin economics of the business. The honest framing: colocation is table stakes for any latency-sensitive strategy and a pure purchase, with no skill in it, only spend. It is the clearest example of infrastructure being the price of entry, not the edge.

Why microwave and mmWave between datacentres: the speed-of-light geography

Between two distant datacentres (say the equity venues near New York and the futures venues near Chicago) light travels about 50% faster through air than through glass fibre, and a microwave link flies a straighter, shorter great-circle path. So firms build microwave and mmWave networks to shave milliseconds off inter-venue routes: the geography of the planet becomes a latency asset.

The physics is two wins at once. Fibre is glass, and light in glass moves at roughly 200,000 km/s, against about 300,000 km/s in air; worse, fibre follows roads and rights-of-way rather than straight lines. A microwave relay chain goes nearly straight, through air, so it beats fibre on both medium and path. On the Chicago–New Jersey route the difference is on the order of two milliseconds each way: decisive for cross-venue speed.

Microwave wins on both the medium (light is faster in air than in glass) and the path (a near-straight great circle versus fibre's road-following detour). Both factors cut the propagation term.

\frac{v_{\text{air}}}{v_{\text{fibre}}} \approx \frac{3.0\times10^{8}}{2.0\times10^{8}} = 1.5, \qquad d_{\text{microwave}} \lt d_{\text{fibre}}

The trade-offs are real: microwave carries far less bandwidth than fibre and degrades in heavy rain and fog, so it is reserved for the latency-critical triggers (a small, urgent signal) while bulk data still goes by fibre; mmWave (millimetre-wave) and laser/free-space-optical links push this further on shorter hops. This is a true arms race and largely closed to newcomers: the best routes are owned, the tower rights are scarce and expensive, and the firms (and specialist network providers) that hold them defend them. You do not casually enter the microwave game; you rent capacity on someone else's network, if you can get it at all. Conceptually it is the purest illustration that HFT speed is bounded by physics, and that the remaining frontier is about buying scarce geography, not writing better code; it is what powers cross-venue latency arbitrage.

What is an FPGA, and when is it worth it over software?

An FPGA (field-programmable gate array) is a chip you wire into custom hardware logic. Putting the hot-path logic (decode, book, trigger) into an FPGA runs it in parallel silicon, taking tick-to-trade from the microseconds of software into the tens-to-hundreds of nanoseconds, "wire-to-wire". It is worth it only when the strategy genuinely needs that tier, because the development cost is brutal.

The intuition: software runs instructions one after another on a general CPU; an FPGA is the circuit, doing the work in dedicated parallel gates with no instruction-fetch overhead. For a fixed, simple, latency-critical task (recognise a pattern in the feed and fire) that is enormously faster and far more deterministic (very low jitter) than any CPU. The "software vs FPGA" toggle in the budget above makes this visible: the strategy-logic segment collapses from tens of µs to single-digit microseconds and below, while the cost meter jumps, with diminishing returns made tangible.

The development lifecycle is why it is painful. FPGA development is done in a hardware description language (Verilog/VHDL, or high-level synthesis), with slow compile and synthesis cycles, hard debugging, and scarce, expensive specialists. So the workflow is: prototype and validate the strategy in software and backtest, prove it needs the speed, then port only the tightest, most stable inner loop to the FPGA. You do not iterate research on an FPGA; you commit a finished, simple thing to it. The hybrid norm follows: most "FPGA" systems pair an FPGA that handles the wire-to-wire fast path (parse, risk-check, fire a pre-staged order) with a CPU that handles the richer, slower strategy state. The FPGA does the reflex; the software does the thinking.
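The reflex/thinking split can be modelled in plain software before anything is ported. The sketch below is a Python stand-in, not FPGA code: every name in it (`StagedOrder`, `FastPath`, `stage`, `on_tick`) is hypothetical, prices are integer ticks, and the single risk check is a simple quantity cap chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StagedOrder:
    """An order prepared in advance by the slow path (hypothetical shape)."""
    trigger_price: int  # fire once the observed price is at or above this
    side: str
    quantity: int
    limit_price: int


class FastPath:
    """Software model of the reflex: compare, risk-check, fire."""

    def __init__(self, max_quantity: int) -> None:
        self.max_quantity = max_quantity
        self.staged: Optional[StagedOrder] = None

    def stage(self, order: StagedOrder) -> None:
        # Slow path (the "thinking"): decide what to do, then pre-stage it.
        self.staged = order

    def on_tick(self, price: int) -> Optional[StagedOrder]:
        # Fast path (the "reflex"): no strategy state, only fixed checks.
        order = self.staged
        if order is None or price < order.trigger_price:
            return None
        if order.quantity > self.max_quantity:  # in-line pre-trade risk check
            return None
        self.staged = None  # one-shot: a staged order fires at most once
        return order


fast_path = FastPath(max_quantity=100)
fast_path.stage(StagedOrder(trigger_price=10_050, side="buy",
                            quantity=10, limit_price=10_055))
for tick in (10_048, 10_050, 10_060):
    print(tick, fast_path.on_tick(tick))
```

The point of the shape is that `on_tick` contains nothing that needs research iteration: all judgement lives in whatever calls `stage`, which is the part that would stay on the CPU.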

2026 status: FPGA NICs and tick-to-trade frameworks are now products (Solarflare/Xilinx-lineage cards, vendor IP) so the capability is buyable, but the integration, the venue relationships and the specialist team are not cheap, and the absolute frontier stays with firms who do it in-house. It is the speed tier's ceiling, and it is largely closed to a solo builder.

What is a tick-to-trade latency budget, and how do you build one?

A latency budget is the additive breakdown of tick-to-trade (the total time from inbound market-data packet to outbound order) into its stages: propagation, NIC/kernel, decode, strategy logic, risk, gateway. You build it by listing each stage's contribution in nanoseconds, then attacking the largest segment first. It tells you where the time goes and what each speed purchase actually buys.

The intuition is that latency is a sum, not a single thing. If propagation is 500 µs and your software is 5 µs, optimising the software is pointless until you colocate; the budget makes that obvious by laying the segments side by side, so you optimise the biggest one. The canonical stages are each independently attackable: propagation (distance → delete with colocation); NIC + kernel (→ kernel bypass); decode (→ binary protocol); strategy logic (→ software hot path or FPGA); pre-trade risk (in-line); and the order gateway (binary order entry).

Total tick-to-trade is the sum of independently-attackable stages. The marginal speed purchase is only worth making on the largest remaining segment, and only if the strategy's edge justifies the cost of shrinking it.

T = t_{\text{prop}} + t_{\text{NIC}} + t_{\text{decode}} + t_{\text{strategy}} + t_{\text{risk}} + t_{\text{gateway}}, \qquad \text{attack } \arg\max_i t_i \text{ first}
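The sum-and-argmax rule is easy to put into code. The following Python sketch uses the 18 µs example stack from the budget builder, plus a variant that swaps in the 500 µs propagation figure used earlier in this section; the `Stage` type and helper names are illustrative, and the numbers are schematic, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ns: int


def total_ns(stages: list[Stage]) -> int:
    """Tick-to-trade is the plain sum of the stage latencies."""
    return sum(s.latency_ns for s in stages)


def largest(stages: list[Stage]) -> Stage:
    """The stage to attack first: arg max over the segments."""
    return max(stages, key=lambda s: s.latency_ns)


# Example stack from the budget builder (schematic figures).
colocated = [
    Stage("wire / network", 3_000),
    Stage("NIC + kernel", 2_000),
    Stage("feed handler", 4_000),
    Stage("strategy logic", 6_000),
    Stage("order out", 3_000),
]

# Same stack, but with 500 us of propagation on the wire segment.
remote = [Stage("wire / network", 500_000)] + colocated[1:]

for label, budget in (("colocated", colocated), ("remote", remote)):
    worst = largest(budget)
    share = worst.latency_ns / total_ns(budget)
    print(f"{label}: total {total_ns(budget) / 1000:.1f} us, "
          f"attack '{worst.name}' first ({share:.0%} of budget)")
# colocated: total 18.0 us, attack 'strategy logic' first (33% of budget)
# remote: total 515.0 us, attack 'wire / network' first (97% of budget)
```

In the remote case the wire is almost the entire budget, which is why optimising the software first would be wasted effort.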

The method is a loop: measure (see measuring latency), find the biggest segment, decide whether the strategy's edge justifies the cost of shrinking it, repeat. The budget is also a cost-justification tool: it turns "should we buy an FPGA?" into "what is the nanosecond saving worth to this strategy?", and it is the discipline that stops a newcomer burning capital chasing speed a strategy does not need. That is exactly what the budget builder above makes tangible: the stacked bar, the target line, the cost meter, and the three presets ("Retail/cloud", "Colo + kernel-bypass", "FPGA tick-to-trade").

So can a newcomer actually compete on speed in 2026?

On the absolute frontier (nanosecond wire-to-wire, microwave routes, in-house FPGA), no. That tier needs capital, hardware specialists, scarce geography and venue relationships, and is defended by a few firms grinding against each other. What a newcomer can do is reach parity-grade microsecond infrastructure by purchase and compete on something other than raw speed.

The honest hierarchy, as of 2026, is three tiers. The nanosecond frontier (pure latency arbitrage against the fastest players) is effectively closed; do not plan a business on out-speeding the incumbents there. Microsecond colocated software/FPGA (faster market making, queue-sensitive strategies on lit venues) is contestable if you can afford colocation and have the engineering, but you are buying parity, not an edge: the edge must come from the strategy. Millisecond and slower (stat-arb, slower market making, cross-venue, and especially crypto and prediction markets) is genuinely open to a capable solo builder, because there speed is not the binding constraint; research quality, data and execution are.

The strategic conclusion this guide keeps returning to: choose a game where speed is not the edge. The 2026 opening is the millisecond-and-slower band, where the same microstructure maths holds but the infrastructure arms race does not gate you. Speed is a cost to control, not a moat to win. See is HFT still profitable in 2026.

Worked example

Take the classic speed example, the Chicago futures ↔ New Jersey equities cross-venue case, as a back-of-envelope latency budget, dated to 2026 and reproducible in the builder above (illustrative; real route latencies and colo fees are commercial and change).

The distance is roughly 1,200 km great-circle. Over fibre (light in glass about 5 µs/km, following a non-straight route nearer 1,300 km) that is about 6.5 ms one way; over microwave (light in air about 3.33 µs/km, a near-straight 1,200 km) about 4 ms one way. The microwave advantage is therefore about 2.5 ms per leg: colossal in HFT terms, and the entire reason the microwave networks were built.

On the long inter-venue leg, the microwave route saves on the order of two milliseconds each way, a gap no amount of software cleverness can close, which is why the route itself is the contested, owned asset.

t_{\text{fibre}} \approx 1300\,\text{km} \times 5\,\tfrac{\mu\text{s}}{\text{km}} \approx 6.5\,\text{ms}, \qquad t_{\mu\text{wave}} \approx 1200\,\text{km} \times 3.33\,\tfrac{\mu\text{s}}{\text{km}} \approx 4\,\text{ms}
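The same arithmetic in Python, with the advantage split into its two sources. The constants are the schematic figures from this worked example; the variable names and the decomposition (change the medium first, then shorten the path) are illustrative choices, not a standard method.

```python
# Schematic figures from the worked example (not current provider data).
FIBRE_US_PER_KM = 5.0        # light in glass
AIR_US_PER_KM = 3.33         # light in air
FIBRE_ROUTE_KM = 1_300.0     # road-following fibre route
MICROWAVE_ROUTE_KM = 1_200.0  # near-straight great-circle path


def one_way_ms(route_km: float, us_per_km: float) -> float:
    """One-way propagation delay in milliseconds."""
    return route_km * us_per_km / 1_000.0


fibre_ms = one_way_ms(FIBRE_ROUTE_KM, FIBRE_US_PER_KM)
microwave_ms = one_way_ms(MICROWAVE_ROUTE_KM, AIR_US_PER_KM)

# Split the gain: swap the medium on the fibre route, then shorten the path.
air_on_fibre_route_ms = one_way_ms(FIBRE_ROUTE_KM, AIR_US_PER_KM)
medium_gain_ms = fibre_ms - air_on_fibre_route_ms
path_gain_ms = air_on_fibre_route_ms - microwave_ms

print(f"fibre:      {fibre_ms:.2f} ms")                  # fibre:      6.50 ms
print(f"microwave:  {microwave_ms:.2f} ms")              # microwave:  4.00 ms
print(f"advantage:  {fibre_ms - microwave_ms:.2f} ms")   # advantage:  2.50 ms
print(f"  from medium: {medium_gain_ms:.2f} ms")         #   from medium: 2.17 ms
print(f"  from path:   {path_gain_ms:.2f} ms")           #   from path:   0.33 ms
```

On these figures most of the gain comes from the medium; the straighter path adds the rest.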

Inside the datacentre, colocated, propagation falls to about zero and the budget is dominated by your own stack: a kernel-bypass software path is about 2–7 µs tick-to-trade (from the low-latency stack), an FPGA wire-to-wire path about tens-to-hundreds of nanoseconds. The decision the numbers force is stark. If your strategy depends on reacting to the Chicago move before others act on it in New Jersey, you need the microwave route (scarce, owned) and an FPGA to act on arrival, both frontier-tier, both largely closed. If your strategy works at millisecond reaction inside one venue, you need none of it: colocation optional, FPGA pointless. The budget tells you which world you are in before you spend a penny. Numbers are schematic and dated to 2026; verify against current provider figures and the venue's colocation spec. Educational only, not investment advice.

Where this fits

↑ Up · building block of Systems & building
↔ Across · composes with The low-latency stack
↔ Across · composes with Latency arbitrage
→ Apply · makes money in Latency arbitrage
→ Apply · makes money in Equities & futures
⤓ Build / Buy · tool needed Datasets & tools

Common questions

Do I need colocation, or can I trade from a laptop?

It depends on the strategy. Pure latency races need colocation and often FPGAs; a laptop cannot compete. But slower microstructure strategies (inventory-aware market making on a single venue, statistical arbitrage, mid-frequency order-flow signals) are runnable from a laptop, especially in 24/7 crypto and prediction markets where the latency arms race is younger.