High-frequency data

Irregular time & point processes

structural
Reviewed 4 June 2026. As of 2026: a permanent feature of the market, not an edge that decays.

Trades arrive in clustered bursts, not on the clock. Engle–Russell's ACD model (1998) treats the durations between events as the object of study, using maths a mathematician will recognise immediately.

See it move

[Interactive explorer: irregular arrival times plotted on an event-time axis (each tick = a trade), with a draggable "Clustering (ACD)" control, shown at 70%.]

What to notice. Trades don't arrive on the clock. They bunch into bursts and then go quiet. Engle–Russell's ACD models the durations between events directly, which is why fixed-interval sampling throws away exactly the structure you care about.

Why is high-frequency data irregularly spaced?

A tape has one record per event, and events (trades, quote updates, book changes) arrive whenever participants act, not on a metronome. A daily bar series has a row every day whether or not anything happened; a tape has a row every time something happens, and things happen in bursts: at the open, on news, when a large order works through the book. The clock that governs a tape is the event clock, not the wall clock. Activity clusters because order flow is self-exciting: a market order consumes depth, market makers re-quote, other participants react to the new book, and a single arrival cascades into a flurry. This is the same "bursts beget bursts" structure as volatility clustering, seen in the timing rather than the magnitude of moves.

The data fact that follows: inter-event durations are over-dispersed (their standard deviation exceeds their mean), so a constant-rate Poisson process, which would give exponential, equi-dispersed durations, is the wrong model. The clustering is structural, not noise. Open the explorer above on the bursty preset and the duration histogram shows a fat right tail against the exponential reference.

Durations are over-dispersed: the standard deviation exceeds the mean, so the dispersion index exceeds one, and a constant-rate Poisson process (index exactly one) is rejected. Clustering is the structure, not noise.
$$\text{dispersion} = \frac{\mathrm{sd}(x)}{\mathbb{E}[x]} > 1 \qquad (\text{Poisson: } = 1)$$
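As a minimal sketch of the dispersion check, the snippet below compares exponential durations (constant-rate Poisson arrivals) with a hypothetical two-regime mixture of fast and slow gaps. The rates, the 80/20 split and the function name `dispersion_index` are illustrative assumptions, not values from a real tape, and an i.i.d. mixture only reproduces the over-dispersion, not the serial clustering.

```python
import random
import statistics

def dispersion_index(durations):
    """Sample standard deviation of the durations divided by their sample mean."""
    return statistics.stdev(durations) / statistics.fmean(durations)

rng = random.Random(0)

# Constant-rate Poisson arrivals: exponential durations, index close to 1.
poisson = [rng.expovariate(4.0) for _ in range(20_000)]

# Hypothetical bursty tape (assumption): 80% of gaps come from a fast regime
# (mean 0.05 s) and 20% from a slow one (mean 1.25 s), giving a fat right tail.
bursty = [
    rng.expovariate(20.0) if rng.random() < 0.8 else rng.expovariate(0.8)
    for _ in range(20_000)
]

print(round(dispersion_index(poisson), 2))  # close to 1
print(round(dispersion_index(bursty), 2))   # well above 1
```

On real data the same function would be applied to the differences of consecutive event timestamps.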

What is a point process, and why model the durations?

A point process is the natural language for events scattered irregularly on a timeline: it models when things happen rather than what value a series takes at fixed times. Instead of asking "what is the price at 14:00:00, 14:00:01, …", which forces a clock the data does not have, you ask "how long until the next event, given the recent history?" That reframing keeps the timing, which is exactly where the microstructure information lives. There are two equivalent views. The intensity $\lambda(t)$ is the expected number of events per unit time given the past: high after a burst, low in a lull. The duration $x_i = t_i - t_{i-1}$ is the gap between events $i-1$ and $i$; short durations cluster around information and activity, long durations mark quiet.

The spacing is itself information. Short durations predict short durations (clustering); durations correlate with volatility, spread and informed trading. Easley–O'Hara's microstructure tells us that the rate of trading itself signals information, so the duration is a feature, not a nuisance to be averaged away. The intensity curve overlaid on the raster above annotates a short-duration cluster as "information arriving".

Two equivalent descriptions of the same point process: the instantaneous intensity (expected events per unit time given the past) and the durations between consecutive events. Either one carries the timing the wall clock discards.
$$\lambda(t) = \lim_{\Delta \to 0}\frac{\mathbb{E}[\,N(t+\Delta) - N(t)\mid \mathcal{F}_t\,]}{\Delta}, \qquad x_i = t_i - t_{i-1}$$
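Both views can be read off a list of timestamps. The sketch below computes the durations and a crude trailing-window estimate of the intensity; the event times, the 0.5 s window and the helper names are hypothetical, and a count over a fixed window is only a rough stand-in for the conditional intensity defined above.

```python
from bisect import bisect_right

def durations(times):
    """x_i = t_i - t_{i-1} for sorted event times."""
    return [b - a for a, b in zip(times, times[1:])]

def trailing_rate(times, t, window):
    """Crude intensity estimate: events in (t - window, t] divided by window."""
    lo = bisect_right(times, t - window)
    hi = bisect_right(times, t)
    return (hi - lo) / window

# Hypothetical event times in seconds: a burst near t = 1, then a lull.
times = [0.10, 0.90, 1.00, 1.02, 1.05, 1.07, 1.10, 2.40, 3.60]

print([round(x, 2) for x in durations(times)])
print(trailing_rate(times, 1.10, 0.5))  # inside the burst: high rate
print(trailing_rate(times, 3.00, 0.5))  # in the lull: no events in the window
```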

What is the ACD model (Engle–Russell 1998)?

The Autoregressive Conditional Duration model (Engle–Russell 1998) is GARCH for the time between events. Where GARCH says "today's expected variance depends on recent variances and shocks", ACD says "the expected wait until the next trade depends on recent waits", so busy periods stay busy and quiet periods stay quiet. It models the expected duration $\psi_i = \mathbb{E}[x_i \mid \text{past}]$ as a weighted memory of recent durations and recent expectations, so durations cluster just as GARCH makes volatility cluster, and it is the canonical model for irregularly-spaced financial data. In the standard ACD(1,1), $\alpha$ weights the last realised duration and $\beta$ the persistence of the expectation; $\alpha + \beta$ near 1 means highly persistent clustering. The standardised durations $\varepsilon_i = x_i/\psi_i$ are i.i.d. positive with mean 1, typically modelled with an exponential or Weibull baseline.

ACD lets you forecast trade intensity, condition volatility and spread models on the activity rate, and build event-time clocks. It opened the whole field of econometrics on irregularly-spaced data. Its close cousin, the Hawkes process, models the intensity directly with self-excitation: each past event bumps the intensity, which then decays. Multivariate Hawkes capture cross-asset and cross-venue contagion (a trade here raises the arrival rate there) and are heavily used in modern microstructure. Raise the "ACD persistence" slider above toward 1 and watch the clustering strengthen.

ACD(1,1) is GARCH with durations in place of squared returns: the expected gap is a weighted memory of recent gaps. A Hawkes process states the same self-excitation directly on the intensity, each event bumping $\lambda$ by a decaying kernel.
$$\psi_i = \omega + \alpha\,x_{i-1} + \beta\,\psi_{i-1}, \qquad \lambda(t) = \mu + \sum_{t_i < t} \alpha\,e^{-\beta(t - t_i)}$$
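The two formulas translate almost line for line into code. This is a minimal sketch, assuming unit-mean exponential innovations for the ACD and an exponential kernel for the Hawkes intensity; the ACD parameter values and the three event times are illustrative choices, not estimates from data. Note that $\alpha$ and $\beta$ mean different things in the two models.

```python
import math
import random

def simulate_acd(n, omega, alpha, beta, seed=0):
    """Simulate ACD(1,1) durations with unit-mean exponential innovations."""
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    x = psi
    out = []
    for _ in range(n):
        psi = omega + alpha * x + beta * psi  # expected next duration
        x = psi * rng.expovariate(1.0)        # realised duration
        out.append(x)
    return out

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

# Illustrative ACD parameters (assumptions): persistence alpha + beta = 0.95.
x = simulate_acd(50_000, omega=0.05, alpha=0.10, beta=0.85)
print(sum(x) / len(x))  # should sit near omega / (1 - alpha - beta) = 1.0

# Intensity just after a small cluster of three hypothetical events.
print(hawkes_intensity(1.0, [0.2, 0.9, 0.95], mu=2.0, alpha=1.4, beta=3.0))
```

Raising `alpha + beta` toward 1 in `simulate_acd` lengthens the runs of short and long durations, which is the persistence the text describes.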
Derivation (optional)

Write each duration as its conditional expectation times an i.i.d. innovation, exactly as GARCH writes a return as its conditional standard deviation times a unit-variance shock.

$$x_i = \psi_i\,\varepsilon_i,\;\; \mathbb{E}[\varepsilon_i]=1,\; \varepsilon_i \perp \text{past} \quad\Longleftrightarrow\quad r_i = \sigma_i z_i,\;\; \mathrm{Var}(z_i)=1$$

Then $\mathbb{E}[x_i \mid \text{past}] = \psi_i$, and the ACD(1,1) recursion is the GARCH(1,1) recursion with durations playing the role of squared returns.

$$\psi_i = \omega + \alpha\,x_{i-1} + \beta\,\psi_{i-1} \quad\Longleftrightarrow\quad \sigma_i^2 = \omega + \alpha\,r_{i-1}^2 + \beta\,\sigma_{i-1}^2$$

Stationarity requires $\alpha + \beta < 1$, and the unconditional mean duration is $\omega/(1-\alpha-\beta)$. Estimation is by quasi-maximum likelihood. Exponential ACD gives a particularly clean QML, the exponential being to ACD what the Gaussian is to GARCH.

$$\mathbb{E}[x] = \frac{\omega}{1 - \alpha - \beta}, \qquad \hat{\varepsilon}_i = \frac{x_i}{\hat{\psi}_i} \;\overset{?}{\sim}\; \text{i.i.d., mean } 1$$

The standardised durations $\hat{\varepsilon}_i$ should be i.i.d. with mean 1 if the model fits; the diagnostic mirrors GARCH residual checks. Richer baselines (Weibull, generalised gamma) and log-ACD variants relax the linearity and positivity constraints.
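The estimation and diagnostic steps can be sketched without an optimiser. The snippet below writes out the exponential quasi-log-likelihood, $-\sum_i (\log\psi_i + x_i/\psi_i)$, evaluates it at the true parameters of a synthetic ACD(1,1) and at a constant-mean alternative, and then checks the standardised durations. The parameter values, the sample-mean initialisation of $\psi$ and all function names are assumptions for illustration; a real fit would maximise `exp_qml_loglik` numerically over the parameters.

```python
import math
import random

def acd_filter(x, omega, alpha, beta):
    """Run the ACD(1,1) recursion over observed durations; returns psi_i."""
    psi = [sum(x) / len(x)]  # initialise at the sample mean (one common choice)
    for i in range(1, len(x)):
        psi.append(omega + alpha * x[i - 1] + beta * psi[-1])
    return psi

def exp_qml_loglik(x, omega, alpha, beta):
    """Exponential quasi-log-likelihood: -sum(log psi_i + x_i / psi_i)."""
    psi = acd_filter(x, omega, alpha, beta)
    return -sum(math.log(p) + xi / p for xi, p in zip(x, psi))

def lag1_autocorr(v):
    """Sample lag-1 autocorrelation."""
    m = sum(v) / len(v)
    num = sum((a - m) * (b - m) for a, b in zip(v, v[1:]))
    den = sum((a - m) ** 2 for a in v)
    return num / den

# Synthetic durations from a known ACD(1,1) (illustrative parameters).
rng = random.Random(1)
omega, alpha, beta = 0.05, 0.10, 0.85
psi_t, x_prev, x = 1.0, 1.0, []
for _ in range(20_000):
    psi_t = omega + alpha * x_prev + beta * psi_t
    x_prev = psi_t * rng.expovariate(1.0)
    x.append(x_prev)

# The true parameters should beat a constant-mean model (alpha = beta = 0).
mean_x = sum(x) / len(x)
ll_true = exp_qml_loglik(x, omega, alpha, beta)
ll_flat = exp_qml_loglik(x, mean_x, 0.0, 0.0)
print(ll_true > ll_flat)

# Residual check: standardised durations should have mean near 1 and little
# serial correlation, while the raw durations stay autocorrelated.
eps = [xi / p for xi, p in zip(x, acd_filter(x, omega, alpha, beta))]
print(sum(eps) / len(eps), lag1_autocorr(eps), lag1_autocorr(x))
```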

Calendar time vs event time vs business time

Calendar time samples on the wall clock (every minute). Event time advances one step per event; business (volume) time advances per unit of volume traded. If information arrives in bursts, then a fixed clock interval contains wildly different amounts of information (a quiet minute versus a news minute), whereas a fixed event or volume interval contains roughly constant information, so statistics computed on it behave far better. Event-time (tick) bars give one bar per N events; volume bars one bar per N shares or contracts; dollar/value bars one bar per N units of traded value (robust to price level). Each is an information clock rather than a wall clock.

The payoff is a direct link to the stylised facts: returns sampled in business/volume time are closer to i.i.d. and less fat-tailed than calendar-time returns. This is the Ané–Geman (2000) result: the fat tails of calendar-time returns are partly a time-deformation effect, and under the right stochastic clock returns look more Gaussian. So some of the non-normality in fat tails is the irregular clock, not the price process. The mainstream application: VPIN buckets the tape into equal-volume buckets precisely to sample in business time, and volume/dollar bars are standard practice for exactly this reason. Hit the "calendar vs event time" toggle above and the burstiness flattens.

Re-clocking from calendar time to a business clock $\tau$ (cumulative volume or activity) deforms time so each interval carries roughly constant information, and the deformed-time returns are closer to i.i.d. and far less fat-tailed (Ané–Geman 2000).
$$r^{\text{calendar}}\ \text{fat-tailed} \;\xrightarrow{\;t \,\mapsto\, \tau(t)\;}\; r^{\text{business}}\ \text{closer to Gaussian, i.i.d.}$$
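A minimal sketch of re-clocking, assuming a tape of hypothetical `(time, price, size)` tuples: tick bars close every N events and volume bars close whenever cumulative volume reaches a bucket size. The function names and the sample tape are illustrative, and this simple version does not split a trade that overshoots the bucket.

```python
def tick_bars(trades, n):
    """Group trades into bars of n events each (event-time clock)."""
    return [trades[i:i + n] for i in range(0, len(trades) - n + 1, n)]

def volume_bars(trades, bucket):
    """Close a bar each time cumulative volume reaches `bucket` (volume clock)."""
    bars, current, filled = [], [], 0
    for t, price, size in trades:
        current.append((t, price, size))
        filled += size
        if filled >= bucket:
            bars.append(current)
            current, filled = [], 0
    return bars  # a trailing partial bar is dropped

# Hypothetical tape: a burst in the first 0.2 s, then sparse trading.
trades = [
    (0.10, 100.00, 200), (0.12, 100.01, 300), (0.15, 100.03, 500),
    (0.16, 100.02, 400), (0.90, 100.02, 100), (2.50, 100.01, 100),
    (4.00, 100.02, 400), (4.01, 100.04, 600),
]

for bar in volume_bars(trades, 1000):
    span = bar[-1][0] - bar[0][0]
    volume = sum(size for _, _, size in bar)
    print(len(bar), "trades,", volume, "shares,", round(span, 2), "s of wall-clock time")
```

The two volume bars carry the same volume but cover very different stretches of wall-clock time, which is the deformation the formula above describes.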

What does fixed-bar resampling throw away?

Resampling a tape to fixed bars (OHLCV per minute) discards the timing and count of events within the bar, exactly the clustering that carries microstructure information. A one-minute bar reports open, high, low, close and volume but not whether those 1,000 trades arrived in one two-second burst or spread evenly across the minute. Those are completely different microstructure states with the same bar, and the difference (the burst) is the predictive part. Concretely you lose four things: the duration/intensity process (forecastable, tradable); the within-bar sequence needed for trade-sign rules and order-flow features; correct volatility at sub-bar scale (you have pre-averaged away the bounce and the clustering); and the ability to align events across instruments at native resolution (lead-lag).
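To make the loss concrete, the sketch below builds two hypothetical one-minute tapes with identical prices and sizes, one packed into a two-second burst and one spread evenly, and shows that they collapse to the same OHLCV bar. The price pattern, sizes and helper names are assumptions chosen for illustration.

```python
def ohlcv(trades):
    """Open, high, low, close and total volume of a list of (time, price, size)."""
    prices = [p for _, p, _ in trades]
    return (prices[0], max(prices), min(prices), prices[-1],
            sum(s for _, _, s in trades))

def busiest_window(times, width):
    """Largest number of events falling in any window of the given width."""
    best, lo = 0, 0
    for hi, t in enumerate(times):
        while t - times[lo] >= width:
            lo += 1
        best = max(best, hi - lo + 1)
    return best

n = 1000
prices = [100 + 0.01 * ((i * 7) % 5) for i in range(n)]  # same price path for both tapes
burst_times = [30.0 + 2.0 * i / n for i in range(n)]     # all trades inside two seconds
even_times = [60.0 * i / n for i in range(n)]            # spread across the minute

burst = [(t, p, 100) for t, p in zip(burst_times, prices)]
even = [(t, p, 100) for t, p in zip(even_times, prices)]

print(ohlcv(burst) == ohlcv(even))       # the bar cannot tell them apart
print(busiest_window(burst_times, 2.0))  # every trade sits in one 2 s window
print(busiest_window(even_times, 2.0))   # only a few dozen trades per 2 s window
```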

Fixed bars are fine for long-horizon, low-frequency studies where you genuinely do not need sub-bar structure; they are a reasonable, convenient approximation. The error is using them unthinkingly at high frequency, where the discarded structure is the whole point. The discipline: choose the clock deliberately. If the timing matters (and at HFT horizons it almost always does), model the point process or sample in event/volume time; do not flatten to calendar bars by reflex.

Worked example

A synthetic Hawkes/ACD tape, as of 2026. Reproduce it in the explorer above. Generate events with a Hawkes intensity $\lambda(t) = \mu + \sum_{t_i < t} \alpha\,e^{-\beta(t - t_i)}$, base $\mu = 2$ events/sec, self-excitation $\alpha = 1.4$, decay $\beta = 3$. The branching ratio is $n = \alpha/\beta \approx 0.47$: each event spawns about 0.47 offspring on average, stable but clustered. Events then arrive in visible cascades: a single arrival triggers a flurry, the intensity spikes from 2 toward 6-plus per second and decays back, and durations range from about 10 ms inside a burst to about 1 s in a lull.

The branching ratio $n = \alpha/\beta$ is the expected offspring per event: below 1 the process is stable but clustered, and at $n \approx 0.47$ nearly half of all events are self-excited descendants: the cascades you see on the raster.
$$n = \frac{\alpha}{\beta} = \frac{1.4}{3} \approx 0.47 < 1 \quad (\text{stationary, self-exciting})$$
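The tape can be generated with a few lines of Ogata-style thinning, sketched here for the exponential kernel and the parameters stated above. The horizon, the seed and the function name are assumptions. As a sanity check, the long-run event rate of a stationary Hawkes process is $\mu/(1-n)$, about 3.75 events/sec for these values, and the simulated durations should come out over-dispersed.

```python
import math
import random
import statistics

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning (exponential kernel)."""
    rng = random.Random(seed)
    t, excite, events = 0.0, 0.0, []
    while True:
        lam_bar = mu + excite            # upper bound: intensity only decays until the next event
        w = rng.expovariate(lam_bar)     # candidate waiting time
        t += w
        if t >= horizon:
            return events
        excite *= math.exp(-beta * w)    # decay the self-excitation to time t
        if rng.random() * lam_bar <= mu + excite:
            events.append(t)             # accept the candidate as an event
            excite += alpha              # each event bumps the intensity by alpha

mu, alpha, beta = 2.0, 1.4, 3.0
n = alpha / beta                         # branching ratio
events = simulate_hawkes(mu, alpha, beta, horizon=5000.0)

rate = len(events) / 5000.0
gaps = [b - a for a, b in zip(events, events[1:])]
dispersion = statistics.stdev(gaps) / statistics.fmean(gaps)

print(round(n, 2))                       # 0.47
print(rate, mu / (1.0 - n))              # simulated rate vs. theoretical 3.75
print(dispersion)                        # above 1: over-dispersed durations
```

Raising `alpha` toward `beta` pushes the branching ratio toward 1 and makes the cascades longer.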

Measure the durations and the standard deviation exceeds the mean: over-dispersion index above 1, so a constant-rate Poisson (index 1) is rejected by eye and by test. Fit an ACD(1,1) and it recovers high persistence ($\alpha + \beta$ near 0.9-plus): short durations forecast short durations, and the standardised durations $\hat{\varepsilon}_i = x_i/\hat{\psi}_i$ come out roughly i.i.d. mean 1, confirming the fit. Now re-clock to event time (one unit per event) and the calendar-time burstiness disappears: durations are constant by construction, and any return series built on this clock is far closer to i.i.d. than its calendar-time counterpart (the time-deformation effect again). Crank $\alpha$ and the cascades intensify; this is the trade-burst structure of real tapes, and exactly what a one-minute bar cannot represent. Reverify branching ratios and durations against real tapes before relying on them; these synthetic figures make the mechanism legible. A native-resolution, event-time tape (not pre-resampled bars) is what you need to model the point process honestly (datasets and tools).

Where this fits