High-frequency data

Fat tails

structural
Reviewed 4 June 2026. As of 2026: a permanent feature of the market, not an edge that decays.

High-frequency returns are not normal. The tails are far heavier than a Gaussian predicts, so the six-sigma event happens monthly, not once a millennium. Pretend otherwise and your risk model lies to you.

See it move

[Interactive explorer IX-FATTAILS: empirical vs normal returns; drag the tails. Dashed = normal, red = the tails. Slider: tail heaviness (lower = fatter), shown at 3. Readouts at that setting: tail weight heavy; tail-event count 365; vs normal, far more.]

What to notice. The empirical histogram has far more mass in the red tails than the dashed normal curve allows. Under a Gaussian a six-sigma day is a once-in-millennia event; in real high-frequency returns it happens routinely. Risk models built on normality understate the danger.

Are financial returns normally distributed?

No. This is one of the most robust facts in all of finance, first shown by Mandelbrot (1963) and Fama (1965) and unbroken by any regime since. Return distributions are leptokurtic: a higher, narrower peak and far fatter tails than a Gaussian with the same variance. Extreme moves (crashes, gaps, flash events) arrive orders of magnitude more often than normality predicts, and the discrepancy gets worse the faster you sample.

Start with the intuition, because the maths only confirms it. A bell curve says enormous moves essentially never happen; markets disagree and deliver "impossible" days on a regular schedule. The Gaussian is not slightly wrong in the tails. It is wrong by orders of magnitude, and the tails are exactly where the money and the ruin live. A model that is accurate in the body and catastrophically optimistic in the tail is worse than no model, because it hands you false confidence precisely where you cannot afford it.

Empirical returns share a set of universal stylised facts (Cont 2001) across assets, venues and decades: heavy tails / excess kurtosis; volatility clustering (big moves follow big moves); aggregational Gaussianity (returns look more normal at longer horizons); the near-absence of linear autocorrelation in returns alongside strong autocorrelation in their absolute and squared values; the leverage effect (volatility rises more after down moves); and gain/loss asymmetry. This page treats the first three head-on.

What "fat" means precisely: the tail decays as a power law rather than the Gaussian's rapidly-vanishing exponential, so it holds dramatically more mass far out. Empirically the tail index $\alpha$ sits around 3–5 for many liquid assets; the value near 3 is the "inverse cubic law" (Gopikrishnan et al. 1999).
$$P(|r| > x) \;\sim\; x^{-\alpha} \quad\text{(power law)} \qquad \text{vs.} \qquad P(|r| > x) \;\sim\; e^{-x^2/2} \quad\text{(Gaussian)}$$
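To make the gap concrete, the sketch below compares how fast each tail form shrinks as the threshold grows. It is illustrative only: `power_law_tail` uses the bare asymptotic form $x^{-\alpha}$ with an assumed $\alpha = 3$ and no normalising constant, and both function names are invented for this example.

```python
import math

def gaussian_tail(x):
    """Two-sided tail P(|Z| > x) of a standard normal."""
    return math.erfc(x / math.sqrt(2.0))

def power_law_tail(x, alpha=3.0):
    """Bare power-law form x**(-alpha); alpha = 3 is an assumed tail index."""
    return x ** (-alpha)

# How much does each tail shrink when the threshold doubles from 3 to 6?
gauss_shrink = gaussian_tail(3.0) / gaussian_tail(6.0)
power_shrink = power_law_tail(3.0) / power_law_tail(6.0)

print(f"Gaussian tail shrinks by a factor of {gauss_shrink:,.0f}")
print(f"Power-law tail shrinks by a factor of {power_shrink:.0f}")
```

Doubling the threshold costs the power law a factor of $2^{\alpha} = 8$, while the Gaussian tail collapses by a factor of more than a million. That difference in decay rate is what "fat" refers to.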

What is kurtosis, and why does it matter?

Kurtosis measures tail heaviness: the fourth standardised moment, $E[(r-\mu)^4]/\sigma^4$. A normal distribution has kurtosis 3, so excess kurtosis (kurtosis minus 3) is zero for a Gaussian and positive for fat-tailed returns, often in the high single or double digits at high frequency. Positive excess kurtosis is the single-number fingerprint of fat tails.

The intuition is in the exponent. Kurtosis is driven by the fourth power of the deviation, so it is dominated by the rare, large moves and barely notices the ordinary ones. A high kurtosis is the distribution saying "most of my variance comes from a handful of enormous observations", which is exactly the danger profile of financial returns, where a few crisis days carry the risk the calm months hide.

Excess kurtosis: how much of the variance lives in the extreme moves. Gaussian is 0; daily equity returns sit at a few; tick or minute returns are often in the tens, and a handful of crisis days can lift the whole-sample figure enormously.
$$\kappa \;=\; \frac{E\!\left[(r-\mu)^4\right]}{\sigma^4} \;-\; 3$$
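The formula translates directly into a few lines of code. The sketch below is a minimal illustration using only the standard library: it computes the (population-moment) sample excess kurtosis for a Gaussian sample and for a Student-t sample with 5 degrees of freedom. The helper names, the seed, the sample size and the choice of 5 degrees of freedom are all assumptions made for the example.

```python
import math
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth standardised moment minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

def student_t(rng, nu):
    """One Student-t draw: a standard normal over sqrt(chi-square / nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.o.f.
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

rng = random.Random(42)
n = 100_000
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
fat_sample = [student_t(rng, 5.0) for _ in range(n)]

k_normal = excess_kurtosis(normal_sample)
k_fat = excess_kurtosis(fat_sample)
print(f"Gaussian sample:     {k_normal:.2f}")
print(f"Student-t(5) sample: {k_fat:.2f}")
```

The Gaussian figure should land close to zero. The Student-t figure should be clearly positive, though its exact value moves noticeably with the seed because a few large draws dominate the fourth moment.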

Report the number, but do not trust its precision. Because kurtosis weights the largest observations so heavily, the sample estimate is dominated by (and wildly sensitive to) the few biggest moves; it is noisy and converges slowly. That fragility is itself a tell that the tails are heavy: under a true power law with $\alpha \le 4$ the theoretical kurtosis is infinite, so the sample value simply grows as you collect more data rather than settling down. When a moment refuses to converge, the right response is not a bigger sample but a better tool: prefer the tail index $\alpha$ to a single kurtosis estimate.
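One common way to estimate a tail index is the Hill estimator. The text above does not name a method, so treat this choice as an assumption. The sketch applies it to a synthetic Pareto sample whose true $\alpha$ is known; the function name, the seed and the cutoff `k` (how many of the largest observations count as "the tail") are illustrative.

```python
import math
import random

def hill_tail_index(xs, k):
    """Hill estimate of alpha from the k largest absolute values in xs."""
    mags = sorted((abs(x) for x in xs), reverse=True)
    threshold = mags[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(m / threshold) for m in mags[:k]) / k
    return 1.0 / mean_log_excess

rng = random.Random(7)
true_alpha = 3.0
# Synthetic data with an exact power-law tail: P(X > x) = x**(-alpha), x >= 1.
sample = [rng.paretovariate(true_alpha) for _ in range(50_000)]

alpha_hat = hill_tail_index(sample, k=1_000)
print(f"Hill estimate: {alpha_hat:.2f} (true alpha = {true_alpha})")
```

On real returns the estimate depends on `k`, so it is usual to compute it across a range of cutoffs and look for a stable region rather than trust one value.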

Why does the QQ plot reveal fat tails?

A QQ (quantile–quantile) plot maps the empirical quantiles of your data against the theoretical quantiles of a Normal. If the data were Gaussian, the points would lie on a straight line. Fat tails make the points peel away from the line at both ends: the extreme observations are far larger than the Normal predicts. It is the clearest visual diagnostic of non-normality, and the signature image of this page.

Why does the middle behave and the ends betray? The centre of a return distribution is roughly bell-shaped, so the central points sit obediently on the line. It is the tails that give the Gaussian away: the most extreme empirical points are much more extreme than the matching Normal quantiles, so the curve bends sharply upward on the right and downward on the left. A histogram cannot show this nearly as well (its tail bins are almost empty and hard to read), whereas the QQ plot puts the tail observations directly on the axes, where the deviation is impossible to miss.

The QQ plot also tells you what to reach for. Overlay a Student-t reference with low degrees of freedom ($\nu \approx 3\text{–}5$) and it hugs the empirical tails exactly where the Normal fails: a tractable, practical fat-tailed model that the diagnostic makes you want. The picture does double duty: it convicts the Gaussian and nominates its replacement.
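The numbers behind a QQ plot can be produced without any plotting library. The sketch below is a minimal illustration that uses Student-t draws with 4 degrees of freedom as a stand-in for fat-tailed returns (an assumption, not real data), standardises them, and pairs selected empirical quantiles with the matching Normal quantiles. The function names, the seed and the probability grid are invented for the example.

```python
import math
import random
from statistics import NormalDist

def student_t(rng, nu):
    """One Student-t draw: a standard normal over sqrt(chi-square / nu)."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / nu)

def qq_pairs(xs, probs):
    """(theoretical Normal quantile, empirical standardised quantile) pairs."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    z = sorted((x - mean) / std for x in xs)
    normal = NormalDist()
    return [(normal.inv_cdf(p), z[min(n - 1, int(p * n))]) for p in probs]

rng = random.Random(11)
returns = [student_t(rng, 4.0) for _ in range(200_000)]
probs = [0.0001, 0.01, 0.5, 0.99, 0.9999]
pairs = qq_pairs(returns, probs)

for p, (theory, empirical) in zip(probs, pairs):
    print(f"p={p:<7} normal={theory:+.2f} empirical={empirical:+.2f}")
```

Plotting the second column against the first gives the QQ plot. The outermost pairs are where the empirical quantile runs well beyond the Normal one, which is the peeling away at both ends.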

What is aggregational Gaussianity?

Aggregational Gaussianity is the stylised fact that returns look more normal at longer horizons. Tick and minute returns are violently fat-tailed; daily returns less so; monthly returns are close to Gaussian. Summing many short-horizon returns pulls the distribution toward normality (a central-limit effect), but it does so slowly, and never completely while volatility clusters.

The central limit theorem says sums of many independent shocks tend to Gaussian. A monthly return is a sum of many short-horizon returns, so it looks more normal. But the convergence is slow and incomplete, because the short-horizon returns are emphatically not independent: volatility clustering and fat individual shocks both slow the march to normality. The CLT is a true statement about the limit; markets just live a long way from it.
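A short worked equation shows what "slowly" is measured against. Assume, as an idealisation that markets violate, that the $m$ short-horizon returns in a sum are independent and identically distributed with a finite fourth moment. Variances and fourth cumulants then both add across the sum, so excess kurtosis would fall like $1/m$:

```latex
\kappa_m \;=\; \frac{m\,c_4}{\left(m\,\sigma^2\right)^2}
        \;=\; \frac{1}{m}\,\frac{c_4}{\sigma^4}
        \;=\; \frac{\kappa_1}{m},
\qquad
c_4 \;=\; E\!\left[(r-\mu)^4\right] - 3\sigma^4
```

Under that benchmark a tick-level excess kurtosis in the tens would be negligible after a few thousand ticks. Kurtosis that decays more slowly than $1/m$ is the footprint of dependence such as volatility clustering, and where the tail index is at or below 4 the fourth moment does not exist, so the benchmark does not apply at all.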

For HFT this carries a sharp and unwelcome consequence. It is precisely the high-frequency regime, where HFT operates, that is most non-normal. The faster you trade, the fatter your tails and the more dangerous a Gaussian assumption becomes. You cannot wave the problem away with "it's normal at the horizons that matter to me", because your horizon is where it is worst. Watch this run in reverse on the explorer above: drag from daily to tick and the QQ tails bend harder while the kurtosis climbs.

Why do volatility clustering and fat tails go together?

Returns themselves are nearly uncorrelated, but their magnitudes are strongly autocorrelated: calm follows calm, turbulence follows turbulence. This volatility clustering is half the reason for fat tails: a fixed-$\sigma$ Gaussian cannot reproduce the alternation of quiet and violent regimes, but a mixture of normals with time-varying variance can, and that mixture is fat-tailed.

Here is the mechanism. Draw returns from a Normal whose variance itself changes over time (low on some days, high on others) and the unconditional distribution, pooling all days together, is a mixture of bell curves of different widths. That mixture has a sharper peak and fatter tails than any single bell curve. So time-varying volatility manufactures unconditional fat tails even if each individual day were conditionally Gaussian. Fat tails are not only about the size of shocks; they are partly about the changing scale that produces them.
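The mechanism can be checked with a two-regime toy model. In the sketch below every draw is conditionally Gaussian, but its standard deviation is 1 on "calm" draws and 3 on "wild" ones. The regime probability, the two scales, the seed and the function names are all illustrative assumptions. For a zero-mean scale mixture the exact kurtosis is $3\,E[\sigma^4]/(E[\sigma^2])^2$, which the code compares with a simulated sample.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth standardised moment minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

def mixture_excess_kurtosis(p_calm, sd_calm, sd_wild):
    """Exact excess kurtosis of a zero-mean two-regime normal mixture."""
    e_var = p_calm * sd_calm**2 + (1 - p_calm) * sd_wild**2
    e_var_sq = p_calm * sd_calm**4 + (1 - p_calm) * sd_wild**4
    return 3.0 * e_var_sq / e_var**2 - 3.0

rng = random.Random(3)
p_calm, sd_calm, sd_wild = 0.9, 1.0, 3.0  # illustrative regime parameters
draws = [
    rng.gauss(0.0, sd_calm if rng.random() < p_calm else sd_wild)
    for _ in range(200_000)
]

theory = mixture_excess_kurtosis(p_calm, sd_calm, sd_wild)
sample = excess_kurtosis(draws)
print(f"theory: {theory:.2f}, simulated: {sample:.2f}")
```

With these parameters the exact excess kurtosis is $16/3 \approx 5.33$, even though no single draw comes from anything but a Normal.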

This is exactly what GARCH (Bollerslev 1986) models: today's conditional variance depends on recent squared returns, so volatility clusters and the unconditional distribution comes out leptokurtic. The leverage effect (volatility rising more after down moves) is captured by asymmetric variants (EGARCH, GJR-GARCH).
$$\sigma_t^2 \;=\; \omega \;+\; \alpha\,r_{t-1}^2 \;+\; \beta\,\sigma_{t-1}^2$$
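The recursion is short enough to simulate directly. The following is a minimal sketch of a GARCH(1,1) path with Gaussian innovations; the parameter values ($\omega = 10^{-6}$, $\alpha = 0.10$, $\beta = 0.85$), the seed and the helper names are assumptions chosen for illustration, not estimates for any market.

```python
import math
import random

def simulate_garch(n, omega, alpha, beta, seed=0):
    """Simulate n returns from GARCH(1,1) with Gaussian innovations."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(r)
        var = omega + alpha * r * r + beta * var  # next period's variance
    return returns

def autocorr(xs, lag=1):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth standardised moment minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

returns = simulate_garch(100_000, omega=1e-6, alpha=0.10, beta=0.85, seed=1)
acf_returns = autocorr(returns)
acf_squared = autocorr([r * r for r in returns])
kurt = excess_kurtosis(returns)
print(f"lag-1 autocorrelation of returns:         {acf_returns:+.3f}")
print(f"lag-1 autocorrelation of squared returns: {acf_squared:+.3f}")
print(f"excess kurtosis of returns:               {kurt:.2f}")
```

The three printed figures correspond to the stylised facts in the text: returns that are close to uncorrelated, squared returns that are clearly autocorrelated, and positive excess kurtosis despite Gaussian innovations.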

There are two sources of fat tails, not one, and conflating them mis-specifies the risk model. Some fat-tailedness is conditional (volatility clustering, modelled by GARCH); some is residual, fat tails that remain in the innovations even after de-volatilising, which is why practitioners fit GARCH with Student-t innovations rather than Gaussian ones. Both matter. This stylised fact also links straight to irregular time: activity clusters in time just as volatility clusters in size, the same "bursts beget bursts" structure seen two ways.

Why do Gaussian risk models blow up?

A Gaussian risk model (value-at-risk under normality, or any "N-sigma" rule) assigns negligible probability to large moves. Because real tails are power-law-heavy, those "impossible" moves arrive far more often, so the model under-states both the probability and the size of losses by orders of magnitude. The events that bankrupt desks are exactly the ones a Gaussian dismisses.

Picture a Gaussian VaR that says "you'll lose more than $X$ only one day in a hundred." If the true tail is power-law, you breach $X$ several times as often, and when you breach you breach by much more: the power-law tail has no thin cutoff, so the loss beyond the threshold is itself heavy. The model is wrong in two directions at once, and both directions are fatal: it under-counts the breaches and under-sizes them.

The Gaussian under-counts extremes by factors of thousands. A $-5\sigma$ daily move has Gaussian probability about one day in 3.5 million, once in roughly 14,000 years of trading. Markets deliver multiple $\ge 5\sigma$ (Gaussian-scaled) days per decade.
$$P(r < -5\sigma) \;\approx\; 2.9 \times 10^{-7} \;\approx\; \tfrac{1}{3{,}500{,}000}$$
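The arithmetic behind these figures can be reproduced with the standard library. The sketch assumes 252 trading days per year, a common convention that the text does not state; the function name is invented for the example.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumed convention, not stated in the text

def normal_lower_tail(k):
    """P(Z < -k) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

p = normal_lower_tail(5.0)
days_between = 1.0 / p
years_between = days_between / TRADING_DAYS_PER_YEAR

print(f"P(r < -5 sigma) = {p:.2e}")
print(f"one day in {days_between:,.0f}, about {years_between:,.0f} years")
```

This gives a probability near $2.9 \times 10^{-7}$, one day in roughly 3.5 million, or on the order of 14,000 years of trading under the 252-day assumption.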

October 1987 was a roughly $-20\sigma$ event under the then-prevailing Gaussian volatility, a number so absurd it proves the model was wrong, not the market.

What to use instead: model the tails explicitly. Student-t or generalised-error innovations; Extreme Value Theory (the generalised Pareto distribution for peaks-over-threshold) for the tail itself; expected shortfall (CVaR), which integrates the tail, rather than VaR, which only points at its edge; and historical or filtered-historical simulation rather than a parametric Gaussian. The honest posture is to size positions for the tail you cannot see, not the body you can.
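The difference between VaR and expected shortfall is easiest to see on a historical sample. The sketch below is one simple convention among several (the function name, the `tail` parameter and the rounding rule for the number of tail observations are assumptions): VaR is the smallest loss among the worst `tail` fraction of observations, and expected shortfall is the average of those losses.

```python
def historical_var_es(returns, tail=0.05):
    """Historical VaR and expected shortfall for the worst `tail` fraction.

    Both are reported as positive loss numbers.
    """
    losses = sorted((-r for r in returns), reverse=True)  # worst loss first
    k = max(1, round(len(losses) * tail))  # number of tail observations
    worst = losses[:k]
    var = worst[-1]      # the threshold loss: the edge of the tail
    es = sum(worst) / k  # the average loss inside the tail
    return var, es

# Toy data: losses of 1, 2, ..., 100 (returns are their negatives).
toy_returns = [-float(i) for i in range(1, 101)]
var_95, es_95 = historical_var_es(toy_returns, tail=0.05)
print(f"VaR: {var_95}, ES: {es_95}")  # VaR: 96.0, ES: 98.0
```

On the toy data the five worst losses are 96 to 100, so VaR reports 96 and expected shortfall reports their mean, 98. Make the worst loss 1,000 instead of 100 and VaR does not move while expected shortfall does, which is the sense in which VaR only points at the edge of the tail.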

This is the engine beneath the risk and performance topics. A Sharpe ratio or a VaR computed under normality flatters a fat-tailed strategy right up until the tail arrives, which is why the same non-normality reappears in risk-adjusted ratios (the Sharpe uses a standard deviation that under-weights extremes) and in measuring HFT risk (sizing for the tail).

Worked example

A return-distribution comparison you can reproduce in the explorer above, as of 2026. Generate a seeded tick path (a Brownian mid plus a low-intensity jump component) then aggregate the same path to tick, 1-second, 1-minute and daily returns. Because the jumps are baked in by construction, the true distribution is known and savagely fat at high frequency; the figures below are illustrative and synthetic, meant to make the mechanism legible rather than to quote as facts about any market.

Excess kurtosis falls as you aggregate, aggregational Gaussianity in action:

Kurtosis by horizon: violently fat at tick, thinning steadily as the path is aggregated to longer returns, but still well above the Gaussian zero even at daily.
$$\kappa_{\text{tick}} \approx 18, \quad \kappa_{\text{1s}} \approx 9, \quad \kappa_{\text{1min}} \approx 4, \quad \kappa_{\text{daily}} \approx 1$$

Now the headline. Fit a Normal to the tick returns. It assigns a tick return below $-5\sigma$ a probability of about $2.9 \times 10^{-7}$, one in roughly 3.5 million ticks. The empirical frequency of tick returns below $-5\sigma$ in the sample is, say, one in about 12,000:

The empirical lower tail is roughly 300 times heavier than the fitted Gaussian allows, and a Student-t with about 3.5 degrees of freedom matches the tick tails on the QQ plot where the Normal understates them by orders of magnitude.
$$\frac{P_{\text{empirical}}(r \le -5\sigma)}{P_{\text{Gaussian}}(r \le -5\sigma)} \;\approx\; \frac{1/12{,}000}{1/3{,}500{,}000} \;\approx\; 300$$

The lesson is direct: a VaR or position-sizing rule that trusts the fitted Normal out at $5\sigma$ would be breached hundreds of times more often than the model says, and the breaches would be larger than it can even represent. Reverify tail indices and kurtosis against real return series before relying on them; empirical tail indices and kurtosis are asset-, venue- and period-specific, and these synthetic figures exist only to make the mechanism legible. The explorer lets you change sampling frequency, the degrees of freedom of the t-overlay and the jump intensity, and watch kurtosis and the tail probabilities respond.
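For readers without the explorer to hand, the recipe can be approximated offline. The sketch below is a rough stand-in, not the explorer's code: tick returns are unit-variance Gaussian noise plus a rare Gaussian jump, and blocks of 1, 10 and 100 ticks play the role of successively longer horizons. The jump probability, jump scale, block sizes, seed and function names are all assumptions, so the output will not match the illustrative figures quoted above.

```python
import math
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis: fourth standardised moment minus 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

def simulate_ticks(n, jump_prob, jump_scale, seed=0):
    """Tick returns: unit-variance Gaussian diffusion plus rare Gaussian jumps."""
    rng = random.Random(seed)
    ticks = []
    for _ in range(n):
        r = rng.gauss(0.0, 1.0)
        if rng.random() < jump_prob:
            r += rng.gauss(0.0, jump_scale)
        ticks.append(r)
    return ticks

def aggregate(returns, block):
    """Sum consecutive non-overlapping blocks of returns."""
    usable = len(returns) - len(returns) % block
    return [sum(returns[i:i + block]) for i in range(0, usable, block)]

ticks = simulate_ticks(200_000, jump_prob=0.002, jump_scale=10.0, seed=5)
kurt_by_block = {b: excess_kurtosis(aggregate(ticks, b)) for b in (1, 10, 100)}

# One-sided 5-sigma frequency against the fitted Normal's prediction.
mean = sum(ticks) / len(ticks)
sigma = math.sqrt(sum((r - mean) ** 2 for r in ticks) / len(ticks))
empirical = sum(r < mean - 5.0 * sigma for r in ticks) / len(ticks)
gaussian = 0.5 * math.erfc(5.0 / math.sqrt(2.0))
tail_ratio = empirical / gaussian

for block, k in kurt_by_block.items():
    print(f"block of {block:>3} ticks: excess kurtosis {k:.2f}")
print(f"empirical / Gaussian frequency below -5 sigma: {tail_ratio:,.0f}")
```

Because the simulated ticks are independent, excess kurtosis here falls roughly in proportion to the block size. Real returns cluster in volatility, so expect a slower decline when the same code is pointed at market data.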

Where this fits