Brownian Motion
The Simple Random Walk Again
In this section, we revisit the symmetric random walk discussed in previous lectures. This process models the outcome of repeatedly flipping a fair coin: you gain $1 for heads and lose $1 for tails. The flips are independent and each payoff has expected value zero, so the cumulative gains form a martingale.
Let \{S_{n}\} denote the position after n steps, defined recursively by S_{n+1} = S_{n} + X_{n+1}, where each X_{n+1} is an independent random variable taking values +1 or -1 with probability 1/2 each. Thus, the sequence \{X_{n}\} is iid with \operatorname{E}(X_{n}) = 0.5 \times 1 + 0.5 \times (-1) = 0, and \operatorname{V}(X_{n}) = 0.5 \times (1 - 0)^2 + 0.5 \times (-1 - 0)^2 = 1.
For any 0 \le m < n, we have S_{n} = S_{m} + \sum_{i = m + 1}^{n} X_{i}, so the increment S_{n} - S_{m} depends only on the coin flips between times m+1 and n. Moreover, increments over disjoint intervals are independent.
The process \{S_{n}\} is a martingale: for m < n, \operatorname{E}(S_{n} \mid \mathcal{F}_{m}) = S_{m} + \operatorname{E}\left(\sum_{i = m + 1}^{n} X_{i} \mid \mathcal{F}_{m}\right) = S_{m}, since the future coin flips are independent of the past.
The variance of the increment is \operatorname{V}(S_{n} - S_{m}) = \operatorname{V}\left(\sum_{i = m + 1}^{n} X_{i}\right) = \sum_{i = m + 1}^{n} \operatorname{V}(X_{i}) = n - m.
Finally, the quadratic variation of the simple symmetric random walk up to time n is [S, S]_{n} = \sum_{i = 1}^{n} (S_{i} - S_{i -1})^2 = \sum_{i = 1}^{n} X_{i}^2 = n, since each X_{i}^2 = 1. Unlike variance, quadratic variation is computed path-by-path, not as an average over many realizations.
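As a quick numerical check of these three facts, here is a minimal sketch (assuming Python with NumPy; all names are illustrative) that simulates many paths of the walk and verifies that the sample mean of S_{n} is near 0, its sample variance is near n, and the quadratic variation equals n exactly on every single path:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_paths = 1_000, 10_000

# Each flip X_i is +1 or -1 with probability 1/2.
X = rng.choice([-1, 1], size=(n_paths, n_steps))
S = X.cumsum(axis=1)              # S_n = X_1 + ... + X_n along each path

print(S[:, -1].mean())            # ~ 0: E(S_n) = 0
print(S[:, -1].var())             # ~ n_steps: V(S_n) = n

# Quadratic variation is computed path by path, not averaged:
QV = (X ** 2).sum(axis=1)         # sum of squared one-step increments
print(np.all(QV == n_steps))      # True: [S, S]_n = n on every path
```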
A Scaled Random Walk
Let’s now embed the symmetric random walk into a finite time interval and scale it so that the variance over any interval [0, t] equals t.
Let \Delta t = T / n, and define discrete time points: t_{0} = 0, t_{1} = \Delta t, t_{2} = 2 \Delta t, …, so that t_{n} = T. At each step, instead of moving by +1 or -1, we move by +\sqrt{\Delta t} or -\sqrt{\Delta t}. Thus, we define the scaled random walk as B_{t_{m}}^{(n)} = \sum_{j = 1}^{m} \sqrt{\Delta t} \, X_{j}, with B_{t_{0}}^{(n)} = 0, where each X_{j} is independent and takes values +1 or -1 with probability 1 / 2.
The expected value and variance are: \begin{aligned} \operatorname{E}(B_{t_{m}}^{(n)}) &= \sum_{j = 1}^{m} \sqrt{\Delta t} \, \operatorname{E}(X_{j}) = 0, \\ \operatorname{V}(B_{t_{m}}^{(n)}) &= \sum_{j = 1}^{m} \Delta t \, \operatorname{V}(X_{j}) = m \Delta t = t_{m}. \end{aligned}
The quadratic variation up to time 0 \le t_{m} \le T is [B^{(n)}, B^{(n)}]_{t_{m}} = \sum_{j = 0}^{m - 1} (B_{t_{j + 1}}^{(n)} - B_{t_{j}}^{(n)})^2 = \sum_{j = 1}^{m} \Delta t \, X_{j}^{2} = m \Delta t = t_{m}, since each squared increment equals \Delta t.
This scaling ensures that as n increases, the process has variance proportional to elapsed time, matching the behavior of a Brownian motion.
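A short sketch of this scaling (same assumptions as the previous snippet: Python with NumPy, illustrative names) confirms that the sample variance of B_{t_{m}}^{(n)} across paths is close to t_{m}:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, n_paths = 1.0, 1_000, 10_000
dt = T / n

X = rng.choice([-1.0, 1.0], size=(n_paths, n))
B = np.sqrt(dt) * X.cumsum(axis=1)   # column m-1 holds B at time t_m = m * dt

m = n // 2                           # check at t_m = 0.5
print(B[:, m - 1].var(), m * dt)     # both ~ 0.5: V(B_{t_m}) = t_m
```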
Limiting Distribution of the Scaled Random Walk
Remember that T = t_{n} = n \Delta t. To analyze the limiting distribution of B_{T}^{(n)} as n \to \infty (i.e., as \Delta t \to 0), we use the characteristic function, which uniquely determines the distribution of a random variable. The characteristic function of X is \phi_{X}(u) = \operatorname{E}[e^{i u X}], where i is the imaginary unit (i^{2} = -1). For a normal random variable X \sim \mathcal{N}(\mu, \sigma^{2}), the characteristic function is \phi_{X}(u) = e^{i u \mu - \frac{1}{2} u^{2} \sigma^{2}}.
Now, consider the scaled random walk B_{T}^{(n)} = \sum_{j = 1}^{n} \sqrt{\Delta t} \, X_{j}, where each X_{j} is independent and takes values \pm 1 with probability 1 / 2. Expanding each exponential to second order, e^{\pm i u \sqrt{\Delta t}} \approx 1 \pm i u \sqrt{\Delta t} - \frac{1}{2} u^{2} \Delta t, its characteristic function is \begin{aligned} \operatorname{E}(e^{i u B_{T}^{(n)}}) & = \operatorname{E}\left(e^{i u \sum_{j = 1}^{n} \sqrt{\Delta t}\, X_{j}}\right) = \prod_{j = 1}^{n} \operatorname{E}\left(e^{i u \sqrt{\Delta t}\, X_{j}}\right) = \left(\frac{e^{i u \sqrt{\Delta t}} + e^{-i u \sqrt{\Delta t}}}{2}\right)^{n} \\ & \approx \left(\frac{1 + i u \sqrt{\Delta t} - \frac{1}{2} u^{2} \Delta t + 1 - i u \sqrt{\Delta t} - \frac{1}{2} u^{2} \Delta t}{2}\right)^{n} \\ & = \left(1 + \frac{- \frac{1}{2} u^{2} T}{n}\right)^{n}. \end{aligned}
Therefore, \lim_{n \to \infty} \operatorname{E}[e^{i u B_{T}^{(n)}}] = \lim_{n \to \infty} \left(1 + \frac{- \frac{1}{2} u^{2} T}{n}\right)^{n} = e^{-\frac{1}{2} u^2 T}, which is the characteristic function of a normal distribution with mean 0 and variance T. Thus, as n \to \infty, the scaled random walk B_{T}^{(n)} converges in distribution to \mathcal{N}(0, T).
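The convergence can be seen numerically by comparing the empirical characteristic function of B_{T}^{(n)} against e^{-\frac{1}{2} u^{2} T} (a sketch under the same NumPy assumptions as above):

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, n_paths = 1.0, 2_000, 100_000
dt = T / n

# The sum of n coin flips equals 2*Binomial(n, 1/2) - n, so B_T^{(n)} can be
# sampled without storing every individual flip.
B_T = np.sqrt(dt) * (2.0 * rng.binomial(n, 0.5, size=n_paths) - n)

for u in [0.5, 1.0, 2.0]:
    empirical = np.exp(1j * u * B_T).mean()            # E[exp(i u B_T)]
    print(u, empirical.real, np.exp(-0.5 * u**2 * T))  # ~ e^{-u^2 T / 2}
```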
Brownian Motion
We obtain Brownian motion as the limit of the scaled random walks as n \to \infty, and it inherits its defining properties from them. This leads to the following definition: a Brownian motion is a process \{B_{t} : t \ge 0\} with B_{0} = 0 and continuous paths, whose increments over disjoint intervals are independent, and for which B_{t} - B_{s} \sim \mathcal{N}(0, t - s) for any 0 \le s \le t.
For any t \geq 0 we have \operatorname{E}(B_{t}) = \operatorname{E}(B_{t} - B_{0}) = 0, and for 0 \le s \le t we have \begin{aligned} \operatorname{Cov}(B_{s}, B_{t}) & = \operatorname{Cov}(B_{s}, B_{t} - B_{s} + B_{s}) \\ & = \operatorname{Cov}(B_{s}, B_{t} - B_{s}) + \operatorname{Cov}(B_{s}, B_{s}) \\ & = 0 + s = s. \end{aligned} Thus, for any s, t \geq 0 it must be the case that \operatorname{Cov}(B_{s}, B_{t}) = \operatorname{E}(B_{s} B_{t}) = \min(s, t). Together with Gaussianity, this covariance structure is enough to characterize the Brownian motion. It also suggests one way to rigorously construct the process, by noting that for 0 \le s \le 1 and 0 \le t \le 1 we have \operatorname{E}(B_{s} B_{t}) = \int_{0}^{1} \mathbf{1}_{[0, s]}(u) \, \mathbf{1}_{[0, t]}(u) \, du = \min(s, t). The previous relationship establishes an isometry between the Hilbert space \mathcal{H} \subset \mathcal{L}^{2}(\operatorname{P}) of Gaussian random variables with inner product \langle X, Y \rangle = \operatorname{E}(X Y), and the Hilbert space \mathcal{L}^{2}[0, 1] with inner product \langle f, g \rangle = \int_{0}^{1} f(u) g(u) \, du.
We can then show that for any 0 \le t \le 1 we have B_{t} = \sum_{k = 0}^{\infty} \langle \phi_{k}, \mathbf{1}_{[0, t]} \rangle Z_{k}, where \{Z_{k} : 0 \le k < \infty\} is a sequence of independent \mathcal{N}(0, 1) random variables, and \{\phi_{k} : 0 \le k < \infty\} is an orthonormal basis of \mathcal{L}^{2}[0, 1]. By using the Haar basis in the previous expression, it is possible to show that the series representation on the right generates a Brownian motion.
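The text appeals to the Haar basis; for a compact numerical illustration, the sketch below (Python with NumPy; the truncation level K is a hypothetical choice) instead uses the orthonormal basis \phi_{k}(u) = \sqrt{2} \cos((k + \tfrac{1}{2}) \pi u) of \mathcal{L}^{2}[0, 1], for which \langle \phi_{k}, \mathbf{1}_{[0, t]} \rangle = \sqrt{2} \sin((k + \tfrac{1}{2}) \pi t) / ((k + \tfrac{1}{2}) \pi), and truncates the series to approximate a Brownian path:

```python
import numpy as np

rng = np.random.default_rng(3)
K, n_grid = 2_000, 500          # truncation level and time grid (illustrative)

t = np.linspace(0.0, 1.0, n_grid)
Z = rng.standard_normal(K)      # independent N(0, 1) coefficients Z_k
a = (np.arange(K) + 0.5) * np.pi

# <phi_k, 1_[0,t]> = sqrt(2) * sin(a_k t) / a_k for phi_k(u) = sqrt(2) cos(a_k u)
coeffs = np.sqrt(2.0) * np.sin(np.outer(t, a)) / a
B = coeffs @ Z                  # truncated series: approximate Brownian path

print(B[0])                     # 0.0: the path starts at B_0 = 0
print(B[-1])                    # one draw of B_1, approximately N(0, 1)
```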
Total and Quadratic Variation
Total Variation of a Function
To study the variation of a function over an interval, we begin by dividing it into smaller pieces. Consider a partition of the time interval [0, T] given by \Pi = \{t_{0}, t_{1}, \ldots, t_{n}\}, where the partition points satisfy 0 = t_{0} < t_{1} < \ldots < t_{n} = T. The mesh (or norm) of the partition \Pi is defined as the length of the longest subinterval: \|\Pi\| = \max_{j = 0, 1, \ldots, n-1} (t_{j + 1} - t_{j}). As we refine the partition by adding more points, the mesh \|\Pi\| decreases and approaches zero.
The total variation of a function f over the interval [0, T] measures the total “distance traveled” by the function. It is defined as V_{T}(f) = \lim_{\|\Pi\| \to 0} \sum_{j = 0}^{n - 1} |f(t_{j + 1}) - f(t_{j})|. In words, we partition the interval into smaller pieces, sum the absolute changes in f across each piece, and then take the limit as the partition becomes arbitrarily fine.
For differentiable functions, the total variation has a simple integral representation. By the mean-value theorem, for each subinterval [t_{j}, t_{j+1}] there exists a point t_{j}^{*} \in [t_{j}, t_{j + 1}] where the derivative equals the average rate of change: \frac{f(t_{j + 1}) - f(t_{j})}{t_{j + 1} - t_{j}} = f'(t_{j}^{*}). Rearranging and taking absolute values gives \sum_{j = 0}^{n - 1} |f(t_{j + 1}) - f(t_{j})| = \sum_{j = 0}^{n - 1} |f'(t_{j}^{*})| (t_{j + 1} - t_{j}). The right-hand side is a Riemann sum for the integral of |f'(t)|. Taking the limit as the mesh goes to zero yields V_{T}(f) = \lim_{\|\Pi\| \to 0} \sum_{j = 0}^{n - 1} |f'(t_{j}^{*})| (t_{j + 1} - t_{j}) = \int_{0}^{T} |f'(t)| dt. Thus, for differentiable functions, total variation equals the integral of the absolute value of the derivative, a quantity that is always finite when f' is integrable.
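A numerical illustration of this identity (Python sketch; f(t) = \sin t chosen for concreteness): the partition sums converge to \int_{0}^{2\pi} |\cos t| \, dt = 4 as the mesh shrinks.

```python
import numpy as np

T = 2 * np.pi
for n in [10, 100, 10_000]:
    t = np.linspace(0.0, T, n + 1)         # uniform partition of [0, T]
    f = np.sin(t)
    tv = np.abs(np.diff(f)).sum()          # sum of |f(t_{j+1}) - f(t_j)|
    print(n, tv)                           # -> integral of |cos t| = 4
```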
Quadratic Variation of a Function
While total variation measures the absolute distance traveled by a function, quadratic variation measures the sum of squared changes. For a function f over the interval [0, T], the quadratic variation is defined as [f, f]_{T} = \lim_{\|\Pi\| \to 0} \sum_{j = 0}^{n - 1} (f(t_{j + 1}) - f(t_{j}))^{2}.
For continuously differentiable functions, the quadratic variation vanishes. To see why, we again apply the mean-value theorem: for each subinterval there exists a point t_{j}^{*} \in [t_{j}, t_{j+1}] such that f(t_{j + 1}) - f(t_{j}) = f'(t_{j}^{*}) (t_{j + 1} - t_{j}). Squaring both sides and summing over all subintervals gives \sum_{j = 0}^{n - 1} (f(t_{j + 1}) - f(t_{j}))^{2} = \sum_{j = 0}^{n - 1} (f'(t_{j}^{*}))^{2} (t_{j + 1} - t_{j})^{2}. Since (t_{j + 1} - t_{j}) \le \|\Pi\| for all j, we can factor out one power of the mesh: \sum_{j = 0}^{n - 1} (f(t_{j + 1}) - f(t_{j}))^{2} \le \|\Pi\| \sum_{j = 0}^{n - 1} (f'(t_{j}^{*}))^{2} (t_{j + 1} - t_{j}). The sum on the right is a Riemann sum for \int_{0}^{T} (f'(t))^{2} dt, which converges to this integral as \|\Pi\| \to 0. Therefore, [f, f]_{T} \le \lim_{\|\Pi\| \to 0} \|\Pi\| \sum_{j = 0}^{n - 1} (f'(t_{j}^{*}))^{2} (t_{j + 1} - t_{j}) = 0 \cdot \int_{0}^{T} (f'(t))^{2} dt = 0. In other words, for smooth functions the quadratic variation is always zero because the mesh shrinks faster than the Riemann sum can accumulate.
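Repeating the previous experiment with squared increments shows the quadratic variation of the same smooth function vanishing at rate roughly \|\Pi\| = T / n:

```python
import numpy as np

T = 2 * np.pi
for n in [10, 100, 10_000]:
    t = np.linspace(0.0, T, n + 1)         # uniform partition of [0, T]
    f = np.sin(t)
    qv = (np.diff(f) ** 2).sum()           # sum of squared increments
    print(n, qv)                           # shrinks like T/n -> 0
```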
Quadratic Variation of Brownian Motion
Unlike smooth functions whose quadratic variation is zero, Brownian motion has non-trivial quadratic variation that accumulates linearly over time. We now prove that the quadratic variation of Brownian motion over [0, T] converges to T in mean square.
Analysis
Consider a partition \Pi of [0, T] and define the quadratic variation sum: Q_{\Pi} = \sum_{j = 0}^{n - 1} (B_{t_{j + 1}} - B_{t_{j}})^{2}.
Our strategy is to show two things: first, that the expected value of Q_{\Pi} equals T for any partition; second, that the variance of Q_{\Pi} vanishes as the mesh size goes to zero. Together, these facts imply convergence in L^{2}(\operatorname{P}).
Step 1: Expected value of Q_{\Pi}.
Since each increment B_{t_{j+1}} - B_{t_j} is normally distributed with mean zero and variance t_{j+1} - t_j, we have
\operatorname{E}(Q_{\Pi}) = \sum_{j = 0}^{n - 1} \operatorname{E}(B_{t_{j + 1}} - B_{t_{j}})^{2} = \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j}) = t_{n} - t_{0} = T.
Thus, regardless of how we partition the interval, the expected quadratic variation is always T.
Step 2: Variance of Q_{\Pi}.
To show that Q_{\Pi} concentrates around its mean as \|\Pi\| \to 0, we compute its variance. For a standard normal random variable Z \sim \mathcal{N}(0,1), we have \operatorname{E}(Z^4) = 3. Since (B_{t_{j+1}} - B_{t_j})/\sqrt{t_{j+1} - t_j} \sim \mathcal{N}(0,1), it follows that
\begin{aligned}
\operatorname{V}(B_{t_{j + 1}} - B_{t_{j}})^{2} & = \operatorname{E}(B_{t_{j + 1}} - B_{t_{j}})^{4} - \left(\operatorname{E}(B_{t_{j + 1}} - B_{t_{j}})^{2}\right)^{2} \\
& = 3 (t_{j + 1} - t_{j})^{2} - (t_{j + 1} - t_{j})^{2} = 2 (t_{j + 1} - t_{j})^{2}.
\end{aligned}
Because Brownian increments over disjoint intervals are independent, the variance of their sum equals the sum of their variances:
\operatorname{V}(Q_{\Pi}) = \sum_{j = 0}^{n - 1} \operatorname{V}(B_{t_{j + 1}} - B_{t_{j}})^{2} = 2 \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j})^{2}.
Each difference (t_{j+1} - t_j) is bounded by the mesh \|\Pi\|, so we can write \operatorname{V}(Q_{\Pi}) = 2 \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j})^{2} \le 2 \|\Pi\| \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j}) = 2 \|\Pi\| \, T. As the partition is refined and \|\Pi\| \to 0, the variance \operatorname{V}(Q_{\Pi}) also goes to zero: \lim_{\|\Pi\| \to 0} \operatorname{V}(Q_{\Pi}) \le 2 \lim_{\|\Pi\| \to 0} \|\Pi\| \, T = 0.
Step 3: Convergence in L^{2}(\operatorname{P}) and in probability.
Since \operatorname{E}(Q_{\Pi}) = T and \operatorname{V}(Q_{\Pi}) \to 0, the mean-squared error converges to zero:
\lim_{\|\Pi\| \to 0} \operatorname{E}(Q_{\Pi} - T)^{2} = \lim_{\|\Pi\| \to 0} \operatorname{V}(Q_{\Pi}) = 0.
This is L^2 convergence: Q_{\Pi} \to T in L^{2}(\operatorname{P}).
Moreover, by Chebyshev’s inequality, for any \varepsilon > 0, \operatorname{P}(|Q_{\Pi} - T| > \varepsilon) \le \frac{\operatorname{V}(Q_{\Pi})}{\varepsilon^{2}} \to 0 \quad \text{as } \|\Pi\| \to 0. This shows that Q_{\Pi} converges to T in probability: Q_{\Pi} \xrightarrow{\operatorname{P}} T.
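A Monte Carlo check of Steps 1-3 (Python sketch with NumPy; uniform partitions, illustrative names): for a uniform partition, \operatorname{V}(Q_{\Pi}) = 2 \sum_{j} (t_{j + 1} - t_{j})^{2} = 2 \|\Pi\| T exactly, and the sample statistics should match.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_paths = 1.0, 5_000

for n in [10, 100, 1_000]:
    dt = T / n
    # Brownian increments over a uniform partition: independent N(0, dt).
    dB = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
    Q = (dB ** 2).sum(axis=1)                  # quadratic variation sum Q_Pi
    print(n, Q.mean(), Q.var(), 2 * dt * T)    # mean ~ T, variance ~ 2 ||Pi|| T
```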
The Main Results
We see that the quadratic variation of Brownian motion over [0, t] for any 0 \le t \le T equals t: [B, B]_{t} = t.
We observe that the differential of the quadratic variation, denoted d[B, B]_{t}, equals dt. This relationship is often expressed in shorthand as (dB)(dB) = dt: in the limit, the sum of the squared increments of Brownian motion behaves like the sum of the time increments themselves, \int_{0}^{T} d[B, B]_{t} = T. It is important to note, however, that an individual squared increment (B_{t_{j + 1}} - B_{t_{j}})^2 does not equal the time interval t_{j + 1} - t_{j}; rather, \frac{B_{t_{j + 1}} - B_{t_{j}}}{\sqrt{t_{j + 1} - t_{j}}} \sim \mathcal{N}(0, 1), so the identity holds only for the aggregated sums in the limit.
We can also establish that \left| \sum_{j = 0}^{n - 1} (B_{t_{j + 1}} - B_{t_{j}}) (t_{j + 1} - t_{j}) \right| \le M_{\Pi} \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j}) = M_{\Pi} \, T, where M_{\Pi} = \max_{0 \le j \le n - 1} |B_{t_{j + 1}} - B_{t_{j}}|. Since Brownian paths are uniformly continuous over any finite interval, M_{\Pi} \to 0 as \|\Pi\| \to 0, so the left-hand side vanishes in the limit. Similarly, \lim_{\|\Pi\| \to 0} \sum_{j = 0}^{n - 1} (t_{j + 1} - t_{j})^{2} \le \lim_{\|\Pi\| \to 0} \|\Pi\| \, T = 0. These results imply that the product of the increments of Brownian motion and the corresponding time intervals approaches zero, which we denote as (dB)(dt) = 0. Similarly, the product of two infinitesimal time intervals is also zero, expressed as (dt)(dt) = 0.
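The cross term can be checked numerically in the same way (same NumPy assumptions as the previous sketch): on a uniform partition, \sum_{j} (B_{t_{j + 1}} - B_{t_{j}})(t_{j + 1} - t_{j}) = \Delta t \, B_{T}, which shrinks with the mesh.

```python
import numpy as np

rng = np.random.default_rng(5)
T, n_paths = 1.0, 5_000

for n in [10, 100, 1_000]:
    dt = T / n
    dB = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
    cross = (dB * dt).sum(axis=1)              # sum of (dB)(dt) terms per path
    print(n, np.abs(cross).mean())             # -> 0: (dB)(dt) = 0
```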
Brownian Motion is Nowhere Differentiable
To see why Brownian paths cannot be differentiable, we first show that their total variation is infinite. Recall that M_{\Pi} = \max_{0 \le j \le n - 1} |B_{t_{j + 1}} - B_{t_{j}}| \to 0 as \|\Pi\| \to 0, by the uniform continuity of Brownian motion over any finite interval, while Q_{\Pi} = \sum_{j = 0}^{n - 1} (B_{t_{j + 1}} - B_{t_{j}})^2 \to T. For each increment, we have the inequality (B_{t_{j + 1}} - B_{t_{j}})^2 \le M_{\Pi} \, |B_{t_{j + 1}} - B_{t_{j}}|, and summing over j leads to the conclusion that \sum_{j = 0}^{n - 1} \big|B_{t_{j +1}} - B_{t_{j}}\big| \; \ge \; \frac{Q_{\Pi}}{M_{\Pi}}. As we refine the partition such that \|\Pi\| \to 0, the right-hand side diverges. This implies that the total variation of Brownian motion over the interval [0, T] is almost surely infinite.
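One can watch the blow-up numerically (Python sketch; a single path sampled on a fine grid, then coarsened): the finer the partition of the same path, the larger the sum of |\Delta B|, consistent with infinite total variation in the limit.

```python
import numpy as np

rng = np.random.default_rng(6)
T, n_fine = 1.0, 2 ** 16

# One Brownian path on a fine grid; coarser partitions reuse the same path.
dB = rng.standard_normal(n_fine) * np.sqrt(T / n_fine)
B = np.concatenate([[0.0], dB.cumsum()])

for step in [2 ** 10, 2 ** 6, 2 ** 2, 1]:
    tv = np.abs(np.diff(B[::step])).sum()      # total-variation sum on sub-partition
    print(n_fine // step, tv)                  # grows without bound as mesh -> 0
```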
In particular, Brownian paths cannot be continuously differentiable on any interval: as shown earlier, a continuously differentiable function has finite total variation, equal to \int_{0}^{T} |f'(t)| \, dt. In fact, a stronger result holds: Brownian paths are almost surely nowhere differentiable.