Commodity Pricing Models
Introduction
The Black-Scholes model assumes that stock prices follow a Geometric Brownian motion with a constant drift. The model can be modified to account for a constant dividend yield. The risk-neutral dynamics for the stock price in this case are given by \frac{dS}{S} = (r - q) dt + \sigma dB_{S}^{*}, where I use q to denote the dividend yield paid by the stock. The futures price in this case is just F(T) = \operatorname{E}^{*} S_{T} = S e^{(r - q) T}. This formula follows from a simple no-arbitrage argument. To replicate the payoff of the futures contract, an investor can borrow S at the risk-free rate r, purchase one unit of the asset, and collect the dividend yield q over the life of the contract. The net cost of delivering the asset at expiration is Se^{(r-q)T}, which must equal the futures price to rule out arbitrage.
If r > q, the futures price will always be an increasing function of T, whereas if r < q the function will be decreasing. Even though for stocks this assumption could be a good approximation of what we observe in practice, for commodities we typically see the slope of the term-structure of futures prices to change sign over time.
One explanation for a time-varying slope in the term-structure of future prices is that the implicit dividend that accrues to the owner of the physical commodity might vary over time. We call this implicit dividend the convenience yield of the commodity. For many commodities, it represents the fact that the commodity can be put to use in productive activities and therefore is valuable to have it in storage. For example, a refinery that holds crude oil in inventory avoids costly production shutdowns in the event of a supply disruption, effectively earning an implicit return from physical ownership that a holder of a paper futures contract does not receive.
A simple way to introduce time-variation in the convenience yield is to assume that it follows a mean-reverting process. If the convenience yield of the commodity is too high, then high cost producers will start extracting the commodity increasing supply and reducing the value of physically owning the commodity. On the contrary, if the convenience yield is too low, supply decreases and therefore the value of owning the physical commodity increases. The convenience yield is therefore an implicit dividend net of storage costs.
A simple model that captures these ideas was first introduced by Gibson and Schwartz (1990). In the model, the commodity spot price S follows a geometric Brownian motion and the convenience yield q follows an Ornstein-Uhlenbeck process such that \begin{aligned} \frac{dS}{S} & = (\mu_{S} - q) dt + \sigma_{S} dB_{S}, \\ dq & = \kappa (\bar{q} - q) dt + \sigma_{q} dB_{q}, \end{aligned} where dB_{S} dB_{q} = \rho_{s,q} dt. Note that the process for q causes the convenience yield to revert back to its long-run value \bar{q}. The speed of this mean-reversion is determined by \kappa, but also by how volatile the convenience yield is.
The paper by Gibson and Schwartz (1990) was one of the first multifactor pricing models to introduce variation not only in the spot price but also in the dividend yield.
An alternative two-factor model of commodity prices was introduced by Schwartz and Smith (2000). The motivation for their model is different from Gibson and Schwartz (1990). Schwartz and Smith (2000) assume that the log-spot price is subject to two types of shocks, permanent and temporary shocks. In the Schwartz and Smith (2000) model, permanent shocks are modeled by an arithmetic Brownian motion whereas the temporary shocks mean-revert to a zero mean. Permanent shocks capture long-run shifts in the equilibrium price level, such as those driven by technological change in extraction methods or major geopolitical events, while temporary shocks capture transitory supply and demand imbalances that are expected to self-correct over time.
More specifically, in the Schwartz and Smith (2000) model we have that \ln(S) = x + y, where x denotes a permanent shock and y denotes a temporary shock. The dynamics of these processes are described by \begin{aligned} dx & = \mu_{x} dt + \sigma_{x} dB_{x}, \\ dy & = - \kappa y dt + \sigma_{y} dB_{y}, \end{aligned} where dB_{x} dB_{y} = \rho_{x, y} dt.
Even though the models look different, it turns out that they are equivalent. Dai and Singleton (2000) show that many multifactor models of interest rates can be rotated and translated to produce equivalent models. It turns out that the same is true for multifactor models of commodity prices. In the appendix I show how the parameters and Brownian motions of the model of Schwartz and Smith (2000) can be written in terms of the parameters and Brownian motions of the Gibson and Schwartz (1990) model. This equivalence implies that both models produce identical futures prices and are therefore observationally indistinguishable using futures data alone.
In the following, I will solve for the futures price in the Schwartz and Smith (2000) model since their way of writing the model makes it easier to present the solution method.
The Model
Recall that x captures the permanent component of \ln S and evolves as an arithmetic Brownian motion with drift, while y captures the temporary component and mean-reverts to zero. In this section we derive the moments of x_{T}, y_{T}, and hence S_{T} under the physical measure \operatorname{P}, which will serve as the building blocks for pricing futures contracts once we adjust for risk.
Consider two independent Brownian motions B_{x} and B_{z} defined on the probability space (\Omega, \mathcal{F}, \operatorname{P}), and define B_{y} = \rho_{x, y} B_{x} + \sqrt{1 - \rho_{x, y}^{2}} B_{z}. Then B_{y} is a Brownian motion such that dB_{x} dB_{y} = \rho_{x, y} dt.
Consider first the arithmetic Brownian motion process for x that in the model is given by dx = \mu_{x} dt + \sigma_{x} dB_{x}. Then, x_{T} = x_{0} + \mu_{x} T + \sigma_{x} B_{x T}. Since B_{x, T} is normally distributed with mean 0 and variance T, we have that x_{T} is normal with mean and variance given by \begin{aligned} \operatorname{E}(x_{T}) & = x_{0} + \mu_{x} T, \\ \operatorname{V}(x_{T}) & = \sigma_{x}^{2} T. \end{aligned}
Now consider the Ornstein–Uhlenbeck process for y which in the model is given by dy = -\kappa y dt + \sigma_{y} dB_{y}. To solve for y_{T}, introduce the integrating factor z_{t} = y_{t} e^{\kappa t}, the standard device for solving linear SDEs. Applying Ito’s lemma to z we find dz = e^{\kappa t} dy + \kappa y e^{\kappa t} dt = \sigma_{y} e^{\kappa t} dB_{y}. Integrating both sides from 0 to T, we find z_{T} = z_{0} + \sigma_{y} \int_{0}^{T} e^{\kappa t} dB_{y t}, or in terms of y we can write y_{T} = y_{0} e^{-\kappa T} + \sigma_{y} e^{-\kappa T} \int_{0}^{T} e^{\kappa t} dB_{y t}. Because \int_{0}^{T} e^{\kappa t} dB_{y t} is normal with mean 0 and variance \int_{0}^{T} e^{2 \kappa t} (dB_{y t})^{2} = \int_{0}^{T} e^{2 \kappa t} dt = \frac{e^{2 \kappa T} - 1}{2 \kappa}, the future value of y at time T is also normal with mean and variance given by \begin{aligned} \operatorname{E}(y_{T}) & = y_{0} e^{-\kappa T} \\ \operatorname{V}(y_{T}) & = \sigma_{y}^{2} \frac{1 - e^{- 2 \kappa T}}{2 \kappa}. \end{aligned}
Finally, we can compute the covariance between x_{T} and y_{T} as \begin{aligned} \operatorname{Cov}(x_{T}, y_{T}) & = \sigma_{x} \sigma_{y} e^{-\kappa T} \int_{0}^{T} e^{\kappa t} dB_{xt} dB_{yt} \\ & = \sigma_{x} \sigma_{y} e^{-\kappa T} \int_{0}^{T} e^{\kappa t} \rho_{x, y} dt \\ & = \rho_{x, y} \sigma_{x} \sigma_{y} \frac{1 - e^{-\kappa T}}{\kappa}. \end{aligned}
Because x_{T} and y_{T} are jointly normal, we have that x_{T} + y_{T} is normally distributed. Thus, \begin{aligned} \operatorname{E}(S_{T}) & = \operatorname{E}(e^{x_{T} + y_{T}}) \\ & = \exp\left(\operatorname{E}(x_{T} + y_{T}) + \frac{1}{2} \operatorname{V}(x_{T} + y_{T})\right) \\ & = \exp\left( \operatorname{E}(x_{T}) + \operatorname{E}(y_{T}) + \frac{1}{2} \operatorname{V}(x_{T}) + \frac{1}{2} \operatorname{V}(y_{T}) + \operatorname{Cov}(x_{T}, y_{T}) \right) \\ & = \exp\left( x_{0} + y_{0} e^{-\kappa T} + \mu_{x} T + \frac{1}{2} \sigma_{x}^{2} T + \frac{1}{2} \sigma_{y}^{2} \frac{1 - e^{- 2 \kappa T}}{2 \kappa} + \rho_{x, y} \sigma_{x} \sigma_{y} \frac{1 - e^{-\kappa T}}{\kappa} \right). \end{aligned}
Also, note that because in the model \ln(S_{T}) is normal, we can easily answer questions such as what is the probability that the commodity price at time T will be greater than K. Indeed, \begin{aligned} \operatorname{P}(S_{T} > K) & = \operatorname{P}(\ln S_{T} > \ln K) \\ & = \operatorname{P}\left( Z > \frac{\ln K - \operatorname{E}\ln(S_{T})}{\sqrt{\operatorname{V}\ln(S_{T})}} \right) \\ & = \operatorname{P}\left( Z < \frac{\operatorname{E}\ln(S_{T}) - \ln K}{\sqrt{\operatorname{V}\ln(S_{T})}} \right), \end{aligned} where Z \sim \mathcal{N}(0, 1).
Adjusting for Risk
In order to be able to use the model to price futures contracts, we need first to adjust for risk. Ideally, we would like to adjust each of the drift parameters in x and y for risk. In their original paper, Schwartz and Smith (2000) only adjust \mu and the level of y, which corresponds to restricting \lambda_{1z} = 0 and therefore \kappa^{*} = \kappa, so that the mean-reversion speed is the same under both measures. We can easily also adjust \kappa by introducing a time-varying market price of risk for B_{z}.
In order to do this, let \frac{d\Lambda}{\Lambda} = -r dt - \lambda_{x} dB_{x} - (\lambda_{0z} + \lambda_{1z} y) dB_{z}, be the stochastic discount factor. We assume that both Brownian motions B_{x} and B_{z} are spanned by existing traded contracts so that \lambda_{x}, \lambda_{0z} and \lambda_{1z} are uniquely identified.
Let \mathcal{E} = \Lambda \beta where \beta_{t} = \beta_{0} e^{\int_{0}^{t} r_{s} ds}. Since a futures contract requires no initial investment, its price does not depend on the level or dynamics of the interest rate. We therefore do not need to specify the dynamics of r beyond the requirement that \mathcal{E} be a martingale under \operatorname{P}. Applying Ito’s lemma to \mathcal{E} we find that, \frac{d\mathcal{E}}{\mathcal{E}} = - \lambda_{x} dB_{x} - (\lambda_{0z} + \lambda_{1z} y) dB_{z}. Thus, \frac{d\mathcal{E}}{\mathcal{E}} dB_{x} = -\lambda_{x} dt, and \begin{aligned} \frac{d\mathcal{E}}{\mathcal{E}} dB_{y} & = \frac{d\mathcal{E}}{\mathcal{E}} \left( \rho_{x, y} dB_{x} + \sqrt{1 - \rho_{x, y}^{2}} dB_{z} \right) \\ & = - \lambda_{x} \rho_{x, y} dt - \sqrt{1 - \rho_{x, y}^{2}} (\lambda_{0z} + \lambda_{1z} y) dt \\ & = - \left( \lambda_{x} \rho_{x, y} + \sqrt{1 - \rho_{x, y}^{2}} \lambda_{0z} \right) dt - \sqrt{1 - \rho_{x, y}^{2}} \lambda_{1z} y \, dt \\ & = - (\lambda_{0y} + \lambda_{1y} y) dt, \end{aligned} where \lambda_{0y} = \lambda_{x} \rho_{x, y} + \sqrt{1 - \rho_{x, y}^{2}} \lambda_{0z} and \lambda_{1y} = \sqrt{1 - \rho_{x, y}^{2}} \lambda_{1z}.
According to Girsanov’s theorem, \begin{aligned} B_{xt}^{*} & = B_{xt} + \lambda_{x} t, \\ B_{yt}^{*} & = B_{yt} + (\lambda_{0y} + \lambda_{1y} y) t, \end{aligned} are \operatorname{P}^{*}-Brownian motions where the measure \operatorname{P}^{*} is defined through its Radon-Nikodym derivative as \frac{d\operatorname{P}^{*}}{d\operatorname{P}} = \mathcal{E}_{T}. The risk-adjusted processes for x and y are then \begin{aligned} dx & = \mu_{x}^{*} dt + \sigma_{x} dB_{x}^{*}, \\ dy & = (- \kappa^{*} y - \lambda^{*}) dt + \sigma_{y} dB_{y}^{*}, \end{aligned} where \mu_{x}^{*} = \mu_{x} - \sigma_{x} \lambda_{x}, \lambda^{*} = \sigma_{y} \lambda_{0y}, and \kappa^{*} = \kappa + \sigma_{y} \lambda_{1y}.
Solving for the Futures Price
The futures price expiring at T is the expected spot price under the risk-neutral measure, i.e., F(T) = \operatorname{E}^{*} (S_{T}). Since the change of measure only changes constant coefficients into risk-adjusted constant coefficients, the log of the spot price is still normally distributed under the risk-neutral measure \operatorname{P}^{*}. Thus, the futures price expiring at T is just F(T) = \exp\left( \operatorname{E}^{*}(x_{T}) + \operatorname{E}^{*}(y_{T}) + \frac{1}{2} \operatorname{V}^{*}(x_{T}) + \frac{1}{2} \operatorname{V}^{*}(y_{T}) + \operatorname{Cov}^{*}(x_{T}, y_{T}) \right). The only difference between \operatorname{E}^{*}(x_{T}), \operatorname{V}^{*}(x_{T}), \operatorname{V}^{*}(y_{T}), \operatorname{Cov}^{*}(x_{T}, y_{T}) and their unstarred counterparts is that starred moments use \mu^{*} and \kappa^{*} instead of \mu and \kappa. So we have that \begin{aligned} \operatorname{E}^{*}(x_{T}) & = x_{0} + \mu_{x}^{*} T, \\ \operatorname{V}^{*}(x_{T}) & = \sigma_{x}^{2} T, \\ \operatorname{V}^{*}(y_{T}) & = \sigma_{y}^{2} \frac{1 - e^{- 2 \kappa^{*} T}}{2 \kappa^{*}}, \\ \operatorname{Cov}^{*}(x_{T}, y_{T}) & = \rho_{x, y} \sigma_{x} \sigma_{y} \frac{1 - e^{-\kappa^{*} T}}{\kappa^{*}}. \end{aligned}
The only starred moment that is different is \operatorname{E}^{*}(y_{T}) since now the risk-adjusted process for y has an extra component given by \lambda^{*}. If we follow the same method used to solve for y_{T} under \operatorname{P}, we find that y_{T} = y_{0} e^{-\kappa^{*} T} - \lambda^{*} \frac{1 - e^{-\kappa^{*} T}}{\kappa^{*}} + \sigma_{y} e^{-\kappa^{*} T} \int_{0}^{T} e^{\kappa^{*} t} dB_{y t}^{*}. Therefore, \operatorname{E}^{*}(y_{T}) = y_{0} e^{-\kappa^{*} T} - \lambda^{*} \frac{1 - e^{-\kappa^{*} T}}{\kappa^{*}}. The futures price in the Schwartz and Smith (2000) model is then F(T) = \exp\left( x_{0} + y_{0} e^{-\kappa^{*} T} + \mu_{x}^{*} T - \lambda^{*} \frac{1 - e^{-\kappa^{*} T}}{\kappa^{*}} + \frac{1}{2} \sigma_{x}^{2} T + \frac{1}{2} \sigma_{y}^{2} \frac{1 - e^{- 2 \kappa^{*} T}}{2 \kappa^{*}} + \rho_{x, y} \sigma_{x} \sigma_{y} \frac{1 - e^{-\kappa^{*} T}}{\kappa^{*}} \right).
The formula reveals how each component shapes the term structure of futures prices. For short maturities, the temporary component y_{0} e^{-\kappa^{*} T} is the dominant force: if y_{0} > 0 the market is in backwardation (futures prices decrease with maturity), whereas if y_{0} < 0 the market is in contango (futures prices increase with maturity). As T \to \infty, the temporary component vanishes and the futures price grows at the rate determined by the permanent drift \mu_{x}^{*}. The speed at which the market transitions between these short-run and long-run regimes is governed by the risk-adjusted mean-reversion speed \kappa^{*}.
Appendix
In this appendix I show the equivalence between the models of Gibson and Schwartz (1990) and Schwartz and Smith (2000). We start with the model of Gibson and Schwartz (1990), \begin{aligned} \frac{dS}{S} & = (\mu - q) dt + \sigma_{S} dB_{S}, \\ dq & = \kappa (\bar{q} - q) dt + \sigma_{q} dB_{q}, \end{aligned} where dB_{S} dB_{q} = \rho_{s,q}.
We know that \ln(S) follows an arithmetic \operatorname{P}-Brownian motion d\ln(S) = \frac{dS}{S} - \frac{1}{2}\left(\frac{dS}{S}\right)^{2} = \left( \mu - \frac{1}{2} \sigma_{S}^{2} \right) dt - q dt + \sigma_{S} dB_{S}. Define z = q - \bar{q}. Then, \begin{aligned} d\ln(S) & = \left( \mu - \frac{1}{2} \sigma_{S}^{2} - \bar{q} \right) dt - z dt + \sigma_{S} dB_{S}, \\ dz & = - \kappa z dt + \sigma_{q} dB_{q}. \end{aligned} Since - z dt = \frac{dz - \sigma_{q} dB_{q}}{\kappa}, we can write d\ln(S) as d\ln(S) = \left( \mu - \frac{1}{2} \sigma_{S}^{2} - \bar{q} \right) dt + \frac{dz}{\kappa} + \sigma_{S} dB_{S} - \frac{\sigma_{q}}{\kappa} dB_{q}. Let y = z / \kappa. Then, dy = \frac{dz}{\kappa} = -z dt + \frac{\sigma_{q}}{\kappa} dB_{q} = - \kappa y dt + \frac{\sigma_{q}}{\kappa} dB_{q}, where the last equality uses z = \kappa y. Define dx = \mu_{x} dt + \sigma_{x} dB_{x}, where \begin{aligned} \mu_{x} & = \mu - \frac{1}{2} \sigma_{S}^{2} - \bar{q}, \\ \sigma_{x} & = \sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}, \\ dB_{x} & = \frac{1}{\sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}} \left( \sigma_{S} dB_{S} - \frac{\sigma_{q}}{\kappa} dB_{q} \right). \end{aligned} Clearly, B_{x} is a Brownian motion such that dB_{x} dB_{q} = \frac{1}{\sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}} \left( \rho_{S, q} \sigma_{S} - \frac{\sigma_{q}}{\kappa} \right) dt.
In the Schwartz and Smith (2000) model we have that \begin{aligned} \ln(S) & = x + y \\ dx & = \mu_{x} dt + \sigma_{x} dB_{x}, \\ dy & = - \kappa y dt + \sigma_{y} dB_{y}, \end{aligned} where dB_{x} dB_{y} = \rho_{x, y} dt.
Therefore, the parameters and Brownian motions in the Schwartz and Smith (2000) model can be defined in terms of the parameters and Brownian motions in the Gibson and Schwartz (1990) model as \begin{aligned} \mu_{x} & = \mu - \frac{1}{2} \sigma_{S}^{2} - \bar{q}, \\ \sigma_{x} & = \sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}, \\ \sigma_{y} & = \frac{\sigma_{q}}{\kappa}, \\ B_{x} & = \frac{1}{\sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}} \left( \sigma_{S} B_{S} - \frac{\sigma_{q}}{\kappa} B_{q} \right), \\ B_{y} & = B_{q}, \\ \rho_{x, y} & = \frac{1}{\sqrt{\sigma_{S}^{2} + \sigma_{q}^{2} / \kappa^{2}}} \left( \rho_{S, q} \sigma_{S} - \frac{\sigma_{q}}{\kappa} \right). \end{aligned}