The GARCH Model

Modeling Conditional Volatility

Even though the rolling window approach to estimating volatility is easy to understand and implement, it has some important drawbacks:

  • it gives the same weight to all observations in the window
  • it discards every observation outside the window, so it does not use all the information in the sample to estimate the volatility at any point in time

The idea of a GARCH model is to impose additional structure on the evolution of volatility over time. Engle (1982) introduced the ARCH model to capture time-varying volatility in inflation, and Bollerslev (1986) extended it by allowing the conditional variance to depend on its own lags, giving rise to the GARCH model. One of the simplest GARCH models has the following structure: \[ \begin{align*} r_{t} & = \mu + \sigma_{t} \varepsilon_{t} \\ \sigma_{t}^{2} & = (1 - \alpha - \beta) \sigma_{L}^{2} + \alpha r_{t-1}^{2} + \beta \sigma_{t-1}^{2}, \end{align*} \] where \(\{\varepsilon_{t}\}_{t \geq 0}\) is an iid sequence of \(N(0, 1)\) random variables and \(\sigma_{t}^{2}\) denotes the conditional variance. This model is called a GARCH(1,1) model because the conditional variance uses one lagged squared return and one lagged variance estimate.

Engle, Robert F. 1982. “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation.” Econometrica 50 (4): 987–1007.
Bollerslev, Tim. 1986. “Generalized Autoregressive Conditional Heteroskedasticity.” Journal of Econometrics 31 (3): 307–27.

The variance equation says that the conditional variance is a weighted average of three components:

  • The long run variance denoted by \(\sigma_{L}^{2}\)
  • The last squared return \(r_{t-1}^{2}\)
  • The previous variance estimate \(\sigma_{t-1}^{2}\)

The model recursively estimates the variance by updating the previous estimate with the latest squared return and by giving a certain weight to the long-run variance of the model.

In the estimation, it is usual to make the following re-parametrization \(\omega = (1 - \alpha - \beta) \sigma_{L}^{2}\) so the model for the variance becomes \[ \sigma_{t}^{2} = \omega + \alpha r_{t-1}^{2} + \beta \sigma_{t-1}^{2}. \]
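To build intuition for this recursion, the following sketch (my own illustration, not part of the original analysis) simulates a GARCH(1,1) return path, with parameter values chosen close to the daily S&P 500 estimates reported later in this section.

```python
import numpy as np

# Simulate a GARCH(1,1) path; parameter values are illustrative,
# roughly matching the daily S&P 500 estimates below.
rng = np.random.default_rng(42)
n, mu = 2500, 0.05
omega, alpha, beta = 0.02, 0.11, 0.87
sigma_L2 = omega / (1 - alpha - beta)   # long-run variance (= 1 here)

sigma2 = np.empty(n)
r = np.empty(n)
sigma2[0] = sigma_L2                    # start the recursion at the long-run variance
for t in range(n):
    r[t] = mu + np.sqrt(sigma2[t]) * rng.standard_normal()
    if t + 1 < n:
        sigma2[t + 1] = omega + alpha * r[t] ** 2 + beta * sigma2[t]
```

Because \(\alpha + \beta = 0.98\), the variance process is highly persistent and the simulated path displays the volatility clustering typical of daily equity returns, while the sample standard deviation of the simulated returns stays close to \(\sigma_L = 1\).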

Parameter Estimation

The GARCH model is usually estimated by maximizing the likelihood of the model innovations given the observed data. The i.i.d. assumption on the residuals implies that the full likelihood function is given by \[ L = \prod_{t=1}^{n} \frac{1}{\sqrt{2 \pi \sigma_{t}^{2}}} \exp\left(-\frac{1}{2} \left(\frac{r_{t} - \mu}{\sigma_{t}}\right)^{2}\right). \] Maximizing the likelihood function with respect to the model parameters is equivalent to maximizing the log-likelihood given by \[ \ln(L) = -\frac{n}{2} \ln(2 \pi) - \frac{1}{2} \sum_{t=1}^{n} \left( \ln(\sigma_{t}^{2}) + \left(\frac{r_{t} - \mu}{\sigma_{t}}\right)^{2} \right). \] Note that maximizing the above quantity is equivalent to minimizing the following sum with respect to \(\mu,\) \(\omega,\) \(\alpha,\) and \(\beta,\) \[ S = \sum_{t=1}^{n} \left( \ln(\sigma_{t}^{2}) + \left(\frac{r_{t} - \mu}{\sigma_{t}}\right)^{2} \right). \]

Therefore, estimating a GARCH model by maximum likelihood is similar to OLS except that here we minimize a different sum. Unlike OLS, though, there is no closed-form solution for this minimization and an optimization library is needed to obtain the parameter estimates.
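To make that point concrete, here is a bare-bones sketch that minimizes \(S\) numerically with scipy.optimize.minimize on simulated data. The function name neg_loglik and the starting values are my own choices; the arch library used in the next section performs the same estimation far more robustly.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulate a GARCH(1,1) sample with known parameters to recover.
n = 3000
true = dict(mu=0.05, omega=0.05, alpha=0.10, beta=0.85)
s2 = true['omega'] / (1 - true['alpha'] - true['beta'])   # start at long-run variance
r = np.empty(n)
for t in range(n):
    r[t] = true['mu'] + np.sqrt(s2) * rng.standard_normal()
    s2 = true['omega'] + true['alpha'] * r[t] ** 2 + true['beta'] * s2

def neg_loglik(params, r):
    """S = sum_t [ ln(sigma_t^2) + ((r_t - mu)/sigma_t)^2 ], constants dropped."""
    mu, omega, alpha, beta = params
    sigma2 = np.empty_like(r)
    sigma2[0] = r.var()                  # initialize the recursion with the sample variance
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    if np.any(sigma2 <= 0):              # defensive guard against invalid variances
        return np.inf
    return np.sum(np.log(sigma2) + (r - mu) ** 2 / sigma2)

x0 = [r.mean(), 0.1, 0.05, 0.80]
bounds = [(None, None), (1e-8, None), (0.0, 1.0), (0.0, 1.0)]
fit = minimize(neg_loglik, x0, args=(r,), bounds=bounds, method='L-BFGS-B')
mu_hat, omega_hat, alpha_hat, beta_hat = fit.x
```

With 3,000 simulated observations the optimizer recovers parameters in the neighborhood of the true values, which is the same mechanism the arch library automates below.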

Loading the Data

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme()
start_date = '1993-01-01'
sp500 = yf.download('^GSPC', start=start_date, auto_adjust=False, progress=False, multi_level_index=False)
returns = 100 * sp500['Close'].pct_change().dropna()

Estimating the GARCH Model

Luckily for us, there is a Python library called arch that estimates the GARCH model in just one simple step. The arch library can be installed by typing in a terminal

pip install arch --upgrade

or

conda install arch

depending on which platform you are working on. The GitHub repository of the arch library is https://github.com/bashtage/arch and documentation for the library can be found at https://bashtage.github.io/arch/.

The easiest way to specify a model is by using the model constructor arch_model. The model is then estimated using fit. Setting update_freq=0 avoids displaying all iterations and only shows the optimization results.

from arch import arch_model

am = arch_model(returns)
res = am.fit(update_freq=0)
Optimization terminated successfully    (Exit mode 0)
            Current function value: 11152.623589544473
            Iterations: 14
            Function evaluations: 84
            Gradient evaluations: 14

Once the model is estimated we can display the estimation results as follows.

print(res.summary())
                     Constant Mean - GARCH Model Results                      
==============================================================================
Dep. Variable:                  Close   R-squared:                       0.000
Mean Model:             Constant Mean   Adj. R-squared:                  0.000
Vol Model:                      GARCH   Log-Likelihood:               -11152.6
Distribution:                  Normal   AIC:                           22313.2
Method:            Maximum Likelihood   BIC:                           22341.4
                                        No. Observations:                 8345
Date:                Sun, Mar 01 2026   Df Residuals:                     8344
Time:                        11:42:48   Df Model:                            1
                                 Mean Model                                 
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
mu             0.0678  8.663e-03      7.831  4.837e-15 [5.086e-02,8.482e-02]
                              Volatility Model                              
============================================================================
                 coef    std err          t      P>|t|      95.0% Conf. Int.
----------------------------------------------------------------------------
omega          0.0202  3.857e-03      5.232  1.677e-07 [1.262e-02,2.774e-02]
alpha[1]       0.1122  1.129e-02      9.938  2.850e-23   [9.008e-02,  0.134]
beta[1]        0.8721  1.199e-02     72.743      0.000     [  0.849,  0.896]
============================================================================

Covariance estimator: robust

Model coefficients are stored in res.params and can be retrieved by their names. For example, res.params['omega'] contains the numerical estimate for \(\omega.\)

Note that the daily long-run standard deviation is given by: \[ \sigma_{L} = \sqrt{\frac{\omega}{1 - \alpha - \beta}} \] which can be annualized by multiplying by \(\sqrt{252}.\)

np.sqrt(252*res.params['omega']/(1-res.params['alpha[1]']-res.params['beta[1]'])).round(2)
np.float64(17.99)

It is interesting to see that this estimate is quite close to the unconditional annualized standard deviation of returns.

round(np.sqrt(252)*returns.std(), 2)
np.float64(18.38)

More information about the arch library and GARCH-type models can be found in the documentation linked above.

Plotting the Estimated Volatility of Returns

It is also possible to quickly plot the standardized residuals defined as: \[ \varepsilon_{t} = \frac{r_{t} - \mu}{\sigma_{t}}, \] and the annualized conditional volatility using plot.

res.plot(annualize="D")
plt.show()

The conditional volatility is stored in res.conditional_volatility. In the figure below I plot the annualized conditional volatility and compare it with the VIX.

The VIX is the CBOE Volatility Index, a measure of the market’s expectation of S&P 500 volatility over the next 30 calendar days (~21 trading days), derived from the prices of S&P 500 options and expressed as an annualized percentage. Because it is backed out from option prices, it reflects not just the market’s forecast of realized volatility but also the price investors are willing to pay to hedge against volatility — which is why it tends to exceed realized volatility on average.

vix = yf.download('^VIX', start=start_date, auto_adjust=False, progress=False, multi_level_index=False).loc[:, 'Close']

plt.plot(res.conditional_volatility*np.sqrt(252), label='GARCH Volatility')
plt.plot(vix, label='VIX')
plt.title('VIX vs. GARCH Conditional Volatility')
plt.legend()
plt.show()

It is interesting to see that the annualized conditional volatility tracks the VIX quite closely. The difference between the VIX and the expected realized volatility over the next 21 trading days is the volatility premium.

The VIX measures the market’s expectation of realized variance over the next 30 calendar days (~21 trading days), annualized. A proper comparison therefore requires the GARCH-implied expected realized variance over the same horizon, not just the spot conditional variance. For a GARCH(1,1) model with persistence \(p = \alpha + \beta\) and long-run variance \(\sigma_L^2 = \omega/(1-\alpha-\beta)\), the \(h\)-step ahead variance forecast is \[ E_t[\sigma_{t+h}^2] = \sigma_L^2 + p^{h-1}(\sigma_{t+1}^2 - \sigma_L^2), \] so the cumulated \(H\)-day expected variance is \[ \sum_{h=1}^{H} E_t[\sigma_{t+h}^2] = H \sigma_L^2 + (\sigma_{t+1}^2 - \sigma_L^2) \frac{1 - p^H}{1 - p}. \] Annualizing gives the expected volatility to compare against the VIX.

omega = res.params['omega']
alpha = res.params['alpha[1]']
beta  = res.params['beta[1]']

sigma_L2 = omega / (1 - alpha - beta)
p = alpha + beta
H = 21

cv_sq = omega + alpha * res.resid**2 + beta * res.conditional_volatility**2  # sigma_{t+1}^2, using the fitted residuals r_t - mu
geo_sum = (1 - p**H) / (1 - p)
cum_var_forecast = H * sigma_L2 + (cv_sq - sigma_L2) * geo_sum
expected_vol = np.sqrt(252 / H * cum_var_forecast)

plt.plot(vix - expected_vol)
plt.title('Volatility Premium')
plt.show()

The volatility premium, also called the variance risk premium, is the compensation investors demand for bearing volatility risk. Because investors pay above the expected cost of hedging to insure against sharp volatility spikes, the premium is typically positive. Bollerslev et al. (2009) show that it predicts stock returns at quarterly horizons, though the result is sensitive to how expected realized variance is measured — it holds when using high-frequency realized variance but not with cruder proxies such as GARCH conditional variance. Drechsler and Yaron (2011) provide a theoretical foundation within a long-run risks framework.

Drechsler, Itamar, and Amir Yaron. 2011. “What’s Vol Got to Do with It.” Review of Financial Studies 24 (1): 1–45.

Predicting Returns with the Volatility Premium

Bollerslev et al. (2009) find that the variance risk premium predicts future stock returns. Inspired by their approach, we measure the premium once per month at month-end and run predictive regressions at horizons \(h = 1, 3, 6\) months: \[ r_{t \to t+h} = a + b \cdot \text{VP}_t + \varepsilon_{t+h}. \] We consider two definitions of \(\text{VP}_t\): the volatility premium \(\text{VIX}_t - \hat{\sigma}_t\) in annualized percentage points, and the variance premium \(\text{VIX}_t^2 - \hat{\sigma}_t^2\) in annualized percentage points squared. At horizons \(h > 1\) consecutive monthly observations share \(h - 1\) overlapping return months, so we use Newey-West standard errors with \(h - 1\) lags.

import statsmodels.formula.api as smf

price_monthly    = sp500['Close'].resample('ME').last()
vp_vol_monthly   = (vix - expected_vol).resample('ME').last()
vp_var_monthly   = (vix**2 - expected_vol**2).resample('ME').last()

def run_reg(premium, h):
    fwd_ret = (price_monthly.shift(-h) / price_monthly - 1) * 100
    df = pd.DataFrame({'vp': premium, 'fwd_ret': fwd_ret}).dropna()
    return smf.ols('fwd_ret ~ vp', data=df).fit(cov_type='HAC', cov_kwds={'maxlags': max(h - 1, 0)})

specs = {}
for h, lbl in [(1, '1m'), (3, '3m'), (6, '6m')]:
    specs[f'Vol ({lbl})'] = run_reg(vp_vol_monthly, h)
    specs[f'Var ({lbl})'] = run_reg(vp_var_monthly, h)

def fmt_coef(res, var):
    if var not in res.params:
        return ('', '')
    p     = res.pvalues[var]
    stars = '***' if p < 0.01 else '**' if p < 0.05 else '*' if p < 0.10 else ''
    coef  = f"{res.params[var]:.3f}{stars}"
    se    = f"({res.bse[var]:.3f})"
    return (coef, se)

coef_vars    = ['Intercept', 'vp']
index_labels = [lbl for v in coef_vars for lbl in (v, '')]

table = {}
for name, res in specs.items():
    col = [cell for v in coef_vars for cell in fmt_coef(res, v)]
    table[name] = col

df_table = pd.DataFrame(table, index=index_labels)
df_table.loc['N']  = {name: str(int(res.nobs))    for name, res in specs.items()}
df_table.loc['R²'] = {name: f"{res.rsquared:.3f}" for name, res in specs.items()}

df_table.columns = pd.MultiIndex.from_tuples(
    [(h, m) for h in ['1 month', '3 month', '6 month'] for m in ['Vol', 'Var']]
)

display(df_table)
                1 month             3 month             6 month
               Vol      Var        Vol      Var        Vol       Var
Intercept    0.376    0.533     1.356*  1.671**   3.502***  3.914***
            (0.368)  (0.345)   (0.790)  (0.737)    (1.308)   (1.253)
vp           0.129    0.002    0.320**    0.006     0.427*     0.008
            (0.095)  (0.002)   (0.160)  (0.004)    (0.238)   (0.005)
N              397      397       395      395        392       392
R²           0.014    0.015     0.029    0.040      0.024     0.033

The volatility premium carries a positive and significant coefficient at the 3- and 6-month horizons, consistent with the risk-premium interpretation: a wider gap between implied and expected realized volatility signals elevated investor fear and predicts above-average future returns as that fear unwinds. The 1-month horizon is weaker, which is expected — the premium is a slow-moving risk signal, not a short-term timing indicator.

The variance premium columns tell a different story: squaring both series does not improve, and often worsens, predictive performance. Note that the benchmark specification in Bollerslev et al. (2009) actually uses the variance premium (VIX² minus high-frequency realized variance), and it succeeds. What fails here is the GARCH proxy specifically: replacing high-frequency realized variance with GARCH conditional variance attenuates the signal, and squaring amplifies that attenuation.

Bollerslev, Tim, George Tauchen, and Hao Zhou. 2009. “Expected Stock Returns and Variance Risk Premia.” Review of Financial Studies 22 (11): 4463–92.

The 3-month volatility premium specification is our benchmark — it corresponds most closely to the quarterly horizon emphasized by Bollerslev et al. (2009) and delivers the clearest signal in our data. We report its full output below.

print(specs['Vol (3m)'].summary(slim=True))
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                fwd_ret   R-squared:                       0.029
Model:                            OLS   Adj. R-squared:                  0.026
No. Observations:                 395   F-statistic:                     3.994
Covariance Type:                  HAC   Prob (F-statistic):             0.0463
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      1.3562      0.790      1.717      0.086      -0.192       2.905
vp             0.3200      0.160      1.999      0.046       0.006       0.634
==============================================================================

Notes:
[1] Standard Errors are heteroscedasticity and autocorrelation robust (HAC) using 2 lags and without small sample correction

The relationship is easier to see in a scatter plot of the premium against the subsequent 3-month return.

h = 3
fwd_ret = (price_monthly.shift(-h) / price_monthly - 1) * 100
df_plot = pd.DataFrame({'vp': vp_vol_monthly, 'fwd_ret': fwd_ret}).dropna()

sns.regplot(x='vp', y='fwd_ret', data=df_plot, scatter_kws={'alpha': 0.4, 's': 20})
plt.xlabel('Volatility Premium (annualized pp)')
plt.ylabel('3-Month Forward Return (%)')
plt.title('Volatility Premium vs. 3-Month Forward Return')
plt.show()

The coefficient on vp is positive and statistically significant, confirming that a higher volatility premium at month-end predicts above-average S&P 500 returns over the subsequent three months. The R² is low, but this is entirely standard in return predictability regressions: equity returns are dominated by unforecastable noise, and even the best-established predictors — the dividend yield, the term spread, the consumption-wealth ratio — rarely explain more than a few percent of monthly or quarterly return variation in sample. An R² of 2–5% at a quarterly horizon is considered economically meaningful in this literature, because even a small predictable component can translate into large Sharpe ratio improvements in a portfolio context. The Newey-West correction accounts for the two overlapping return months shared by consecutive observations, so the reported standard errors are heteroskedasticity- and autocorrelation-consistent.

A final caveat: this exercise is primarily pedagogical, designed to illustrate how the estimated GARCH volatility can be combined with the VIX to construct a volatility premium. Because the GARCH model is estimated on the full sample, the parameters \(\hat\omega\), \(\hat\alpha\), and \(\hat\beta\) embed information from the entire history, including periods that would not have been available to a real-time investor at any given date. This introduces a mild look-ahead bias in the expected volatility series and, by extension, in the premium. Eliminating it properly requires re-estimating the GARCH model recursively on an expanding window — keeping only data up to each month-end before computing the forecast — which is straightforward but computationally more demanding.

Practice Problems

Problem 1 Generate a plot of the GARCH volatility for the NASDAQ Composite (^IXIC) over the same period.

Solution
start_date = '1993-01-01'
nasdaq = yf.download('^IXIC', start=start_date, auto_adjust=False, progress=False, multi_level_index=False)
returns = 100 * nasdaq['Close'].pct_change().dropna()

am = arch_model(returns)
res = am.fit(update_freq=0)

plt.plot(res.conditional_volatility*np.sqrt(252), label='GARCH Volatility')
plt.title('Nasdaq Volatility')
plt.legend()
plt.show()
Optimization terminated successfully    (Exit mode 0)
            Current function value: 13304.219737363164
            Iterations: 12
            Function evaluations: 72
            Gradient evaluations: 12