Diversification and the Investment Opportunity Set

The Investment Opportunity Set

We will now analyze the different portfolios obtained by combining two risky assets \(A\) and \(B\). Let’s denote by \(\mu_{i}\) and \(\sigma_{i}\) the expected return and volatility of asset \(i = A, B\). Furthermore, denote by \(\sigma_{A,B}\) the covariance between \(A\) and \(B\).

Consider now a portfolio \(P\) that invests a fraction \(w_{A}\) of wealth in \(A\) and \(w_{B}\) in \(B\), with \(w_{A} + w_{B} = 1\). Then: \[ \begin{align*} \mu_{P} & = w_{A} \mu_{A} + w_{B} \mu_{B}, \\ \sigma_{P}^{2} & = w_{A}^{2} \sigma_{A}^{2} + w_{B}^{2} \sigma_{B}^{2} + 2 w_{A} w_{B} \sigma_{A, B}. \end{align*} \] The expected return of portfolio \(P\) is just the weighted average of the expected returns of \(A\) and \(B\). The variance of the portfolio, though, also depends on the covariance between the two assets. When both weights are positive and the correlation is below one, the portfolio's standard deviation is smaller than the weighted average of the individual standard deviations. This risk reduction is what we call diversification.
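As a quick numerical illustration of the two formulas, the sketch below evaluates them for a 50/50 portfolio. The function name two_asset_portfolio and all input values are hypothetical, chosen only for illustration; they are not estimates from data.

```python
import numpy as np

def two_asset_portfolio(w_a, mu_a, mu_b, sigma_a, sigma_b, rho):
    """Return (expected return, standard deviation) of a two-asset portfolio."""
    w_b = 1 - w_a                      # weights must sum to one
    cov_ab = rho * sigma_a * sigma_b   # covariance implied by the correlation
    mu_p = w_a * mu_a + w_b * mu_b
    var_p = w_a**2 * sigma_a**2 + w_b**2 * sigma_b**2 + 2 * w_a * w_b * cov_ab
    return mu_p, np.sqrt(var_p)

# Hypothetical inputs (illustration only, not estimated from data).
mu_p, sigma_p = two_asset_portfolio(w_a=0.5, mu_a=0.10, mu_b=0.05,
                                    sigma_a=0.20, sigma_b=0.10, rho=0.3)
weighted_avg_sigma = 0.5 * 0.20 + 0.5 * 0.10

print(f"Expected return:        {mu_p:.4f}")
print(f"Standard deviation:     {sigma_p:.4f}")
print(f"Weighted-average sigma: {weighted_avg_sigma:.4f}")
```

The expected return equals the weighted average of the two expected returns, while the portfolio standard deviation comes out below the weighted average of the two standard deviations because the correlation is less than one.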

Getting the Data

import yfinance as yf
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns

sns.set_theme()

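We download adjusted closing prices for AAPL and WMT and compute monthly returns from January 2005 through December 2024. The download starts in December 2004 so that the first monthly return, for January 2005, can be computed.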

start_date = '2004-12-01'
end_date = '2025-01-01'

df = (yf
      .download(['AAPL', 'WMT'], start=start_date, end=end_date, auto_adjust=False, progress=False)['Adj Close']
      .resample('ME')
      .last()
      .pct_change()
      .dropna()
      )
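To see why the price history has to start one month before the first return, here is a minimal offline sketch of a month-end return calculation. The prices are synthetic (a seeded random walk, not market data), and the months are grouped with to_period('M') instead of resample('ME'). That substitution is an assumption made only so the sketch does not rely on the 'ME' alias, which older pandas releases do not recognize.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices on business days (synthetic, not market data).
dates = pd.bdate_range("2004-12-01", "2005-03-31")
rng = np.random.default_rng(42)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
                   index=dates)

# Keep the last available price of each calendar month.
month_end = prices.groupby(prices.index.to_period("M")).last()

# Month-over-month percentage change; the first month has no prior price
# to compare against, so it is dropped.
monthly_returns = month_end.pct_change().dropna()

print(month_end)
print(monthly_returns)
```

Four month-end prices (December through March) yield three monthly returns, the first of which belongs to January 2005.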

Estimating Expected Returns

We estimate the expected return of each stock as the annualized average of its monthly returns. If \(\bar{r}\) denotes the sample mean of monthly returns, the annualized expected return is simply \[ \mu = 12 \times \bar{r}. \]

mu_A = df['AAPL'].mean() * 12
mu_B = df['WMT'].mean() * 12

We can store the expected returns of both stocks, expressed in percent, in a pandas DataFrame.

mu_table = 100 * pd.DataFrame(data={'Expected Return (%)': [mu_A, mu_B]}, index=['A', 'B']).round(4)
mu_table.index.name = 'Asset'
display(mu_table)
       Expected Return (%)
Asset
A                    33.00
B                    11.76

Estimating Standard Deviations and Covariance

Suppose we approximate the annual return by the sum of 12 monthly returns \(r_{1}, r_{2}, \ldots, r_{12}\). If these returns are independent and share the same variance \(\sigma^{2}\), the variance of the sum is the sum of the individual variances: \[ \mathsf{V}\left( r_{1} + r_{2} + \cdots + r_{12} \right) = \mathsf{V}(r_{1}) + \mathsf{V}(r_{2}) + \cdots + \mathsf{V}(r_{12}) = 12 \sigma^{2}. \]

Taking square roots, the annualized standard deviation of returns follows directly from the standard deviation of monthly returns: \[ \sigma_{\text{annualized}} = \sqrt{12} \times \sigma_{\text{monthly}}. \]

sigma_A = df['AAPL'].std() * np.sqrt(12)
sigma_B = df['WMT'].std() * np.sqrt(12)
sigma_table = 100 * pd.DataFrame(data={'Standard Deviation (%)': [sigma_A, sigma_B]}, index=['A', 'B']).round(4)
sigma_table.index.name = 'Asset'
display(sigma_table)
       Standard Deviation (%)
Asset
A                       31.21
B                       17.36
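The square-root-of-12 rule can be checked by simulation. The sketch below draws independent monthly returns from a normal distribution with hypothetical parameters (an assumption for illustration, not a model fitted to AAPL or WMT), sums them within each simulated year, and compares the ratio of annual to monthly standard deviation with \(\sqrt{12}\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed monthly return distribution (hypothetical parameters).
monthly_mean, monthly_sigma = 0.01, 0.05

# 100,000 simulated years, each made of 12 independent monthly returns.
monthly = rng.normal(monthly_mean, monthly_sigma, size=(100_000, 12))
annual = monthly.sum(axis=1)  # annual return approximated by the sum

ratio = annual.std(ddof=1) / monthly.std(ddof=1)
print(f"Simulated ratio: {ratio:.3f}")
print(f"sqrt(12):        {np.sqrt(12):.3f}")
```

The rule relies on independence across months. With serially correlated returns the simulated ratio would drift away from \(\sqrt{12}\).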

We could estimate the covariance directly from the data using .cov(), but correlations are easier to interpret because they always lie between \(-1\) and \(1\). We therefore display the correlation matrix and then recover the annualized covariance using \[ \sigma_{A,B} = \sigma_{A} \sigma_{B} \rho_{A,B}, \] where \(\rho_{A,B}\) denotes the correlation between \(A\) and \(B\). Since the standard deviations are already annualized and the correlation coefficient is dimensionless, the resulting covariance is automatically annualized.

corr_matrix = df.corr()
display(corr_matrix)
rho_AB = corr_matrix.loc['AAPL', 'WMT']
sigma_AB = sigma_A * sigma_B * rho_AB
Ticker      AAPL       WMT
Ticker
AAPL    1.000000  0.154267
WMT     0.154267  1.000000
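The two routes to the annualized covariance, scaling the monthly .cov() estimate by 12 or multiplying annualized standard deviations by the correlation, should agree. The sketch below checks this on synthetic monthly returns for two hypothetical assets X and Y; the data-generating process is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic monthly returns for two hypothetical assets sharing a common shock.
common = rng.normal(0, 0.03, 240)
returns = pd.DataFrame({
    "X": 0.010 + common + rng.normal(0, 0.05, 240),
    "Y": 0.005 + common + rng.normal(0, 0.04, 240),
})

# Route 1: annualize the monthly covariance directly.
cov_direct = returns.cov().loc["X", "Y"] * 12

# Route 2: annualized standard deviations times the correlation.
sigma_x = returns["X"].std() * np.sqrt(12)
sigma_y = returns["Y"].std() * np.sqrt(12)
cov_from_corr = sigma_x * sigma_y * returns.corr().loc["X", "Y"]

print(f"Direct:           {cov_direct:.6f}")
print(f"From correlation: {cov_from_corr:.6f}")
```

Both numbers match up to floating-point error, since each factor of \(\sqrt{12}\) in the standard deviations contributes to the factor of 12 applied to the monthly covariance.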

Plotting the Investment Opportunity Set

To trace out the investment opportunity set, we create a NumPy array w_A of portfolio weights for AAPL ranging from -100% to 200%, with the corresponding weights for WMT given by w_B = 1 - w_A. This range allows short sales of both assets: AAPL is sold short when w_A < 0, and WMT is sold short when w_A > 1. The expected returns and standard deviations of the resulting portfolios are computed using the expressions defined above.

w_A = np.linspace(-1, 2, 50)
w_B = 1 - w_A
mu_P = w_A * mu_A + w_B * mu_B
sigma_P = np.sqrt(w_A**2 * sigma_A**2 + w_B**2 * sigma_B**2 + 2 * w_A * w_B * sigma_AB)

We can now plot the resulting investment opportunity set.

plt.plot(sigma_P, mu_P)
plt.scatter([sigma_A, sigma_B], [mu_A, mu_B])
plt.xlabel('Standard Deviation')
plt.ylabel('Expected Return')
plt.title('Investment Opportunity Set')
plt.annotate('AAPL', [sigma_A, mu_A], [sigma_A + 0.04, mu_A - 0.005])
plt.annotate('WMT', [sigma_B, mu_B], [sigma_B + 0.04, mu_B - 0.005])
plt.show()

The upper part of the investment opportunity set, from the minimum-variance portfolio upward, is called the efficient frontier. Rational, risk-averse investors choose portfolios on it, because for every portfolio on the lower part there is one on the upper part with the same standard deviation and a higher expected return.
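One way to separate the efficient part of the curve from the rest is to locate the minimum-variance point on the weight grid and keep every portfolio whose expected return is at least as high. The sketch below does this using the rounded estimates displayed above as inputs. The finer grid of 301 points is an illustrative choice, and a grid only approximates the exact minimum-variance portfolio.

```python
import numpy as np

# Rounded estimates taken from the tables above.
mu_A, mu_B = 0.3300, 0.1176
sigma_A, sigma_B = 0.3121, 0.1736
rho_AB = 0.154267
sigma_AB = sigma_A * sigma_B * rho_AB

# Weight grid from -100% to 200% in AAPL, in steps of one percentage point.
w_A = np.linspace(-1, 2, 301)
w_B = 1 - w_A
mu_P = w_A * mu_A + w_B * mu_B
sigma_P = np.sqrt(w_A**2 * sigma_A**2 + w_B**2 * sigma_B**2
                  + 2 * w_A * w_B * sigma_AB)

# Grid point with the lowest standard deviation.
i_min = np.argmin(sigma_P)

# Efficient portfolios: expected return no lower than at the minimum.
efficient = mu_P >= mu_P[i_min]

print(f"Grid minimum-variance weight in AAPL: {w_A[i_min]:.2f}")
print(f"Minimum standard deviation on grid:   {sigma_P[i_min]:.4f}")
print(f"Efficient portfolios on grid:         {efficient.sum()} of {len(w_A)}")
```

The boolean mask can be used to draw the two parts of the curve in different styles, for example plt.plot(sigma_P[efficient], mu_P[efficient]) for the efficient frontier.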

Practice Problems

Problem 1. A pension fund manager is considering three mutual funds. The first is a stock fund, the second is a long-term government and corporate bond fund, and the third is a T-bill money market fund that yields a rate of 8%. The expected returns and standard deviations of the risky funds are as follows:

Fund             Expected Return   Standard Deviation
Stock fund (S)   20%               30%
Bond fund (B)    12%               15%

The correlation between the fund returns is 0.10. Tabulate the investment opportunity set of the two risky funds in a table like the one shown below.

Portfolio   Proportion in Stock Fund   Proportion in Bond Fund   Expected Return   Standard Deviation
A           0%                         100%
B           20%                        80%
C           40%                        60%
D           60%                        40%
E           80%                        20%
F           100%                       0%

Which portfolio achieves the minimum standard deviation?

Solution
mu_stock = 0.20
mu_bond = 0.12

sigma_stock = 0.30
sigma_bond = 0.15

correlation = 0.10
covariance = sigma_stock*sigma_bond*correlation

w_stock = np.linspace(0, 1, 6)
w_bond = 1 - w_stock
mu_portfolio = w_stock * mu_stock + w_bond * mu_bond
sigma_portfolio = np.sqrt(w_stock**2 * sigma_stock**2 + w_bond**2 * sigma_bond**2 + 2 * w_stock * w_bond * covariance)

my_dict = {'Proportion in Stock Fund (%)': 100*w_stock,
           'Proportion in Bond Fund (%)': 100*w_bond,
           'Expected Return (%)': 100*mu_portfolio.round(4),
           'Standard Deviation (%)': 100*sigma_portfolio.round(4)}
index = ['A', 'B', 'C', 'D', 'E', 'F']
result = pd.DataFrame(data=my_dict, index=index)
result.index.name = 'Portfolio'
result
           Proportion in Stock Fund (%)  Proportion in Bond Fund (%)  Expected Return (%)  Standard Deviation (%)
Portfolio
A                                   0.0                        100.0                 12.0                   15.00
B                                  20.0                         80.0                 13.6                   13.94
C                                  40.0                         60.0                 15.2                   15.70
D                                  60.0                         40.0                 16.8                   19.53
E                                  80.0                         20.0                 18.4                   24.48
F                                 100.0                          0.0                 20.0                   30.00
We can see from the table that, among the six tabulated portfolios, portfolio B achieves the minimum standard deviation (13.94%).
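Because the table compares only six weights, the exact minimum-variance portfolio may lie between two rows. As a sketch, setting the derivative of the portfolio variance with respect to the stock weight to zero gives \[ w_{S}^{*} = \frac{\sigma_{B}^{2} - \sigma_{S,B}}{\sigma_{S}^{2} + \sigma_{B}^{2} - 2 \sigma_{S,B}}, \] where \(\sigma_{S,B}\) is the covariance between the stock and bond funds. The code below evaluates this expression with the inputs of the problem; the names w_min, mu_min and sigma_min are illustrative.

```python
import numpy as np

mu_stock, mu_bond = 0.20, 0.12
sigma_stock, sigma_bond = 0.30, 0.15
covariance = sigma_stock * sigma_bond * 0.10

# Closed-form minimum-variance weight in the stock fund.
w_min = (sigma_bond**2 - covariance) / (
    sigma_stock**2 + sigma_bond**2 - 2 * covariance)

# Expected return and standard deviation at that weight.
mu_min = w_min * mu_stock + (1 - w_min) * mu_bond
sigma_min = np.sqrt(w_min**2 * sigma_stock**2
                    + (1 - w_min)**2 * sigma_bond**2
                    + 2 * w_min * (1 - w_min) * covariance)

print(f"Stock weight:       {100 * w_min:.2f}%")
print(f"Expected return:    {100 * mu_min:.2f}%")
print(f"Standard deviation: {100 * sigma_min:.2f}%")
```

The exact minimum puts roughly 17% in the stock fund, close to the 20% of portfolio B, with a standard deviation slightly below the 13.94% reported for B.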