Numerical Optimization in Python

Why Numerical Optimization?

Numerical optimization is one of the most widely used tools in quantitative disciplines. The central idea is simple: given an objective function, find the input values that make it as small (or as large) as possible.

Minimization is everywhere. Every time you fit a statistical model or run a machine learning algorithm, something is being minimized under the hood. Linear regression finds coefficients that minimize the sum of squared residuals; neural networks minimize a loss function by adjusting parameters via gradient descent.

In finance, minimization appears whenever we seek the portfolio that achieves the lowest risk for a given level of expected return. We will use it in the next notebook to maximize the Sharpe ratio — which, as we will see, is itself just a minimization problem in disguise.

When the objective is simple enough, calculus gives a closed-form solution: set the derivative to zero and solve. For more complex objectives we turn to a numerical optimizer that searches for the solution iteratively. Python’s scipy library provides one via scipy.optimize.minimize.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.optimize import minimize

sns.set_theme()

A Simple Minimization Example

Consider the function \(g(x) = x^{2} - 4x + 6.\)

def g(x):
    return x**2 - 4*x + 6

x = np.linspace(-1, 5, 100)
plt.plot(x, g(x))
plt.xlabel('x')
plt.ylabel('g(x)')
plt.title('Minimization Example')
plt.show()

The plot suggests the function has a minimum near \(x = 2.\) We can confirm this analytically: \(g'(x) = 2x - 4 = 0\) gives \(x = 2.\) We now verify numerically.

minimize takes an objective function and an initial guess \(x_0\). The optimizer starts there and searches for a nearby minimum.

x0 = 0
res = minimize(g, x0)
res.x
array([2.00000002])

The minimizer is stored in res.x. The optimizer recovers \(x = 2\) to within numerical tolerance, confirming the analytical result.
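Beyond res.x, the returned OptimizeResult carries diagnostics worth checking before trusting a solution. A minimal sketch, reusing the same \(g\):

```python
from scipy.optimize import minimize

def g(x):
    return x**2 - 4*x + 6

res = minimize(g, x0=0)
print(res.success)  # whether the optimizer reports convergence
print(res.fun)      # objective value at the solution, here g(2) = 2
print(res.nit)      # number of iterations taken
```

Checking res.success is cheap insurance: minimize returns a result object even when the search fails to converge.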

Maximization via Negation

scipy.optimize.minimize only minimizes — there is no maximize function. This is not a limitation in practice: maximizing \(f(x)\) is identical to minimizing \(-f(x)\). The two problems share the same solution; only the sign of the objective value differs.

Consider \(f(x) = -4 x^{2} + 4 x + 10.\)

def f(x):
    return -4*x**2 + 4*x + 10

x = np.linspace(-2, 3, 50)
plt.plot(x, f(x))
plt.xlabel('x')
plt.ylabel('f(x)')
plt.title('Maximization Example')
plt.show()

We negate \(f\) inline as a lambda and minimize from \(x_0 = 0.\)

x0 = 0
res = minimize(lambda x: -f(x), x0)
res.x
array([0.5])

The function achieves its maximum at \(x = 0.5.\) We will rely on exactly this trick in the next notebook to find the portfolio that maximizes the Sharpe ratio.
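One detail to keep in mind with this trick: res.fun holds the minimum of the negated objective, so the maximum of \(f\) itself is -res.fun. A short sketch with the same \(f\):

```python
from scipy.optimize import minimize

def f(x):
    return -4*x**2 + 4*x + 10

res = minimize(lambda x: -f(x), x0=0)
max_value = -res.fun  # undo the negation: f(0.5) = 11
```

Forgetting to flip the sign back is an easy mistake when reporting the optimal objective value.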

Estimating Beta by Minimizing the Sum of Squares

The examples above are purely mathematical. We now apply the same idea to a real financial problem: estimating the market beta of AAPL.

In the Capital Asset Pricing Model (CAPM), the return of a stock is modeled as a linear function of the market return, \[ r_{\text{AAPL},t} = \alpha + \beta \, r_{\text{SPY},t} + \varepsilon_t, \] where \(\alpha\) is the intercept, \(\beta\) measures how much AAPL moves with the market, and \(\varepsilon_t\) is an idiosyncratic error term.

The standard way to estimate \(\alpha\) and \(\beta\) is ordinary least squares (OLS), which chooses the parameter values that minimize the sum of squared residuals, \[ \text{SSR}(\alpha, \beta) = \sum_{t=1}^{T} \left( r_{\text{AAPL},t} - \alpha - \beta \, r_{\text{SPY},t} \right)^{2}. \] This is exactly the kind of problem minimize is designed for. OLS is linear regression — the simplest possible machine learning model — and training it is a minimization problem.
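Before applying this to real data, the mechanics can be checked on synthetic returns where the true parameters are known. A sketch assuming \(\alpha = 0.01\) and \(\beta = 1.2\) (both invented for illustration):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
r_mkt = rng.normal(0.01, 0.04, 240)                      # simulated market returns
r_stk = 0.01 + 1.2 * r_mkt + rng.normal(0, 0.02, 240)    # stock returns with noise

def ssr(params):
    alpha, beta = params
    residuals = r_stk - alpha - beta * r_mkt
    return (residuals**2).sum()

res = minimize(ssr, [0, 1])
alpha_hat, beta_hat = res.x  # should land near (0.01, 1.2)
```

With 240 observations the estimates recover the true values up to sampling noise, which is a useful sanity check before trusting the same code on downloaded prices.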

Getting the Data

We download monthly returns for AAPL and SPY over a twenty-year window.

start_date = '2004-12-01'
end_date   = '2025-01-01'

ret = (yf
       .download(['AAPL', 'SPY'], start=start_date, end=end_date,
                 auto_adjust=False, progress=False)['Adj Close']
       .resample('ME')
       .last()
       .pct_change()
       .dropna()
       )

Defining the Objective

We extract the return series as NumPy arrays and define the SSR as a function of a two-element parameter vector params = [alpha, beta].

r_aapl = ret['AAPL'].values
r_spy  = ret['SPY'].values

def ssr(params):
    alpha, beta = params
    residuals = r_aapl - alpha - beta * r_spy
    return (residuals**2).sum()

Minimizing

We start from an initial guess of \(\alpha_0 = 0\) and \(\beta_0 = 1\) — a reasonable prior for a large-cap stock — and let the optimizer find the SSR-minimizing values.

x0 = [0, 1]
res = minimize(ssr, x0)
alpha_hat, beta_hat = res.x
print(f'alpha = {alpha_hat:.4f}')
print(f'beta  = {beta_hat:.4f}')
alpha = 0.0164
beta  = 1.2218

Verification

OLS has a well-known closed-form solution. For the simple linear regression of \(y\) on \(x\), the slope is \(\hat\beta = \text{Cov}(x, y) / \text{Var}(x)\); in our case, \[ \hat\beta = \frac{\text{Cov}(r_{\text{AAPL}},\, r_{\text{SPY}})}{\text{Var}(r_{\text{SPY}})}. \] We can compute this directly from the data and check that both approaches agree.

beta_formula = ret.cov().loc['AAPL', 'SPY'] / ret['SPY'].var()
print(f'beta (formula) = {beta_formula:.4f}')
print(f'beta (minimize) = {beta_hat:.4f}')
beta (formula) = 1.2218
beta (minimize) = 1.2218

Both methods produce the same estimate: minimizing the sum of squares numerically is equivalent to applying the OLS formula analytically. The numerical approach is more flexible, however — in the next notebook we will minimize objectives for which no closed-form solution exists.
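To illustrate that flexibility: replace the squared residuals with absolute residuals (least absolute deviations, a robust alternative to OLS) and the closed form disappears, yet minimize handles it almost unchanged. A sketch on synthetic heavy-tailed data, with an invented slope of 2.0; the absolute value is non-smooth, so a derivative-free method is the safer choice:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 200)
y = 0.5 + 2.0 * x + rng.standard_t(3, 200) * 0.5   # heavy-tailed noise

def sar(params):
    # sum of absolute residuals -- no closed-form minimizer exists
    a, b = params
    return np.abs(y - a - b * x).sum()

res = minimize(sar, [0, 1], method='Nelder-Mead')  # derivative-free, tolerates kinks
```

Only the objective function changed; the call to minimize is the same pattern used for SSR above.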

Practice Problems

Problem 1 For which values of \(x\) and \(y\) does the function \(\psi(x, y) = (x + 2y - 7)^{2} + (2x + y - 5)^{2}\) attain its minimum?

Solution
def psi(x):
    return (x[0] + 2*x[1] - 7)**2 + (2*x[0] + x[1] - 5)**2

x0 = [0, 0]

res = minimize(psi, x0)
res.x
array([1., 3.])
The function is minimized at \(x = 1\) and \(y = 3.\)

Problem 2 For which values of \(x\) and \(y\) does the function \(\phi(x,y) = \left( 1.5 - x + xy \right)^{2} + \left( 2.25 - x + xy^{2}\right)^{2}\) attain its minimum?

Solution
def phi(x):
    return (1.5 - x[0] + x[0]*x[1])**2 + (2.25 - x[0] + x[0]*x[1]**2)**2

x0 = [0, 0]

res = minimize(phi, x0)
res.x
array([2.99999901, 0.49999979])
The function is minimized at \(x = 3\) and \(y = 0.5.\)
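When an objective has a tractable analytic gradient, passing it via the jac argument spares the optimizer from approximating derivatives by finite differences. A sketch for Problem 2, with the gradient worked out by the chain rule:

```python
import numpy as np
from scipy.optimize import minimize

def phi(x):
    return (1.5 - x[0] + x[0]*x[1])**2 + (2.25 - x[0] + x[0]*x[1]**2)**2

def phi_grad(x):
    # chain rule: a and b are the two residual terms inside the squares
    a = 1.5 - x[0] + x[0]*x[1]
    b = 2.25 - x[0] + x[0]*x[1]**2
    return np.array([2*a*(x[1] - 1) + 2*b*(x[1]**2 - 1),
                     2*a*x[0] + 4*b*x[0]*x[1]])

res = minimize(phi, [0, 0], jac=phi_grad)
```

This typically reduces the number of objective evaluations and can improve the precision of the solution, at the cost of deriving (and debugging) the gradient by hand.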