Module 2
Working with Financial Data
Overview
This module develops the core Python skills for working with stock price data in an investment context. Starting from the definition of a net return, we build up to monthly return series, dividend extraction, distributional analysis, and ultimately to the estimation of systematic risk through the CAPM.
A central theme throughout is the structure of financial data as provided by Yahoo Finance: how adjusted prices incorporate dividends and splits, how resampling to monthly frequency requires care about timing conventions, and how to align multiple data series for cross-sectional analysis. The module closes with an introduction to OLS regression using statsmodels, applied to estimating stock betas for Microsoft and Tesla against the S&P 500.
Topics
Computing Returns
- The definition of net returns in terms of adjusted and unadjusted prices and dividends
- Computing daily returns from adjusted close prices using pct_change()
- Aggregating to monthly frequency with resample('ME').last() and the convention differences between Yahoo Finance and academic research
- Constructing lagged return variables with shift() to examine return autocorrelation
- Visualizing return distributions with time-series plots, histograms, and kernel density estimates
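
The return calculations listed above can be sketched on synthetic data. This is a minimal illustration, not the module's own notebook: the price series is simulated rather than downloaded, and the variable names (adj_close, daily_ret, monthly_ret) are hypothetical. The sketch passes pd.offsets.MonthEnd() to resample() instead of the 'ME' string, on the assumption that the offset object behaves the same way while also working on pandas releases that predate the 'ME' alias.

```python
import numpy as np
import pandas as pd

# Synthetic adjusted close prices on business days (simulated, not market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", "2023-12-29")
adj_close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0.0005, 0.01, len(dates)))),
    index=dates,
    name="adj_close",
)

# Daily net return: P_t / P_{t-1} - 1.
daily_ret = adj_close.pct_change().dropna()

# Last price of each month, labeled with the calendar month-end date,
# followed by month-over-month returns.
monthly_price = adj_close.resample(pd.offsets.MonthEnd()).last()
monthly_ret = monthly_price.pct_change().dropna()

# One-month lag of the return series and its first-order autocorrelation.
lagged_ret = monthly_ret.shift(1)
autocorr = monthly_ret.corr(lagged_ret)

print(monthly_ret.round(4))
print("First-order autocorrelation:", round(autocorr, 3))
```

Note that the resampled index carries calendar month-end labels even when the last trading day falls earlier in the month, which is one place where timing conventions can differ between data sources.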
Computing Dividends
- The theoretical relationship between adjusted and unadjusted close prices and dividend payments
- Deriving implied dividend payments from that price relationship in Python
- Verifying the derived dividends against dividend data reported directly by Yahoo Finance
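
As a hedged illustration of the dividend derivation, the sketch below assumes that the return on the adjusted close equals the total return, (P_t + D_t) / P_{t-1} - 1, so that the implied dividend is D_t = P_{t-1} (1 + r_t^adj) - P_t. The prices and the single dividend are made up, and all variable names are hypothetical. Yahoo Finance's actual adjustment procedure may differ in detail, so with real data the derived values may match the reported dividends only approximately.

```python
import numpy as np
import pandas as pd

# Made-up unadjusted closes with one dividend of 0.75 on the third day.
dates = pd.bdate_range("2024-01-01", periods=6)
close = pd.Series([100.0, 101.0, 99.5, 100.5, 102.0, 101.0], index=dates)
true_div = pd.Series([0.0, 0.0, 0.75, 0.0, 0.0, 0.0], index=dates)

# Assumed total return: r_t = (P_t + D_t) / P_{t-1} - 1.
total_ret = (close + true_div) / close.shift(1) - 1

# Build an adjusted series consistent with that assumption, anchored so the
# latest adjusted close equals the latest unadjusted close.
cum_growth = (1 + total_ret).fillna(1.0).cumprod()
adj_close = close.iloc[-1] * cum_growth / cum_growth.iloc[-1]

# Recover the dividends from the two price series alone.
adj_ret = adj_close.pct_change()
implied_div = (close.shift(1) * (1 + adj_ret) - close).fillna(0.0).round(6)

# Compare derived and "reported" dividends side by side.
check = pd.DataFrame({"implied": implied_div, "reported": true_div})
print(check)
```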
Descriptive Statistics
- Downloading and aligning return data for multiple stocks simultaneously
- Summarizing the return distribution with describe(): mean, standard deviation, and quantiles
- Comparing return distributions across securities with histograms
- Measuring co-movement between stocks using correlation matrices
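
The alignment and summary steps can be sketched as follows. The two return series are simulated, the column names STOCK_A and STOCK_B are placeholders rather than real tickers, and the second series deliberately starts six months later to show why alignment matters before computing statistics.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Two simulated monthly return series with different start dates.
idx_a = pd.date_range("2020-01-31", periods=36, freq=pd.offsets.MonthEnd())
idx_b = idx_a[6:]
common = rng.normal(0.01, 0.04, len(idx_a))  # shared component

ret_a = pd.Series(common + rng.normal(0, 0.02, len(idx_a)),
                  index=idx_a, name="STOCK_A")
ret_b = pd.Series(1.5 * common[6:] + rng.normal(0, 0.05, len(idx_b)),
                  index=idx_b, name="STOCK_B")

# Keep only the dates on which both series have an observation.
returns = pd.concat([ret_a, ret_b], axis=1, join="inner")

# Count, mean, standard deviation, min, quartiles, and max per column.
summary = returns.describe()

# Pairwise correlation matrix of the aligned returns.
corr = returns.corr()

print(summary.round(4))
print(corr.round(3))
```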
Beta Estimation
- Using SPY (the S&P 500 ETF) as a proxy for the market portfolio
- Constructing excess returns by subtracting the 13-week Treasury Bill yield (^IRX) as the risk-free rate
- Estimating the CAPM regression for MSFT, AAPL, and TSLA using statsmodels.formula.api
- Interpreting the estimated alpha and beta coefficients
- Plotting the Security Characteristic Line (SCL) to visualize the linear relationship between a stock’s excess return and the market’s excess return
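
A minimal sketch of the CAPM regression with statsmodels.formula.api is shown below, using simulated monthly data in place of downloaded prices. Several pieces are assumptions made for illustration: the yield series stands in for ^IRX and is treated as an annualized percentage converted to a monthly rate by dividing by 100 and then by 12, the stock is generated with a known beta of 1.2 so the estimate can be sanity-checked, and the column names stock_excess and mkt_excess are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 120
idx = pd.date_range("2014-01-31", periods=n, freq=pd.offsets.MonthEnd())

# Hypothetical annualized T-bill yield in percent, converted to a monthly
# rate (assumed simple conversion: percent -> decimal -> divide by 12).
tbill_yield = pd.Series(rng.uniform(0.5, 3.0, n), index=idx)
rf = tbill_yield / 100 / 12

# Simulated market return and a stock generated with beta = 1.2, alpha = 0.
mkt = pd.Series(rng.normal(0.008, 0.04, n), index=idx)
true_beta = 1.2
stock = rf + true_beta * (mkt - rf) + rng.normal(0, 0.03, n)

# Excess returns over the risk-free rate.
data = pd.DataFrame({"stock_excess": stock - rf, "mkt_excess": mkt - rf})

# CAPM regression: stock_excess = alpha + beta * mkt_excess + error.
fit = smf.ols("stock_excess ~ mkt_excess", data=data).fit()
alpha = fit.params["Intercept"]
beta = fit.params["mkt_excess"]

# Points on the Security Characteristic Line for each observed market return.
scl = alpha + beta * data["mkt_excess"]

print(f"alpha = {alpha:.4f}, beta = {beta:.3f}")
```

Plotting data["mkt_excess"] against data["stock_excess"] as a scatter and overlaying scl as a line would give the Security Characteristic Line; the slope of that line is the estimated beta and its intercept is the estimated alpha.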