Section 5.4 — Bayesian linear models

This notebook contains the code examples from Section 5.4, Bayesian linear models, in the No Bullshit Guide to Statistics.

See also examples in:

Notebook setup

# load Python modules
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Figures setup
plt.clf()  # needed, otherwise `sns.set_theme` doesn't work
from plot_helpers import RCPARAMS
RCPARAMS.update({"figure.figsize": (5, 3)})   # good for screen
# RCPARAMS.update({"figure.figsize": (5, 1.6)})  # good for print
sns.set_theme(
    context="paper",
    style="whitegrid",
    palette="colorblind",
    rc=RCPARAMS,
)

# High-resolution please
%config InlineBackend.figure_format = "retina"

# Where to store figures
DESTDIR = "figures/bayesian/lms"
# set random seed for repeatability
np.random.seed(42)

Simple linear regression using PyMC

import pymc as pm
import numpy as np
import arviz as az

# Simulated data
np.random.seed(42)
x = np.random.normal(0, 1, 100)
y = 3 + 2 * x + np.random.normal(0, 1, 100)

# Bayesian Linear Regression Model
with pm.Model() as model:
    # Priors
    beta0 = pm.Normal("beta0", mu=0, sigma=10)
    beta1 = pm.Normal("beta1", mu=0, sigma=10)
    sigma = pm.HalfNormal("sigma", sigma=1)
    
    # Likelihood
    mu = beta0 + beta1 * x
    y_obs = pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    
    # Sampling
    trace = pm.sample(2000, return_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta0, beta1, sigma]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 2 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics
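The sampler warns that two chains may be too few for reliable convergence diagnostics. The R-hat statistic behind that warning compares between-chain and within-chain variance: if the chains have mixed, the two agree and R-hat is close to 1. Below is a minimal NumPy sketch of the classic Gelman–Rubin version of R-hat (ArviZ uses a more robust rank-normalized split-chain variant), applied to simulated stand-in chains rather than the PyMC trace:

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Classic R-hat; `chains` has shape (n_chains, n_draws)."""
    n = chains.shape[1]
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(42)
good = rng.normal(3.0, 0.1, size=(2, 2000))   # two well-mixed chains
bad = good + np.array([[0.0], [0.5]])         # one chain shifted: poor mixing
r_good = gelman_rubin_rhat(good)              # close to 1.0
r_bad = gelman_rubin_rhat(bad)                # well above 1.0
```

In practice you would not compute this by hand: `az.summary(trace, kind="diagnostics")` reports `r_hat` along with effective sample sizes for every parameter.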

Summary using mean

# Posterior Summary
summary = az.summary(trace, kind="stats")
summary
         mean     sd  hdi_3%  hdi_97%
beta0   3.007  0.098   2.827    3.200
beta1   1.857  0.109   1.656    2.062
sigma   0.957  0.070   0.822    1.085
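The `hdi_3%` and `hdi_97%` columns bound a 94% highest density interval: the shortest interval that contains 94% of the posterior draws. A minimal NumPy sketch of that computation, using simulated stand-in draws rather than the actual trace:

```python
import numpy as np

def hdi(samples, prob=0.94):
    """Shortest interval containing `prob` of the samples."""
    sorted_s = np.sort(samples)
    n_in = int(np.floor(prob * len(sorted_s)))
    # widths of all candidate intervals that contain n_in samples
    widths = sorted_s[n_in:] - sorted_s[:len(sorted_s) - n_in]
    start = np.argmin(widths)                 # index of the shortest one
    return sorted_s[start], sorted_s[start + n_in]

rng = np.random.default_rng(42)
draws = rng.normal(3.0, 0.1, size=4000)       # stand-in for beta0 posterior draws
hdi_low, hdi_high = hdi(draws)
```

On the real trace, `az.hdi(trace, hdi_prob=0.94)` computes the same quantity per parameter.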

Summary using median as focus statistic

ETI = Equal-Tailed Interval

az.summary(trace, stat_focus="median", kind="stats")
       median    mad  eti_3%  eti_97%
beta0   3.009  0.064   2.814    3.190
beta1   1.858  0.074   1.653    2.060
sigma   0.953  0.046   0.836    1.103

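Unlike the HDI, the equal-tailed interval simply cuts 3% of the probability mass off each tail, so it reduces to two quantiles of the posterior draws. A NumPy sketch with stand-in draws (on the real trace, `az.summary` computes these columns for you):

```python
import numpy as np

rng = np.random.default_rng(42)
draws = rng.normal(3.0, 0.1, size=4000)   # stand-in for beta0 posterior draws

# 94% equal-tailed interval: 3rd and 97th percentiles
eti_low, eti_high = np.quantile(draws, [0.03, 0.97])

# median and median absolute deviation, matching the summary columns
median = np.median(draws)
mad = np.median(np.abs(draws - median))
```

For a symmetric, unimodal posterior like this one, the ETI and HDI nearly coincide; they differ noticeably for skewed posteriors such as that of `sigma`.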
# Plotting posterior
az.plot_posterior(trace, point_estimate="mean", round_to=3);
[Figure: posterior distributions of beta0, beta1, and sigma, with point estimates (means) and 94% HDIs]
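A natural next step is a posterior predictive check: push posterior draws of the parameters back through the likelihood and compare the simulated datasets to the observed data. With the model above this is `pm.sample_posterior_predictive(trace)`; the NumPy sketch below mimics the idea with stand-in posterior draws, so it runs without the PyMC objects (the draw means and spreads are assumptions loosely matching the summary table):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(0, 1, 100)
y = 3 + 2 * x + rng.normal(0, 1, 100)

# stand-in posterior draws (in practice: trace.posterior["beta0"], etc.)
n_draws = 1000
beta0_draws = rng.normal(3.0, 0.10, n_draws)
beta1_draws = rng.normal(1.9, 0.11, n_draws)
sigma_draws = np.abs(rng.normal(0.96, 0.07, n_draws))

# one simulated dataset per posterior draw
mu = beta0_draws[:, None] + beta1_draws[:, None] * x[None, :]
y_pred = rng.normal(mu, sigma_draws[:, None])    # shape (n_draws, 100)

# compare a test statistic of the replicated data to the observed one
obs_mean = y.mean()
pred_means = y_pred.mean(axis=1)
p_value = (pred_means > obs_mean).mean()         # posterior predictive p-value
```

A p-value near 0 or 1 would indicate that the model systematically misses this feature of the data; values in between are consistent with a good fit.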