Permutation tests for comparing two groups¶
Notebook setup¶
In [1]:
Copied!
# Install stats library
%pip install --quiet ministats
# Install stats library
%pip install --quiet ministats
[notice] A new release of pip is available: 24.3.1 -> 25.0.1 [notice] To update, run: pip install --upgrade pip Note: you may need to restart the kernel to use updated packages.
In [2]:
Copied!
# Figures setup
import matplotlib.pyplot as plt
import seaborn as sns
plt.clf() # needed otherwise `sns.set_theme` doesn't work
sns.set_theme(
style="whitegrid",
rc={'figure.figsize': (7, 2)},
)
# High-resolution figures please
%config InlineBackend.figure_format = 'retina'
def savefig(fig, filename):
fig.tight_layout()
fig.savefig(filename, dpi=300, bbox_inches="tight", pad_inches=0)
# Figures setup
import matplotlib.pyplot as plt
import seaborn as sns
plt.clf() # needed otherwise `sns.set_theme` doesn't work
sns.set_theme(
style="whitegrid",
rc={'figure.figsize': (7, 2)},
)
# High-resolution figures please
%config InlineBackend.figure_format = 'retina'
def savefig(fig, filename):
fig.tight_layout()
fig.savefig(filename, dpi=300, bbox_inches="tight", pad_inches=0)
<Figure size 640x480 with 0 Axes>
Introduction¶
Permutation test¶
TODO: import from blog post and slides
Suppose we have a obtained samples from group of students who took the smart drug treated
,
and a similar group who didn't take the smart drug controls
.
In [3]:
Copied!
# data
treated = [92.69, 117.15, 124.79, 100.57, 104.27, 121.56, 104.18,
122.43, 98.85, 104.26, 118.56, 138.98, 101.33, 118.57,
123.37, 105.9, 121.75, 123.26, 118.58, 80.03, 121.15,
122.06, 112.31, 108.67, 75.44, 110.27, 115.25, 125.57,
114.57, 98.09, 91.15, 112.52, 100.12, 115.2, 95.32,
121.37, 100.09, 113.8, 101.73, 124.9, 87.83, 106.22,
99.97, 107.51, 83.99, 98.03, 71.91, 109.99, 90.83, 105.48]
controls = [85.1, 84.05, 90.43, 115.92, 97.64, 116.41, 68.88, 110.51,
125.12, 94.04, 134.86, 85.0, 91.61, 69.95, 94.51, 81.16,
130.61, 108.93, 123.38, 127.69, 83.36, 76.97, 124.87, 86.36,
105.71, 93.01, 101.58, 93.58, 106.51, 91.67, 112.93, 88.74,
114.05, 80.32, 92.91, 85.34, 104.01, 91.47, 109.2, 104.04,
86.1, 91.52, 98.5, 94.62, 101.27, 107.41, 100.68, 114.94,
88.8, 121.8]
# data
treated = [92.69, 117.15, 124.79, 100.57, 104.27, 121.56, 104.18,
122.43, 98.85, 104.26, 118.56, 138.98, 101.33, 118.57,
123.37, 105.9, 121.75, 123.26, 118.58, 80.03, 121.15,
122.06, 112.31, 108.67, 75.44, 110.27, 115.25, 125.57,
114.57, 98.09, 91.15, 112.52, 100.12, 115.2, 95.32,
121.37, 100.09, 113.8, 101.73, 124.9, 87.83, 106.22,
99.97, 107.51, 83.99, 98.03, 71.91, 109.99, 90.83, 105.48]
controls = [85.1, 84.05, 90.43, 115.92, 97.64, 116.41, 68.88, 110.51,
125.12, 94.04, 134.86, 85.0, 91.61, 69.95, 94.51, 81.16,
130.61, 108.93, 123.38, 127.69, 83.36, 76.97, 124.87, 86.36,
105.71, 93.01, 101.58, 93.58, 106.51, 91.67, 112.93, 88.74,
114.05, 80.32, 92.91, 85.34, 104.01, 91.47, 109.2, 104.04,
86.1, 91.52, 98.5, 94.62, 101.27, 107.41, 100.68, 114.94,
88.8, 121.8]
To compare the two groups, we'll subtract the average score computed from each group.
In [4]:
Copied!
def mean(sample):
return sum(sample) / len(sample)
def dmeans(xsample, ysample):
dhat = mean(xsample) - mean(ysample)
return dhat
# Calculate the observed difference between means
dscore = dmeans(treated, controls)
dscore
def mean(sample):
return sum(sample) / len(sample)
def dmeans(xsample, ysample):
dhat = mean(xsample) - mean(ysample)
return dhat
# Calculate the observed difference between means
dscore = dmeans(treated, controls)
dscore
Out[4]:
7.8870000000000005
Statistical question?¶
Are the two groups the same? This is equivalent to saying the smart drug had no effect.
Disproving the skeptical colleague¶
We'll now use the 10000
permutations of the original data
to obtain sampling distribution of the difference between means under the null hypothesis.
In [6]:
Copied!
import numpy as np
np.random.seed(43)
pdhats = []
for i in range(0, 10000):
all_iqs = np.concatenate((treated, controls))
pall_iqs = np.random.permutation(all_iqs)
ptreated = pall_iqs[0:len(treated)]
pcontrols = pall_iqs[len(treated):]
pdhat = dmeans(ptreated, pcontrols)
pdhats.append(pdhat)
import numpy as np
np.random.seed(43)
pdhats = []
for i in range(0, 10000):
all_iqs = np.concatenate((treated, controls))
pall_iqs = np.random.permutation(all_iqs)
ptreated = pall_iqs[0:len(treated)]
pcontrols = pall_iqs[len(treated):]
pdhat = dmeans(ptreated, pcontrols)
pdhats.append(pdhat)
Compute the p-value of the observed difference between means dprice
under the null hypothesis.
In [7]:
Copied!
tails = [d for d in pdhats if abs(d) > dscore]
pvalue = len(tails) / len(pdhats)
pvalue
tails = [d for d in pdhats if abs(d) > dscore]
pvalue = len(tails) / len(pdhats)
pvalue
Out[7]:
0.0101
In [8]:
Copied!
bins = np.arange(-11, 11, 0.3)
# plot the sampling distribution in blue
ax = sns.histplot(pdhats, bins=bins)
# plot red line for the observed statistic
plt.axvline(dscore, color="red")
# plot the values that are equal or more extreme in red
sns.histplot(tails, ax=ax, bins=bins, color="red")
ax.set_xlabel("$d$")
ax.set_ylabel("$f_{D}$")
savefig(plt.gcf(), "figures/pvalue_viz_permutation_test_iqs.png")
bins = np.arange(-11, 11, 0.3)
# plot the sampling distribution in blue
ax = sns.histplot(pdhats, bins=bins)
# plot red line for the observed statistic
plt.axvline(dscore, color="red")
# plot the values that are equal or more extreme in red
sns.histplot(tails, ax=ax, bins=bins, color="red")
ax.set_xlabel("$d$")
ax.set_ylabel("$f_{D}$")
savefig(plt.gcf(), "figures/pvalue_viz_permutation_test_iqs.png")
Alternative using formula¶
In [9]:
Copied!
from ministats import ttest_dmeans
ttest_dmeans(treated, controls)
from ministats import ttest_dmeans
ttest_dmeans(treated, controls)
Out[9]:
0.010163611652137501
Conclusion¶
Links¶
In [ ]:
Copied!