Appendix E — Seaborn tutorial#

In this tutorial, we’ll learn about Seaborn data visualizations. We’ll discuss the Seaborn plot functions, then describe the various options for customizing the appearance of plots, adding annotations, and exporting plots as publication-quality images.

If you want to pursue a career in a data-related field, I highly recommend you get to know Seaborn by reading this tutorial and the other resources in the links section.

Seaborn overview#

The Seaborn library is a powerful toolbox for generating statistical data visualizations. Seaborn makes it very easy to visualize data stored in Pandas data frames: you can generate standard statistical plots like bar plots, strip plots, and scatter plots using a single line of code. The combination of the JupyterLab computational environment and the Python libraries Pandas and Seaborn is a best-in-class toolset for doing statistics in Python.

Seaborn includes numerous plot functions like stripplot, scatterplot, histplot, boxplot, barplot, and countplot. In the sections that follow, we’ll show some examples of these functions.

Basic plots#

Line plot#

import seaborn as sns
import pandas as pd
days = [1, 2, 3, 4]
cakes = [2, 5, 3, 4]
sns.lineplot(x=days, y=cakes);
[Figure: line plot of cakes versus days]
# # (optional) use Matplotlib axis methods to add labels
# ax = sns.lineplot(x=days, y=cakes)
# ax.set_xlabel("days")
# ax.set_ylabel("cakes")
df = pd.DataFrame({"days":days, "cakes":cakes})
df
   days  cakes
0     1      2
1     2      5
2     3      3
3     4      4
df.columns
Index(['days', 'cakes'], dtype='object')
sns.lineplot(x="days", y="cakes", data=df);
[Figure: line plot of cakes versus days, drawn from the data frame df]
# # ALT. hybrid approach
# sns.lineplot(x=df["days"], y=df["cakes"])

Plotting function graphs#

def g(x):
    return 0.5 * x**2
import numpy as np
xs = np.linspace(0, 10, 1000)
gxs = g(xs)
sns.lineplot(x=xs, y=gxs, label="Graph of g(x)");
[Figure: graph of g(x) = 0.5 x**2 for x from 0 to 10]
# # FIGURES ONLY
# from ministats.utils import savefigure
# ax = sns.lineplot(x=xs, y=gxs, label="Graph of g(x)");
# filename = "figures/tutorials/seaborn/graph_of_function_g_eq_halfx2.pdf"
# savefigure(ax, filename)

Distribution plots#

Strip plots#

Scatter plots#

Density plots#

Histograms#

Box plots#

Violin plots#

Categorical plots#

Bar plots#

Linear model plots#

Linear model plots using seaborn#

Linear model plots from scratch#

Linear model plots using statsmodels#

Other plots#

Stem plot for discrete random variables#

Customizing plots#

Bonus topics#

Data visualization tips#