Appendix C — Python tutorial#

Click the Binder button to run this notebook interactively: Binder (highly recommended for a hands-on experience, which is much better for learning).

Python as a fancy calculator#

In this tutorial I’ll show you the basics of the Python programming language. Don’t freak out, this is not a big deal. You’re not going to become a programmer or anything, you’re just learning to use Python as a calculator.

Calculator analogy#

Python commands are similar to the commands you give to a calculator. The same way a calculator has different buttons for the various arithmetic operations, the Python language has a number of commands you can “run” or “execute.” Python is more powerful than a calculator because you can input multiple lines of commands at once, and write more complicated expressions.

The same way knowing how to use a calculator is very helpful for doing arithmetic operations, learning Python is very helpful for doing arithmetic operations, and also for doing more complicated multi-step calculations.

Why learn Python?#

Staying with the Python-is-a-calculator analogy, I’d like to give you an idea of some of the things you can do when you use Python as a calculator.

  • Python is really good as a “basic” calculator, which allows you to calculate any arithmetic expression involving math operations like +, -, *, /, pow, and more generally any math function.

  • Python programs allow you to represent procedures with multiple steps.

  • Python is also very useful as a graphical calculator: you can plot functions and visualize data distributions.

  • Python is an extensible, programmable calculator, which means you can define custom functions that are useful for any given domain.

  • Python provides lots of extensions (modules) for scientific computations. You can do advanced linear algebra using the numpy module, carry out optimization using scipy, and use sympy for symbolic math calculations. Very powerful stuff.

All in all, learning Python will give you all kinds of options for doing math and science.

Python for statistics#

In this tutorial, you’ll learn all the Python functions needed for statistics. Learning a few basic Python constructs like the for loop will enable you to simulate probability distributions and experimentally verify how statistics procedures work. This is a really big deal! It’s good to know the statistical formulas and recipes, but it’s even better when you can run your own simulations and check when the formulas work and when they fail.

Once you learn the basics of Python syntax, you’ll have access to the best-in-class tools for data management (Pandas, see pandas_tutorial.ipynb), data visualization (Seaborn, see seaborn_tutorial.ipynb), statistics (scipy and statsmodels), and machine learning (scikit-learn, pytorch, huggingface, etc.).

Don’t worry, there won’t be any advanced math, just sums, products, exponents, logs, and square roots. Nothing fancy, I promise. If you’ve ever created a formula in a spreadsheet, then you’re familiar with all the operations we’ll see. Where a spreadsheet formula uses SUM(, in Python we write sum(. You see, it’s nothing fancy.

Yes, there will be a lot of code (paragraphs of Python commands) in this tutorial, but you can totally handle this. If you ever start to freak out and think “OMG this is too complicated!” remember that Python is just a fancy calculator.

Overview of the material in this tutorial#

We’ll cover all essential topics required to get to know Python, including:

  • Getting started: installing the JupyterLab Desktop coding environment.

  • Expressions and variables: basic building blocks of any program.

  • Getting comfortable with Python: looking around and getting help.

  • Lists and for loops: repeating steps and procedures.

  • Functions are reusable code blocks.

  • Other data structures: sets, tuples, etc.

    • Boolean variables and conditional statements: conditional code execution.

    • Dictionaries are a versatile way to store data.

  • Objects and classes: creating custom objects.

  • Python grammar and syntax: review of all the syntax.

  • Python libraries and modules: learn why people say Python comes with “batteries included”

After you’re done with this tutorial, you’ll be ready to read the other two: the Pandas tutorial (pandas_tutorial.ipynb) and the Seaborn tutorial (seaborn_tutorial.ipynb).

It’s important for you to try solving the exercises that you’ll encounter as you read along. The exercises are a great way to practice what you’ve been learning.

Getting started#

Installing JupyterLab Desktop#

JupyterLab is a platform for interactive computing that makes it easy to run Python code. You can run JupyterLab on your local computer or on a remote server (mybinder or colab). JupyterLab is based on a notebook interface that allows you to mix text, code, and graphics, which makes it the perfect tool for learning Python.

JupyterLab Desktop is a convenient all-in-one application that you can install on your computer to take advantage of everything Python has to offer for data analysis and statistics. You can download JupyterLab Desktop from the following page on GitHub: jupyterlab/jupyterlab-desktop

Choose the download link for your operating system.

After the installation completes and you launch the JupyterLab application, you should see a launcher window similar to the one shown below:

The JupyterLab interface with the File browser shown in the left sidebar. The box labelled (1) indicates the current working directory. The box labelled (2) shows the list of files in the current directory. The Launcher tab allows us to create a new notebook by clicking the button labelled (3).

Use the Python 3 (ipykernel) button in the Launcher tab to create a new Python 3 notebook, as shown in the above screenshot. A notebook consists of a sequence of cells, similar to how a text document consists of a series of paragraphs. The notebook you created currently consists of a single empty code cell, which is ready to accept Python commands. Type the expression 2 + 3 into the cell, then press SHIFT+ENTER to run the code. You should see the result of the calculation displayed on a new line immediately below the input. The cursor will automatically move to a new input cell, as shown in the screenshot below.

A notebook with a code cell and its output labelled (1). The cursor is currently in the cell labelled (2). The label (3) tells us the notebook filename is Untitled.ipynb. The buttons labelled (4) control the notebook execution: run, stop, restart, and run all.

The notebook interface offers many useful features, but for now, I just want you to think of notebooks as an easy way to run Python code. Notebooks allow you to try Python commands interactively, which is the best way to learn! Try some Python commands to get a feel for how notebooks work. Remember you can click the play button in the toolbar (the first button in the box labelled (4) in the above screenshot) or press the keyboard shortcut SHIFT+ENTER to run the code.

I encourage you to play around with the notebook interface, in particular the buttons labelled (3) and (4). Try clicking on the notebook execution control buttons (4) to see what they do. The play button is equivalent to pressing SHIFT + ENTER. The stop button can be used to interrupt a computation that takes too long. The run-all button is useful when you want to re-run all the cells from scratch. This is my favourite button! I use it often to recompute the whole sequence of Python commands in the current notebook, from top to bottom.

Alternatives: If you don’t want to install anything on your computer yet, you have two other options for playing with this notebook:

  • Run a JupyterLab instance in the cloud via the mybinder links. Click here to launch an interactive notebook of this tutorial.

  • You can also enable the “Live Code” feature while reading this tutorial online at noBSstats.com. Use the rocket button in the top right, and choose the Live Code option to make all the cells in this notebook interactive.

Code cells contain Python commands#

The Python command prompt is where you enter Python commands. Each of the code cells in this notebook is a command prompt that allows you to enter Python commands and “run” them by pressing SHIFT + ENTER, or by clicking the play button in the toolbar.

For example, you can make Python compute the sum of two numbers by entering 2+3 in a code cell, then pressing SHIFT + ENTER.

2 + 3
5

In the above code cell, the input is the expression 2 + 3 (the sum of two integers), and the output 5 is printed below.

Let’s now compute a more complicated math expression \((1+4)^2 - 3\), which is equivalent to the following Python expression:

(1+4)**2 - 3
22

The Python syntax for math operations is identical to the notation we use in math: addition is +, subtraction is -, multiplication is *, division is /. The syntax for writing exponents is a little unusual, using two asterisks: \(a^n\) = a**n.
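To make the operator syntax concrete, here is a small sketch that applies each operation to the same pair of numbers. The variable names a and n are just illustrative choices, and we use print so that every result is shown (not only the last one).

```python
# Illustrative values (any two numbers would do)
a = 7
n = 2

added = a + n        # addition
subtracted = a - n   # subtraction
multiplied = a * n   # multiplication
divided = a / n      # division (always produces a float)
powered = a ** n     # exponent a^n, same as pow(a, n)

print(added, subtracted, multiplied, divided, powered)
# prints: 9 5 14 3.5 49
```

Note that the result of / is the float 3.5 even though both inputs are integers.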

When you run a code cell, you’re telling the computer to evaluate the Python instructions it contains. Python then prints the result of the expression in the output cell.

Running a code cell is similar to using the EQUALS button on the calculator: whatever math expression you entered, the calculator will compute its value and display it as the output. The process is identical when you execute some Python code, but you’re allowed to input multiple lines of commands at once. The computer will execute the lines of code, one by one, in the order it sees them.

The result of the final calculation in the cell gets printed as the output of that cell. You can easily change the contents of any input cell, and re-run it to observe a new output. This interactivity makes it easy to explore the code examples.

Your turn to try this!#

Try typing in some Python code expression in this cell, then run it by pressing SHIFT + ENTER or using the Play button.

Expressions and variables#

Python programming involves computing Python expressions and manipulating the data stored in variables. It’s time to learn about Python expressions and variables, starting with expressions.

Python expressions#

Let’s start by showing a simple example of a Python expression, which computes the sum of two integers.

2+3
5

In a notebook, the last expression in the code cell will be printed automatically.

If we want to print the value of some variable, we normally need to use the print function (to be discussed later), but when working in a notebook interface, we don’t need to do that since the value of the last expression in each code cell gets printed automatically.

Here are two examples that involve list expressions and function calls (to be discussed later).

[1, 2, 3]
[1, 2, 3]
len([1, 2, 3, 6])
4
sum([1, 2, 3])
6

If you’ve ever used spreadsheet software before, you can think of the Python function sum(...) as the equivalent of the spreadsheet function SUM(...).

Variables#

Similar to variables in math, a variable in Python is a convenient name we use to refer to any value: a constant, the input to a function x, the output of a function y, or any other intermediate value. We use the assignment operator = to store values into variables.

The assignment operator#

In the above code examples, we computed the values of various expressions, but we didn’t do anything with the result. The more common pattern in Python is to store the result of an expression into some variable.

To store the result of an expression in a variable, we use the assignment operator = as follows, from left to right:

  • we start by writing the name of the variable

  • then, we add the symbol = (which stands for assign to)

  • finally, we write an expression for the value we want to store in the variable

For example, here is the code that computes the value of the expression 2+3 and stores the result in the variable x.

x = 2+3

This statement didn’t print any output: it just assigned the value 5 to the variable x.

Note the meaning of = is not the same as in math: we’re not writing an equation, but assigning the contents of x.

Here are some other, equivalent ways to describe the assignment statement:

  • Store the result of 2+3 into the variable x.

  • Put 5 into x.

  • Save the value 5 into the memory location named x.

  • Define x to be equal to 5.

  • Set x to 5.

  • Record the result of 2+3 under the name x.

  • Let x be equal to 5.

To display the contents of the variable x, we can specify its name on the last line of a code cell.

x
5

We often combine the “assign 2+3 to x” and the “display x” commands into a single code cell, as shown below.

x = 2+3
x
5

Exercise 1: Imitate the above statements, to create another variable y that contains the value of the expression \((1+4)^2 - 3\), then display the contents of y.

# put your answer in this code cell
#@titlesolution
y = (1+4)**2 - 3
y
22

So to summarize, the syntax of an assignment statement is as follows:

<place> = <some expression>

The assignment operator (=) is used to store the value of the expression <some expression> into the memory location <place>, which is usually a variable name, but later on we’ll learn how to store values inside containers like lists and dictionaries.
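A quick way to see that = means “assign” and not “equals” is to put the same variable on both sides. The sketch below is only an illustration (the values are arbitrary): Python first evaluates the expression on the right, then stores the result under the name on the left.

```python
x = 5          # store 5 under the name x
x = x + 1      # evaluate the right side first (5 + 1), then store the result in x
print(x)       # prints: 6

y = x * 2      # use the current value of x to define another variable
print(y)       # prints: 12
```

As a math equation, x = x + 1 would have no solution, but as an assignment statement it simply updates the contents of x.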

Multi-line expressions (Python code blocks)#

Let’s now look at some longer code examples that show multiple steps of calculations, and intermediate values.

Numerical expressions#

Example: Number of seconds in one week#

Let’s say we need to find how many seconds there are in one week. We can do this using a multi-step Python calculation, using the fact that there are \(60\) seconds in one minute, \(60\) minutes in one hour, \(24\) hours in one day, and \(7\) days in one week.

secs_in_1min = 60 
secs_in_1hour = secs_in_1min * 60 
secs_in_1day = secs_in_1hour * 24
secs_in_1week = secs_in_1day * 7
secs_in_1week
604800

Note we use the underscore _ as part of the variable name, which is a common pattern in Python code. The variable name some_name is easier to read than somename.

Exercise 2: Johnny currently weighs 107 kg, and wants to know his weight in pounds (lbs). One kilogram is equivalent to 2.2 lbs. Write the Python expression that computes Johnny’s weight in pounds.

weight_in_kg = 107
weight_in_lb = ... # replace ... with your answer
#@titlesolution
weight_in_kg = 107
weight_in_lb = 2.2 * weight_in_kg
weight_in_lb
235.4

Exercise 3: You’re buying something that costs 57.0 dollars, and the local government imposes a 10% tax on your purchase. Calculate the total you’ll have to pay, which includes the cost and 10% taxes.

cost = 57.00
taxes = ... # replace ... with your answer
total = ... # replace ... with your answer
#@titlesolution
cost = 57.00
taxes = 0.10 * cost
total = cost + taxes
total
62.7

Exercise 4: The formula for converting a temperature from Celsius to temperature in Fahrenheit is given by \(F = \tfrac{9}{5} \cdot C + 32\). Given the variable C which specifies the current temperature in Celsius, write the expression that calculates the current temperature in Fahrenheit and store it in the variable F.

C = 20
# F = ... # (un-comment and replace ... with the correct expression)

Test: when C = 100, your answer F should be 212.

#@titlesolution
C = 20
F = (C * 9/5) + 32
F
68.0

Variable types#

Most common variable types#

There are multiple types of variables in Python:

  • int - integers, e.g. 34, 65, 78, -4, etc. (roughly equivalent to \(\mathbb{Z}\))

  • float - e.g. 4.6, 78.5, 1e-3 (full name is “floating point number”; similar to \(\mathbb{R}\) but only with finite precision)

  • bool - a Boolean truth value with only two choices: True or False.

  • str (string) - text content like "Hello", "Hello everyone". Text strings are denoted using either double quotes "Hi" or single quotes 'Hi'.

  • list - a sequence of values like [61, 79, 98, 72]. The beginning and the end of the list are denoted by the brackets [ and ], and its elements are separated by commas.

  • Other types include dictionaries, tuples, sets, functions, objects, etc. These are other useful Python building blocks which we’ll talk about in later sections.
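The “finite precision” remark about float values is easiest to understand with an example. The following sketch shows a well-known rounding effect and one way to compare floats safely using math.isclose from the standard library (the variable name a is arbitrary).

```python
import math

a = 0.1 + 0.2
print(a)                      # prints: 0.30000000000000004
print(a == 0.3)               # prints: False (tiny rounding error)
print(math.isclose(a, 0.3))   # prints: True (equal up to rounding)

# the same number can be stored as an int or as a float
print(type(3), type(3.0))     # prints: <class 'int'> <class 'float'>
```

For the calculations in this tutorial these tiny rounding errors don’t matter, but they explain why you’ll sometimes see a long tail of digits in the output.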

Let’s look at some examples with variables of different types: an integer, a floating point number, a boolean value, a string, and a list.

score = 98
average = 77.5
above_the_average = True
message = "Hello everyone"
scores = [61, 79, 98, 72]

Running the above code cell doesn’t print anything, because we have only defined the variables score, average, above_the_average, message, and scores, but not displayed any of them.

To display the value of any of these variables, we can write its name on the last line of a code cell. Let’s see the contents of the score variable:

score
98

The function type tells you the type of any variable, meaning what kind of number or object it is.

type(score)
int

In this case, the value of the variable score is 98 and it is of type int (integer).

## ALT. display both value and type on the same line (as a tuple)
# score, type(score)

Let’s now look at the value and type of the variable average:

average
77.5
type(average)
float

Exercise: try displaying the contents of the other variables, and their type

#@titlesolution
above_the_average
type(above_the_average)

message
type(message)

scores
type(scores)
list

Getting comfortable with Python#

Python is a “civilized” language, which means it provides lots of help tools to make learning the language easy for beginners. We’ll now learn about some of these tools, including “doc strings” (help menus) and introspection tools for looking at what attributes and methods are available to use.

This combination of tools allows programmers to answer common questions about Python objects and functions without leaving the JupyterLab environment. Basically, in Python all the necessary info is accessible directly in the coding environment. For example, at the end of this section you’ll be able to answer the following questions on your own:

  • How many and what type of arguments does the function print expect?

  • What kind of optional, keyword arguments does the function print accept?

  • What attributes and methods does the Python object obj have?

  • What variables and functions are defined in the current namespace?

More than 50% of any programmer’s time is spent looking at help info and trying to understand the variables, functions, objects, and methods they are working with, so it’s important for you to learn these meta-skills.

Showing the help info#

Every Python object has a “doc string” associated with it that provides helpful information about the object. There are three equivalent ways to view the docstring of any Python object obj (value, variable, function, module, etc.):

  • help(obj): prints the docstring of the Python object obj

  • obj?: shows the same help information (a shortcut available in notebooks)

  • SHIFT + TAB: shows the help info while the cursor is on top of a Python variable or function

There are also other methods for getting more detailed information, such as obj??, %psource obj, and %pdef obj, but you won’t need these for now.

Example: learning about the print function#

Let’s say you’re interested to know the options available for the function print, which we use to print Python expressions.

# put cursor in the middle of function and press SHIFT+TAB
print
<function print>

You know this function accepts a variable and prints it, but what other keyword arguments does it take?

Use the help() function on print

help(print)
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.

Reading the doc string of the print function, we see value, ... in the list of inputs accepted, which means print accepts multiple arguments.

Here is an example that prints a string, an integer, a floating point number, and a list on the same line.
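A minimal sketch of such a call is shown below. The variable names and values are reused from the earlier examples in this tutorial purely for illustration.

```python
message = "Hello"
score = 98
average = 77.5
scores = [61, 79, 98, 72]

# print accepts any number of values and separates them with spaces
print(message, score, average, scores)
# prints: Hello 98 77.5 [61, 79, 98, 72]
```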

# print?
# print(print.__doc__)

Application: changing the separator when printing multiple values#

We can choose a different separator between arguments of the print function by specifying the value for the keyword argument sep.

x = 3
y = 2.3
print(x, y)
3 2.3
print(x, y, sep=" --- ")
3 --- 2.3

Exercises#

Exercise 1: print the doc-string of the function len.

#@titlesolution
help(len)
Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.

Exercise 2: print the doc-string of the function sum.

#@titlesolution
help(sum)
Help on built-in function sum in module builtins:

sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    
    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.

Exercise 3: print the doc strings of other functions we’ve seen so far: int, float, type, etc.

#@titlesolution
# help(int)
# help(float)
# help(type)

(sidenote) Python comments#

You can write comments in Python code using the character #. Comments can be very useful to provide additional information that explains what the code is trying to do.

# this is a comment

You can also add longer, multi-line comments using triple-quoted text.

"""
This is a longer comment,
which is written on two lines.
"""
'\nThis is a longer comment,\nwhich is written on two lines.\n'

The docstrings we talked about earlier are exactly this kind of multi-line string, included in the source code of the functions len, print, etc.

Exercise 4: replace the ... in the code block with comments that explain the calculation “adding 10% tax to a purchase that costs $57” that is being computed.
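To see how a docstring ends up in the help info, here is a small sketch that defines a hypothetical function called double (function definitions are covered properly later in the tutorial). The triple-quoted string placed right after the def line becomes the function’s docstring.

```python
def double(x):
    """
    Return two times the input x.
    """
    return 2 * x

print(double(4))        # prints: 8
print(double.__doc__)   # prints the docstring text
help(double)            # shows the same text in the usual help format
```

This is why help(len) and help(print) have something to show: the authors of those functions wrote a docstring for them.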

cost = 57.00           # ...
taxes = 0.10 * cost    # ...
total = cost + taxes   # ...
total                  # ...
62.7
#@titlesolution
cost = 57.00           # price before taxes
taxes = 0.10 * cost    # 10% taxes = 0.1 times price
total = cost + taxes   # add price + taxes and store the result in total
total                  # print the total
62.7

Inspecting Python objects#

Suppose you’re given the Python object obj and you want to know what it is, and learn what you can do with it.

Displaying the object#

There are several built-in functions that allow you to display information about any object obj.

  • type(obj): tells you what type of object it is

  • print(obj): converts the object to str and prints it

  • repr(obj): similar to print, but returns the complete string representation (including quotes for strings).
    The output of repr(obj) usually contains all the information needed to reconstruct the object obj.

We’ve already used both type and print, so there is nothing new here. I just wanted to remind you that you can always use these functions as a first line of inspection.

obj = 3

type(obj)
int
print(obj)
3
repr(obj)
'3'
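The difference between print and repr is easier to see with a string than with a number. The sketch below uses made-up variable names (greeting, values) and the built-in function eval, which evaluates a string as a Python expression; it is shown here only to illustrate the “reconstruct the object” idea.

```python
greeting = "Hi"
print(greeting)          # prints: Hi
print(repr(greeting))    # prints: 'Hi'   (note the quotes)

values = [61, 79, 98, 72]
text = repr(values)      # the string '[61, 79, 98, 72]'
rebuilt = eval(text)     # turn the string back into a list
print(rebuilt == values) # prints: True
```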

Auto-complete object attributes and methods#

The JupyterLab notebook environment provides very useful “autocomplete” functionality that helps us look around at the attributes and methods of any object.

  • TAB button: type obj. then press TAB button.

  • dir(obj): shows the “directory” of all the attributes and methods of the object obj

message = "Hello everyone"
# message.   <TAB>
# message.upper()
# message.lower()
# message.split()
# message.replace("everyone", "world")
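If you want to try this outside of the autocomplete menu, the following sketch checks that the methods mentioned in the comments above appear in dir(message), then calls each of them. The variable name names is just an illustrative choice.

```python
message = "Hello everyone"

names = dir(message)         # list of attribute and method names (as strings)
print("upper" in names)      # prints: True
print("split" in names)      # prints: True

print(message.upper())                       # prints: HELLO EVERYONE
print(message.lower())                       # prints: hello everyone
print(message.split())                       # prints: ['Hello', 'everyone']
print(message.replace("everyone", "world"))  # prints: Hello world
```

None of these methods modify message itself; each one returns a new string.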

(bonus) See what’s in the global namespace#

In a Jupyter notebook, you can run the command %whos to print all the variables and functions that are defined in the current namespace.

# %whos

Python error messages#

Sometimes the command you evaluate will cause an error, and Python will print an error message describing the problem it encountered. You need to be mentally prepared for these errors, since they can be very discouraging to see. The computer doesn’t like what you entered. The output is a big red box, that tells you your input is REJECTED!

Examples of errors include SyntaxError, ValueError, etc. The error messages look scary, but really they are there to help you: if you read what the error message is telling you, you’ll know exactly what you need to fix in your input. The error message literally describes the problem!

Let’s look at an example expression that Python cannot compute, so it raises an exception. The code cell below shows the error that occurs when you ask Python to compute a math expression that doesn’t make sense.

3/0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[45], line 1
----> 1 3/0

ZeroDivisionError: division by zero

You’ll see these threatening looking messages on a red background any time Python encounters an error when trying to run the commands you specified. This is nothing to be alarmed by. It usually means you made a typo (symbol not defined error), forgot a syntax element (e.g. (, ,, [, :, etc.), or tried to compute something impossible, like dividing a number by zero as in the above example.

You get an error since you’re trying to compute an expression that contains a division by zero. Python tells you a ZeroDivisionError: division by zero has occurred. Indeed it’s not possible to divide a number by zero.

The way to read these red messages is to focus on the error name and the explanation message that get printed on the last line. The error message tells you what you need to fix. The solution will be obvious for typos and syntax errors, but for more complicated situations, you may need to search online to find what the problem is.

In our example, we can fix the 3/0 by replacing it with 3/1.

3/1
3.0

Here is a list of the most common error messages you are likely to encounter:

  • SyntaxError: you typed in something wrong (usually a missing ", ], or some other punctuation)

  • NameError: Raised when a variable is not found in local or global scope.

  • KeyError: Raised when a key is not found in a dictionary.

  • TypeError: Raised when a function or operation is applied to an object of incorrect type.

  • ValueError: Raised when a function gets an argument of correct type but improper value.

  • ZeroDivisionError: Raised when the second operand of division or modulo operation is zero.

  • later on we’ll also run into: ImportError and AttributeError

There are many more error types. You can see a complete list by typing in the command *Error?.
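If you’d like to look at the two parts of an error message (the error name and the explanation) without the big red box, one option is to catch the error. The sketch below uses the try/except statement, which this tutorial hasn’t introduced, so treat it as a preview; the variable names error_name and error_message are arbitrary.

```python
try:
    int("zz")                       # "zz" is a string, but not a valid number
except ValueError as err:
    error_name = type(err).__name__ # the name of the error
    error_message = str(err)        # the explanation that follows the name
    print(error_name, "->", error_message)
```

The printed line contains the same information as the last line of the red error box: first the name ValueError, then the description of what went wrong.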

Exercise: let’s break Python! 🔨🔨🔨🔨🔨🔨
Try typing in Python commands that would cause one of the above errors.

#@titlesolution

# SyntaxError
# [1

# NameError
# zz + 3

# TypeError
# sum(3)

# ValueError
# int("zz")

# ZeroDivisionError
# 5/0

# KeyError
# d = {}
# d["zz"]

# ImportError
# from math import zz

# AttributeError
# "hello".zz

Python documentation#

The official Python documentation website https://docs.python.org provides loads and loads of excellent information for learning Python.

Here are some useful links:

I encourage you to browse the site to familiarize yourself with the information that is available.

Usually when you do a Google search, the official Python docs will show up on the first page of results. Prefer reading the official documentation over other “learn Python” websites (currently the first few Google search results point to SEO-optimized, spammy, ad-filled websites that are inferior to the official documentation), even if the official docs appear lower in the list of results. Stack Overflow discussions are also a good place to find answers to common Python questions.

Lists and for loops#

Lists#

To create a list:

  • start with an opening square bracket [ ,

  • then put the elements of the list separated by commas ,

  • finally close the square bracket ]

For example, scores is a list of ints that we’ve seen in several examples before.

scores = [61, 79, 98, 72]
scores
[61, 79, 98, 72]

A list container has a length, which you can obtain by calling the len function.

len(scores)
4

You can check if a list contains a certain element using the in operator:

98 in scores
True

List access syntax#

Elements of the list are accessed using the square brackets [<index>], where <index> is the 0-based index of the element we want to access:

  • The first element has index 0.

  • The second element has index 1.

  • The last element has index equal to the length of the list minus one.

# first element in the list scores
scores[0]
61
# second element in the list scores
scores[1]
79
# last element in the list scores
scores[3]
72

Another way to access the last element in the list is to use the negative index -1:

scores[-1]
72
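The two ways of reaching the last element can be compared side by side. In this sketch, last_index is a helper variable introduced only for illustration.

```python
scores = [61, 79, 98, 72]

last_index = len(scores) - 1   # 4 - 1 = 3
print(scores[last_index])      # prints: 72
print(scores[-1])              # prints: 72 (same element, counted from the end)
print(scores[-2])              # prints: 98 (second-to-last element)
```

Negative indices are convenient because you don’t need to know the length of the list to access elements near its end.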

List slicing#

We can access a subset of the list using the “slice” syntax a:b, which corresponds to the range of indices a, a+1, …, b-1. For example, if we want to extract the first three elements in the list scores, we can use the slice 0:3, which is equivalent to requesting the range of indices 0, 1, and 2.

scores[0:3]
[61, 79, 98]

Note the result of selecting a slice from a list is another list (a list that contains the subset of the original list that consists of elements whose index is included in the slice).
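Slices also work when you leave out one of the endpoints: a missing start means “from the beginning” and a missing end means “up to the end.” Here are a few more slices of the same list, as a quick sketch.

```python
scores = [61, 79, 98, 72]

print(scores[:3])    # prints: [61, 79, 98]  (same as scores[0:3])
print(scores[1:])    # prints: [79, 98, 72]  (from index 1 to the end)
print(scores[1:3])   # prints: [79, 98]      (indices 1 and 2)
print(scores[-2:])   # prints: [98, 72]      (the last two elements)
```

In every case the original list scores is left unchanged, and the slice is a new list.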

List methods#

List objects can be modified using their methods. Every list has the following useful methods:

  • .sort(): sort the list (in increasing order by default)

  • .append(): add one element to end of a list

  • .pop(): extract the last element from a list

  • .reverse(): reverse the order of elements in the list

Let’s look at some examples of these methods in action.

To sort the list of scores, you can call its .sort() method:

scores.sort()
scores
[61, 72, 79, 98]

To add a new element el to the list (at the end), use the method .append(el):

scores.append(22)
scores
[61, 72, 79, 98, 22]

The method .pop() extracts the last element of the list:

scores.pop()
22

You can think of .pop() as the “undo method” of the append operation.

To reverse the order of elements in the list, call its .reverse() method:

scores.reverse()
scores
[98, 79, 72, 61]

Other useful list methods include .insert(index,obj) and .remove(obj), plus several more that might be useful once in a while.
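Here is a short sketch of .insert and .remove in action. It uses a separate list called grades (a made-up name) so that it doesn’t interfere with the list scores used in the surrounding examples.

```python
grades = [61, 79, 98, 72]

grades.insert(1, 85)   # put 85 at index 1, shifting the rest to the right
print(grades)          # prints: [61, 85, 79, 98, 72]

grades.remove(98)      # remove the first occurrence of the value 98
print(grades)          # prints: [61, 85, 79, 72]
```

Note that .insert takes a position and a value, while .remove takes the value to look for, not its index.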

Recall you can see a complete list of all the methods on list objects by typing scores. then pressing the TAB button to trigger the auto-complete suggestions. Uncomment the following code block, place your cursor after the dot, and try pressing TAB to see what happens.

# scores.

Exercise: The default behaviour of the method .sort() is to sort the elements in increasing order. Suppose you want to sort the elements in decreasing order instead. You can pass a keyword argument to the method .sort() to request the sorting be done in “reverse” order (decreasing instead of increasing). Consult the docstring of the .sort() method to find the name of the keyword argument that does this, then modify the code below to sort the elements of the list scores in decreasing order.

scores.sort()
scores
[61, 72, 79, 98]
#@titlesolution
# help(scores.sort)
scores.sort(reverse=True)
scores
[98, 79, 72, 61]

For loops#

We often want to repeat some operation (or several operations) once for each element in a list. This is what the for-loop is for.

The syntax of a for loop in Python looks like this:

for <element> in <container>:
    operation 1 using <element>
    operation 2 using <element>
    etc.

This syntax allows us to repeat a block of operations for each element <element> in the list <container>.

Example 1: print all the scores#

scores = [61, 79, 98, 72]

for score in scores:
    print(score)
61
79
98
72

Example 2: compute the average score#

If \(\mathbf{x}\) is a list of values \([x_0,x_1,x_2,\ldots,x_{n-1}]\), the average of the list is defined as:

\[ \overline{x} = \text{mean}(\mathbf{x}) = \tfrac{1}{n} \left[ x_0 + x_1 + x_2 + \cdots + x_{n-1} \right] \]

In words, the average value of a list of values is the sum of the values divided by the length of the list, which is \(\texttt{len}(\mathbf{x})\) in Python.

Let’s write a for loop that computes the sum (the total) of the values in the list scores. We can then compute the average avg by dividing the total by \(n\), which is the length of the list.

total = 0
for score in scores:
    total = total + score

avg = total / len(scores)
avg
77.5
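For comparison, here is a short sketch of the same calculation using the built-in function sum, which adds up the values of a list without an explicit for loop. The list scores is repeated here so the example can be run on its own.

```python
scores = [61, 79, 98, 72]

# sum(scores) computes the same total as the for loop
avg = sum(scores) / len(scores)
print(avg)    # 77.5
```

Writing the for loop by hand is still a good exercise, since the same pattern (start from an initial value, then update it once for each element) shows up in many other calculations.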

Sidenote#

The name of the variable used for the for loop is totally up to you, but in general you should choose logical names for elements of the list. Below is an example of a for loop that uses the single-letter variable s as the loop variable:

for s in scores:
    print(s)
61
79
98
72

By convention, we usually call lists of obj-items objs, and use the name obj for the for-loop variable. Here are some examples:

  • given a list of profiles profiles, use the for-loop for profile in profiles: ...

  • given a list of graph nodes nodes, use a for-loop like for node in nodes: ...

  • etc.

Anyone reading these Python code examples will immediately know that the Python objects profiles and nodes are list-like, since their names end in “s” and they are used in for loops.

List comprehension (bonus topic)#

Very often when programming, we need to transform a list of values, by applying the same operation to each value in a list of inputs, and collecting the results in a new list of outputs.

Using the standard for-loop syntax, this operation requires four lines of code:

newlist = []
for value in values:
    newvalue = <some operation on `value`>
    newlist.append(newvalue)

This code starts by creating an empty list called newlist, then uses a for-loop to apply <some operation> to each element in the list values, accumulating the results in newlist.

Python provides a shorthand syntax for writing the same operation on a single line:

newlist = [<some operation on `value`> for value in values]

This is called the “list comprehension” syntax, and is used often in Python code. Note the code using list comprehension takes one line to express the entire transformation from values to newlist.

Example: compute the squares of the first five integers#

numbers = [1, 2, 3, 4, 5]
squares = [n**2 for n in numbers]
squares
[1, 4, 9, 16, 25]
# # ALT. using the `range` function
# numbers = range(1,6)
# squares = [n**2 for n in numbers]
# squares

Functions#

Functions! Finally we get to the good stuff! Functions are an important building block in programming, since they allow you to encapsulate any multi-step program or procedure as a reusable piece of functionality. You can think of functions as reusable chunks of code that can be defined once, and used multiple times by “calling” them with different arguments.

Let’s do a quick review of the concept of a function in mathematics, since the Python syntax for functions is inspired by math functions. The convention in math is to call the function \(f\), denote the input of the function \(x\), and denote its output \(y\):

\[ y = f(x) = \text{some expression involving } x \]

Note we defined the function \(f\) by simply writing some expression we need to compute, that depends on the input \(x\). For example \(f(x) = 2x+3\).

Python functions#

Functions in Python are similar to functions in math: a Python function takes certain inputs and produces certain outputs. We define the function called f using the following syntax:

def f(x):
    <steps to compute y from x>
    return y

There are a lot of things going on in this code example, so let’s go over the code line-by-line and explain all the new elements of syntax:

  • The Python keyword def is used to declare a function definition.

  • Next we see the name of the function f

  • Next we must specify the inputs of the function (a.k.a. arguments) inside parentheses. In this example, the function f takes only a single argument x.

  • The colon : indicates the beginning of a code block, which contains the function body. The function body consists of one or more lines of code, indented by four spaces, just like the other types of code blocks we have seen.

  • On the final line of the function f, we use the return statement to indicate the output of the function. The return statement is usually the last line in the function body.

Example 1#

A first example of a simple math-like function. The function is called f, takes numbers as inputs, and produces numbers as outputs:

def f(x):
    return 2*x + 3

To call the function f, we use the function name, then pass in the argument(s) of the function in parentheses.

f(10)
23

Example 2#

# TODO

Statistics functions#

Your turn to play with lists now! Complete the code required to implement the functions mean and std below.

Question 1: Mean#

The formula for the mean of a list of numbers \([x_1, x_2, \ldots, x_n]\) is:

\[ \text{mean} = \overline{x} = \frac{1}{n}\sum_{i=1}^n x_i = \tfrac{1}{n} \left[ x_1 + x_2 + \cdots + x_n \right]. \]

Write the function mean(numbers): a function that computes the mean of a list of numbers

def mean(numbers):
    """
    Computes the mean of the `numbers` list using a for loop.
    """
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)  


mean([100,101])
100.5
# TEST CODE (run this code to test your solution)
from test_helpers import test_mean

# RUN TESTS
test_mean(mean)
All tests passed. Good job!

Question 2: Sample standard deviation#

The formula for the sample standard deviation of a list of numbers is:

\[ \text{std}(\textbf{x}) = s = \sqrt{ \tfrac{1}{n-1}\sum_{i=1}^n (x_i-\overline{x})^2 } = \sqrt{ \tfrac{1}{n-1}\left[ (x_1-\overline{x})^2 + (x_2-\overline{x})^2 + \cdots + (x_n-\overline{x})^2\right]}. \]

Note the division is by \((n-1)\) and not \(n\). Strange, no? You’ll have to wait until stats to see why this is the case.

Write the function std(numbers), which computes the sample standard deviation of a list of numbers.

import math

def std(numbers):
    """
    Computes the sample standard deviation (square root of the sample variance)
    using a for loop.
    """
    avg = mean(numbers) 
    total = 0
    for number in numbers:
        total = total + (number-avg)**2
    var = total/(len(numbers)-1)    
    return math.sqrt(var)

numbers = list(range(0,100))
std(numbers)
29.011491975882016
# compare to known good function...
import statistics
statistics.stdev(numbers)
29.011491975882016
# TEST CODE (run this code to test your solution)
from test_helpers import test_std

# RUN TESTS
test_std(std)
All tests passed. Good job!

Exercise 2#

Write a Python function called temp_C_to_F that converts a temperature from degrees Celsius to degrees Fahrenheit.

def temp_C_to_F(temp_C):
    """
    Convert the temperature `temp_C` from Celsius to Fahrenheit.
    """
    pass  # replace `pass` with your code
#@titlesolution
def temp_C_to_F(temp_C):
    """
    Convert the temperature `temp_C` from Celsius to Fahrenheit.
    """
    temp_F = (temp_C * 9/5) + 32
    return temp_F

temp_C_to_F(37.8)
100.03999999999999

Other data structures#

We already discussed lists, which are the most important data structure (container for data) in Python.

In this section we’ll briefly introduce some other data structures you might encounter.

Strings#

String expressions#

Let’s look at some expressions that involve strings.

name = "julie"
message = "Hello " + name    # for strings, + means concatenate
message
'Hello julie'
first_name = "Julie"
last_name = "Tremblay"
full_name = first_name + " " + last_name
message = "Hi " + full_name + "!"
message
'Hi Julie Tremblay!'

Strings are lists of characters#

You can think of the string "abc" as being equivalent to a list of three characters ["a", "b", "c"], and use the usual list syntax to access the individual characters in the list.

To illustrate this list-like behaviour of strings, let’s define a string of length 26 that contains all the lowercase Latin letters.

letters = "abcdefghijklmnopqrstuvwxyz"
letters
'abcdefghijklmnopqrstuvwxyz'
len(letters)
26

We can access the individual characters within the string using square brackets. For example, the index of the letter "a" in the string letters is 0:

letters[0]
'a'

The index of the letter "b" in the string letters is 1:

letters[1]
'b'

The last element in the list of 26 letters has index 25:

letters[25]
'z'

Alternatively, we can access the last letter using the negative index -1:

letters[-1]
'z'

We can use slicing to get any substring that spans a particular range of indices. For example, the first four letters of the alphabet are:

letters[0:4]
'abcd'

The syntax 0:4 is a shorthand for the expression slice(0,4), which corresponds to the range of indices from 0 (inclusive) to 4 (non-inclusive): [0,1,2,3].
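The short sketch below checks that the two notations select the same characters, and also shows what happens when you leave out the start or the end of the slice. The variable names first_four and same_four are made up for this illustration.

```python
letters = "abcdefghijklmnopqrstuvwxyz"

# The slice syntax 0:4 and the object slice(0, 4) select the same characters
first_four = letters[0:4]
same_four = letters[slice(0, 4)]
print(first_four, same_four)    # abcd abcd

# Leaving out the start means "from the beginning"
print(letters[:4])     # abcd

# Leaving out the end means "until the end"
print(letters[23:])    # xyz
```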

Type conversions#

We sometimes need to convert between variables of different types. The functions for converting types have the same name as the type they convert to:

  • int : convert any expression into an int

  • float: convert any expression into a float

  • str: convert an expression to its text representation.

Example: converting str to float#

type("42.5")
str
f = float("42.5")
f
42.5
type(f)
float
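The other conversion functions work the same way. Here is a small sketch that uses int and str; the variable names n, k, and s are arbitrary choices for this illustration.

```python
# Convert a string to an int, then use it in a calculation
n = int("42")
print(n + 1)        # 43

# Converting a float to an int drops the decimal part (no rounding)
k = int(3.99)
print(k)            # 3

# Convert a float to its text representation, then concatenate
s = str(42.5)
print(s + " kg")    # 42.5 kg
```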

Exercise 6: compute the sum of two strings#

Suppose we’re given two numbers \(m\) and \(n\) and we want to compute their sum \(m+n\). The two numbers are given to us as strings.

mstr = "2.1"
nstr = "3.4"
print("The variable mstr has value", mstr, "and type", type(mstr))
print("The variable nstr has value", nstr, "and type", type(nstr))
The variable mstr has value 2.1 and type <class 'str'>
The variable nstr has value 3.4 and type <class 'str'>

Let’s try adding the two numbers together to see what happens…

mstr + nstr
'2.13.4'

This is because the addition operator + for strings means concatenate, not add. Python doesn’t know automatically that the two text strings are meant to be numbers.

We have to manually convert the strings to a Python numerical type (float) first, then we can add them together.

Write the Python code that converts the variables mstr and nstr to floating point numbers and adds them together.

#@titlesolution
mfloat = float(mstr)
nfloat = float(nstr)
print("The variable mfloat has value", mfloat, "and type", type(mfloat))
print("The variable nfloat has value", nfloat, "and type", type(nfloat))


# compute the sum
mfloat + nfloat
The variable mfloat has value 2.1 and type <class 'float'>
The variable nfloat has value 3.4 and type <class 'float'>
5.5

Exercise: Write the Python code that converts a list of strings prices_str to floating point numbers and adds them together.

prices_str = ["22.2", "10.1", "33.3"]

# write here the code that computes the total price
#@titlesolution
prices_str = ["22.2", "10.1", "33.3"]
prices_float = [float(price) for price in prices_str]
sum(prices_float)
65.6

Boolean variables and conditional statements#

Boolean variables can have one of two possible values, either True or False. We obtain boolean values when we perform numerical comparisons.

x = 3
x > 2  # Is x greater than 2?
True

Other arithmetic comparisons include <, >=, <=, == (equal to), != (not equal to).
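Here is a quick sketch that tries each of these comparison operators on the same variable x, with the expected boolean value shown in a comment next to each line.

```python
x = 3

print(x < 2)     # False
print(x >= 3)    # True
print(x <= 2)    # False
print(x == 3)    # True   (two equal signs: comparison, not assignment)
print(x != 3)    # False
```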

The in operator can be used to check if an object is part of a list (or another kind of collection).

x = 3
x in [1,2,3,4]  # Is x in the list [1,2,3,4] ?
True

Boolean expressions are used in conditional statements, which are blocks of Python code that may or may not be executed depending on the value of a boolean expression.

Conditional statements#

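Conditional statements control which of several alternative code blocks gets executed: the code block under an if statement runs only when its condition is True.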

if True:
    print("This code will run")

if False:
    print("This code will not run")
This code will run
x = 3
if x > 2:
    print("x is greater than 2")
else:
    print("x is less than or equal to 2")
x is greater than 2

We can do multiple checks using elif statements.

temp = 25

if temp > 22:
    print("It's hot!")
elif temp < 10:
    print("It's cold!")
else:
    print("It's OK.")
It's hot!

Exercise: add another condition to the above code to print It's very hot if the temperature is above 30.

Boolean expressions#

You can use bool variables and the logical operations and, or, not, etc. to build more complicated boolean expressions (logical conjunctions, disjunctions, and negations).

True and True, True and False, False and True, False and False
(True, False, False, False)
True or True, True or False, False or True, False or False
(True, True, True, False)
x = 3
x >= 0 and x <= 10
True
x < 0 or x > 10
False
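The examples above use and and or. The sketch below adds the not operator, which flips a boolean value, and also shows Python's chained comparison syntax, which reads like the math inequality \(0 \leq x \leq 10\). The variable names inside and outside are made up for this illustration.

```python
x = 3

inside = x >= 0 and x <= 10     # Is x between 0 and 10?
outside = not inside            # The logical negation of `inside`
print(inside, outside)          # True False

# Chained comparison: equivalent to (0 <= x) and (x <= 10)
print(0 <= x <= 10)             # True
```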

Exercise 1 The phase of water (at sea-level pressure = 1 atm = 101.3 kPa = 14.7 psi) depends on its temperature temp. The three possible phases of water are "gas" (water vapour), "liquid" (water), and "solid" (ice). The table below shows the phase of water depending on the temperature temp, expressed as math inequalities.

temp range          phase
---------------     -----
temp >= 100         gas
0 <= temp < 100     liquid
temp < 0            solid

Your task is to fill in the if-elif-else statement in the code cell below, in order to print the correct phase string, depending on the value of the variable temp.

# temperature in Celsius (int or float)
temp = 90

# if ...:
#     print(....)
# elif ...:
#     print(....)
# else:
#     print(....)


# uncomment the if-elif-else statement above and replace:
#   ... with conditions (translate math inequality into Python code),
#  .... with the appropriate phase string (one of "gas", "liquid", or "solid")
#@titlesolution
temp = 90

if temp >= 100:
    print("gas")
elif temp >= 0:
    print("liquid")
else:
    print("solid")
liquid

Exercise 2 Teacher Joelle has computed the final scores of the students as a percentage (a score out of 100). The final grade was computed as a weighted combination of the student’s average grade on the assignments, one midterm exam, and a final exam (more on this later).

The school where she teaches requires her to convert each student’s score to a letter grade, according to the following grading scale:

Grade         Numerical score interval
A             85% – 100%
A-            80% – 84.999…%
B+            75% – 79.999…%
B             70% – 74.999…%
B-            65% – 69.999…%
C+            60% – 64.999…%
C             55% – 59.999…%
D             50% – 54.999…%
F             0% – 49.999…%

Write the if-elif-elif-…..-else statement that takes the score variable (an int between 0 and 100), and prints the appropriate letter grade for that score.

# student score as a percentage
score = 90

# if ...:
#     print(....)
# elif ...:
#     print(....)
# .....

# uncomment the if-elif-.. statement above and replace:
#   ... with the appropriate conditions,
#  .... with the appropriate letter grades (strings), and
# ..... with additional elif blocks to cover all the cases.
#@titlesolution
score = 73

if score >= 85:
    print("A")
elif score >= 80:
    print("A-")
elif score >= 75:
    print("B+")
elif score >= 70:
    print("B")
elif score >= 65:
    print("B-")
elif score >= 60:
    print("C+")
elif score >= 55:
    print("C")
elif score >= 50:
    print("D")
else:
    print("F")
B

Inline if statements (bonus topic)#

We can also use if-else keywords to compute conditional expressions. The general syntax for these is:

<value1> if <condition> else <value2>

This expression evaluates to <value1> if <condition> is True, and to <value2> if <condition> is False.

temp = 25
msg = "It's hot!" if temp > 22 else "It's OK."
msg
"It's hot!"

Dictionaries#

When programming in Python, one of the most commonly used data structures is the dictionary (dict), along with other dict-like data structures.

A dictionary is an associative array between a set of keys and a set of values. For example, the code below defines a dictionary d that consists of three key-value pairs:

d = {"key1":"value1", "key2":"value2", "key3":"value3"}
d
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
d.keys()
dict_keys(['key1', 'key2', 'key3'])
d.values()
dict_values(['value1', 'value2', 'value3'])

You access the value in the dictionary using the square brackets syntax. For example, to see the value associated with key key1 in the dictionary d, we call:

d["key1"]
'value1'

You can change the value associated with any key by assigning a new value to it:

d["key2"] = "newval2"
d
{'key1': 'value1', 'key2': 'newval2', 'key3': 'value3'}
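The same square brackets syntax can also be used to add a new key-value pair. Here is a small sketch that does this, then uses the in operator to check which keys are present. The key names key4 and key5 are made up for this illustration.

```python
d = {"key1": "value1", "key2": "newval2", "key3": "value3"}

# Assigning to a key that doesn't exist yet adds a new key-value pair
d["key4"] = "value4"
print(len(d))          # 4

# The `in` operator checks if a key is present in the dictionary
print("key4" in d)     # True
print("key5" in d)     # False

# The method .get returns None (instead of raising an error) for a missing key
print(d.get("key5"))   # None
```

Trying to access a missing key using square brackets, as in d["key5"], raises a KeyError, so it is a good idea to check with in first when you're not sure the key exists.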

Exercise: creating a new profile dictionary#

Recall the profile dictionary we created earlier:

profile = {"first_name":"Julie",
           "last_name":"Tremblay",
           "score":98}

Create a dictionary called profile2 with the same structure, but for the user "Justin Trudeau" with score 31.

profile2 = {}
profile2["first_name"] = "Justin"
profile2["last_name"] = "Trudeau"
profile2["score"] = 31

profile2
{'first_name': 'Justin', 'last_name': 'Trudeau', 'score': 31}

Sets#

s = set()
s
set()
s.add(3)
s.add(5)
s
{3, 5}
3 in s
True
print("The set s contains the elements:")
for el in s:
    print(el)
The set s contains the elements:
3
5
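One property of sets worth knowing is that each element can appear only once. The sketch below illustrates this; the variable name unique is a made-up name for this example.

```python
s = set()
s.add(3)
s.add(5)
s.add(3)          # 3 is already in the set, so nothing changes
print(len(s))     # 2

# Converting a list to a set removes the duplicate values
unique = set([1, 2, 2, 3, 3, 3])
print(len(unique))    # 3
print(2 in unique)    # True
```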

Tuples#

Tuples are similar to lists but with fewer features.

2,3
(2, 3)
(2,3)
(2, 3)
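The main missing feature is that tuples cannot be modified after they are created. Here is a minimal sketch that shows this, using a hypothetical tuple called point. The try and except keywords are used here only to catch the error so that the example runs to the end.

```python
point = (2, 3)

# Reading the elements of a tuple works just like for lists
print(point[0])      # 2
print(len(point))    # 2

# Trying to modify an element of a tuple raises a TypeError
try:
    point[0] = 10
except TypeError as err:
    print("Cannot modify a tuple:", err)

print(point)         # (2, 3)
```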

We can use the tuples syntax to assign to multiple variables on a single line:

x, y = 3, 4

We can also use tuples to “swap” two values.

# Swap the contents of the variables x and y
tmp = x
x = y
y = tmp
# Equivalent operation on one line
x, y = y, x

Objects and classes#

All the Python variables we’ve been using until now are different kinds of “objects.” An object is the most general-purpose “container” for data, which also provides methods for manipulating that data.

In particular:

  • attributes: data properties of the object

  • methods: functions attached to the object

Example 1: string objects#

msg = "Hello"

type(msg)
str
# Uncomment the next line and press TAB after the dot
# msg.
# Attributes
# Methods:
msg.upper()
msg.lower()
msg.__len__()
msg.isascii()
msg.startswith("He")
msg.endswith("lo")
True

Example 2: file objects#

filename = "message.txt"
file = open(filename, "w")

type(file)
_io.TextIOWrapper
# Uncomment the next line and press TAB after the dot
# file.
# Attributes
file.name, file.mode, file.encoding
('message.txt', 'w', 'UTF-8')
# Methods:
file.write("Hello world\n")
file.writelines(["line2", "and line3."])
file.flush()
file.close()
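To read the text back, we can open the file in "r" mode and call the method .read(). The sketch below is one way to do this, under a few assumptions: it writes the file inside a temporary folder (the names folder and filepath are made up for this illustration), and it uses the with statement, which we haven't covered in this tutorial, to close the file automatically.

```python
import os
import tempfile

# Create a file path inside a temporary folder (hypothetical location)
folder = tempfile.mkdtemp()
filepath = os.path.join(folder, "message.txt")

# Write one line of text; the file is closed at the end of the with block
with open(filepath, "w") as file:
    file.write("Hello world\n")

# Open the same file for reading and load its contents as a string
with open(filepath, "r") as file:
    contents = file.read()

print(contents)
```

Using with saves you from having to remember to call file.close() yourself.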

Defining new types of objects#

The Python keyword class can be used to define new kinds of objects.

Exercise: create a custom class of objects Interval that represent intervals of real numbers like \([a,b] \subset \mathbb{R}\). We want to be able to use the new interval objects in if statements to check if a number \(x\) is in the interval \([a,b]\) or not.

Recall the in operator that we can use to check if an element is part of a list:

>>> 3 in [1,2,3,4]
True

We want the new objects of type Interval to support the same kind of membership test.

Example usage:

>>> 3 in Interval(2,4)
True
>>> 5 in Interval(2,4)
False

The expression x in Y corresponds to calling the method __contains__ on the container object Y: Y.__contains__(x), which returns a boolean value (True or False).

If we want to support checks like 3 in Interval(2,4) we therefore have to implement the method __contains__ on the Interval class.

class Interval:
    """
    Object that embodies the mathematical concept of an interval.
    `Interval(a,b)` is equivalent to math interval [a,b] = {𝑥 | 𝑎 ≤ 𝑥 ≤ 𝑏}.
    """

    def __init__(self, lowerbound, upperbound):
        """
        This method is called when the object is created, and is used to
        set the object attributes from the arguments passed in.
        """
        self.lowerbound = lowerbound
        self.upperbound = upperbound

    def __str__(self):
        """
        Return a representation of the interval as a string like "[a,b]".
        """
        return "[" + str(self.lowerbound) + "," + str(self.upperbound) + "]"

    def __contains__(self, x):
        """
        This method is called to check membership using the `in` keyword.
        """
        return self.lowerbound <= x and x <= self.upperbound

    def __len__(self):
        """
        This method will get called when you call `len` on the object.
        """
        return self.upperbound - self.lowerbound

Create an object that corresponds to the interval \([2,4]\).

interval2to4 = Interval(2,4)
interval2to4
<__main__.Interval at 0x7fb558297160>
type(interval2to4)
__main__.Interval
str(interval2to4)
'[2,4]'
3.3 in interval2to4
True
1 in interval2to4
False
len(interval2to4)
2
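The original goal was to use interval objects in if statements, so here is a sketch of what that could look like. To keep the example runnable on its own, it repeats a compact copy of the class with only the two methods it needs. The variable names comfortable, temp, and msg, and the bounds 18 and 24, are made up for this illustration.

```python
class Interval:
    """Compact copy of the Interval class (only two of its methods)."""

    def __init__(self, lowerbound, upperbound):
        self.lowerbound = lowerbound
        self.upperbound = upperbound

    def __contains__(self, x):
        return self.lowerbound <= x and x <= self.upperbound


# Hypothetical range of comfortable room temperatures
comfortable = Interval(18, 24)
temp = 21

# The `in` check calls comfortable.__contains__(temp) behind the scenes
if temp in comfortable:
    msg = "The temperature is comfortable."
else:
    msg = "The temperature is not comfortable."

print(msg)    # The temperature is comfortable.
```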

Python libraries and modules#

Everything we discussed so far was using the Python built-in functions and data types, but that is only a small subset of all the functionality available when using Python. There are hundreds of Python libraries and modules that provide additional functions and data types for all kinds of applications. There are Python modules for processing different data files, making web requests, performing computations, etc. The list is almost endless, and the vast number of libraries and frameworks are all available to you behind a simple import statement.

The three golden rules of software development:

  1. Don’t write code because someone has already solved the problem you’re trying to solve.

  2. Don’t write code because you can glue together one or more libraries to do what you want.

  3. Don’t write code because you can solve your problem by using some subset of the functionality in an existing framework.

The import statement#

We use the import statement to load a python module and make it available in the current context. The code below shows how to import the module <mod> in the current notebook.

import <mod>

After this statement, we can use the functions in the module <mod> by calling them using the prefix <mod>., which is called the “dot notation” for accessing names within the namespace <mod>.

For example, let’s import the statistics module and use the function statistics.mean to compute the mean of three numbers.

import statistics

statistics.mean([1,2,6])
3

A very common trick you’ll see in Python notebooks is to import Python modules under an “alias” name, which is usually a shorter name that is faster to type.

The alias-import statement looks like this:

import <mod> as <alias>

For example, let’s import the statistics module under the alias stats and repeat the mean calculation we saw above.

import statistics as stats

stats.mean([1,2,6])
3

As you can imagine, if you’re writing some Python code that requires a lot of statistics calculations, you’ll appreciate the alias-import statement, since you can call stats.mean and stats.median instead of having to type the full module name each time, as in statistics.mean and statistics.median.
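A third form of the import statement is from <mod> import <name>, which imports specific functions so you can call them without any prefix. Here is a small sketch using the same statistics module:

```python
from statistics import mean, median

# The functions can now be called directly, without the `statistics.` prefix
print(mean([1, 2, 6]))      # 3
print(median([1, 2, 6]))    # 2
```

This form is convenient when you only need one or two functions from a module, but the dot notation has the advantage of making it obvious where each function comes from.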

The standard library#

The Python standard library consists of well over a hundred Python modules that come bundled with every Python installation.

Here are some modules that come in handy.

  • math: math functions like sqrt, sin, cos, etc.

  • random: random number generation

  • statistics: descriptive statistics computed from lists of values.

  • re: regular expressions (useful for matching patterns in strings)

  • datetime: manipulate dates and times.

  • urllib.parse: manipulate URLs (used for web programming).

  • json: read and write JSON files.

  • csv: read and write CSV files (see also Pandas, which can do this too)

  • os and os.path: manipulate file system paths.

  • sys: access information about the current process and the operating system.
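To give you a feel for what using these modules looks like, here is a short sketch that calls one function from each of the modules math, datetime, and json. The specific values (the number 16, the date, and the small dictionary) are made up for this illustration.

```python
import math
import datetime
import json

# math: compute a square root
print(math.sqrt(16))    # 4.0

# datetime: find the day after January 31st, 2024
next_day = datetime.date(2024, 1, 31) + datetime.timedelta(days=1)
print(next_day)         # 2024-02-01

# json: convert a dictionary to JSON text, and back again
text = json.dumps({"score": 98})
print(text)             # {"score": 98}
print(json.loads(text)) # {'score': 98}
```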

There are also a few libraries that are not part of the standard library, but almost as important:

  • requests: make HTTP requests and download files from the internet.

Installing Python packages with pip#

We use the command pip or %pip to install Python packages.

Scientific computing libraries#

NumPy#

Numerical Python (NumPy) is a library that provides high-performance arrays and matrices. NumPy arrays allow mathematical operations to run very fast, which is important when working with medium and large datasets.

Example: linspace and other numerical calculations#

SciPy#

Scientific Python (SciPy) is a library that provides the most common algorithms and special functions used by scientists and engineers. See https://scipy-lectures.org/

SymPy#

Symbolic math expressions. See sympy_tutorial.pdf.

Matplotlib#

Powerful library for plotting points, lines, and other graphs.

Examples: how to create reusable functions for plotting probability distributions#

  • plot_pdf_and_cdf

  • calc_prob_and_plot

  • calc_prob_and_plot_tails

Data science libraries#

  • pandas library for tabular data (See pandas_tutorial.ipynb notebook)

  • statsmodels models for linear regression and other statistical analyses

  • seaborn high-level library for statistical plots (See seaborn_tutorial.ipynb notebook).

  • plotnine another high-level library for data visualization based on the grammar of graphics principles

  • scikit-learn tools and algorithms for machine learning

Bonus topics#

Writing standalone scripts#

For loop tricks#

Tricks:

  • enumerate: provides an index when iterating through a list.

  • zip: allows you to iterate over multiple lists in parallel.

Using enumerate to get for-loop with index#

Use enumerate(somelist) to iterate over tuples (index, value), where value is an element of the list somelist and index is the position of that value in the list.

list(enumerate(scores))
[(0, 61), (1, 79), (2, 98), (3, 72)]
# example usage
for idx, score in enumerate(scores):
    # this for loop has two variables: idx and score
    print("Processing score", score, "which is at index", idx, "in the list")
Processing score 61 which is at index 0 in the list
Processing score 79 which is at index 1 in the list
Processing score 98 which is at index 2 in the list
Processing score 72 which is at index 3 in the list

Using zip#

Use zip(list1,list2) to get an iterator over tuples (value1, value2), where value1 and value2 are elements taken from list1 and list2, in parallel, one at a time.

The name “zip” is a reference to the way a zipper joins together the teeth of its two sides when it is closing.

# example 1
list( zip([1,2,3], ["a","b","c"]) )
[(1, 'a'), (2, 'b'), (3, 'c')]
# example 2
list1 = [1, 2, 3]
list2 = [4, 5, 6]

list(zip(list1, list2))
[(1, 4), (2, 5), (3, 6)]
# compute the sum of the matching values in two lists
for value1, value2 in zip(list1, list2):
    print("The sum of", value1, "and", value2, "is", value1+value2)
The sum of 1 and 4 is 5
The sum of 2 and 5 is 7
The sum of 3 and 6 is 9

Functional programming helpers#

functools.partial for currying functions (e.g sample-generator callables)
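Here is a minimal sketch of how functools.partial works: it takes an existing function and returns a new function with some of the arguments already filled in. The function power and the names square and cube are made up for this illustration.

```python
from functools import partial

def power(base, exponent):
    """Hypothetical two-argument function used for illustration."""
    return base ** exponent

# Create new one-argument functions by fixing the value of `exponent`
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))    # 25
print(cube(2))      # 8
```

The new functions square and cube can be called like any other function, which is handy when some code expects a function of a single argument.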

List-like objects = iterables#

The term “iterable” is used in Python to refer to any list-like object that can be used in a for-loop.

Examples of iterables:

  • strings

  • dictionary keys, dictionary values, or dictionary (key,value) items

  • sets

  • range (lazy generator for lists of integers)

range(0, 4)
range(0, 4)
list(range(0, 4))
[0, 1, 2, 3]

Iterating over dictionaries#

profile = {"first_name":"Julie", "last_name":"Tremblay", "score":98}
list(profile.keys())
['first_name', 'last_name', 'score']
# ALT.
list(profile)
['first_name', 'last_name', 'score']
list(profile.values())
['Julie', 'Tremblay', 98]
list(profile.items())
[('first_name', 'Julie'), ('last_name', 'Tremblay'), ('score', 98)]

We’ll talk more about dictionaries later on.

Converting iterables to lists#

Under the hood, Python uses all kinds of list-like data structures called “iterables”. We don’t need to talk about these or understand how they work; all you need to know is that they behave like lists.

In the code examples above, we converted several fancy list-like data structures into ordinary lists, by wrapping them in a call to the function list, in order to display the results.

Let’s look at why we need to use list(iterable) when printing, instead of just iterable.

For example, the set of keys for a dictionary is a dict_keys iterable object:

profile.keys()
dict_keys(['first_name', 'last_name', 'score'])
type(profile.keys())
dict_keys

I know, right? What the hell is dict_keys? I certainly don’t want to have to explain that…

… so instead, you’ll see this in the code:

list(profile.keys())
['first_name', 'last_name', 'score']
type(list(profile.keys()))
list

Generic function arguments#

functions with *args and **kwargs arguments
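Here is a minimal sketch of what these look like. Inside the function, *args collects any number of positional arguments into a tuple, and **kwargs collects any number of keyword arguments into a dictionary. The function names total and make_profile are made up for this illustration.

```python
def total(*args):
    """Add up any number of positional arguments (args is a tuple)."""
    return sum(args)

def make_profile(**kwargs):
    """Build a dictionary from any keyword arguments (kwargs is a dict)."""
    return kwargs

print(total(1, 2, 3))    # 6
print(total(10, 20))     # 30

print(make_profile(first_name="Julie", score=98))
# {'first_name': 'Julie', 'score': 98}
```

The names args and kwargs are only a convention; it is the * and ** symbols that tell Python to collect the arguments.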

Final review#

Okay, we’ve reached the end of this tutorial, so let’s review the new Python concepts we introduced, in condensed form.

Python grammar and syntax review#

Learning Python is like learning a new language:

  • nouns: values of different types, usually referred to by name (named variables containing values)

  • verbs: functions and methods, including basic operators like +, -, etc.

  • grammar: rules about how to use nouns and verbs together

  • adverbs: keyword arguments (options) used to modify what a function does

These parts become easy to understand with time, since the new concepts correspond to English words, so you’ll get used to it all very quickly.

Python keywords#

Here is a list of keywords that make up the Python language:

False      class      finally    is         return
None       continue   for        lambda     try
True       def        from       nonlocal   while
and        del        global     not        with
as         elif       if         or         yield
assert     else       import     pass
break      except     in         raise
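The exact list of keywords depends on which version of Python you're running (for example, Python 3.7 and later also include the keywords async and await). If you want to check the list for your own installation, you can use the keyword module from the standard library, as in this short sketch:

```python
import keyword

# keyword.kwlist is a list of all the keywords, stored as strings
print(len(keyword.kwlist))

# Check if a particular word is a Python keyword
print(keyword.iskeyword("return"))    # True
print(keyword.iskeyword("scores"))    # False
```

Since keywords are reserved by the language, you can't use them as variable names.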

You’ve seen most of them, but not all of them. The ones you need to remember are:

  • if, elif, else used in conditional statements

  • def used to define a new function and the return statement that defines the output of the function

  • the boolean values True and False

  • None = special value that corresponds to no value

  • class for defining new object types

  • for used in for loops and list-comprehension statements

  • import ... and from ... import ... statements to import Python modules

  • and, or, and not for building boolean expressions

  • in to check if element is part of container

Python data types#

  • int: naturals and integers

  • float: rational and real numbers

  • list: list of objects [obj1, obj2, obj3, ...]

  • bool: True or False

  • str: text strings

  • dict: associative array between keys and values {key1:value1, key2:value2, key3:value3, ...}.

  • tuple: just like a list, but immutable (can’t modify it)

  • set: list-like object that doesn’t care about ordering and contains no duplicate elements

  • NoneType: Denotes the type of the None value, which describes the absence of a value (e.g. the output of a function that doesn’t return any value).

  • complex: complex numbers \(z=x+iy\)

Python built-in functions#

Essential functions:

  • print(arg1, arg2, ...): display str(arg1), str(arg2), etc.

  • type(obj): tells you what kind of object

  • len(obj): length of the object (only for: str, list, dict objs)

  • range(a,b): equivalent to the list [a,a+1,...,b-1]

  • help(obj): display info about the object, function, or method

Looking around, learning, and debugging Python code:

  • str(obj): display the string representation of the object.

  • repr(obj): display the Python representation of the object. Usually, you can copy-paste the output of repr(obj) into a Python shell to re-create the object.

  • help(obj): display info about the object, function, or method. The output includes the object’s docstring, obj.__doc__.

  • dir(obj): show complete list of attributes and methods of the object obj

  • globals(): display all variables in the Python global namespace

  • locals(): display local variables (within current scope, e.g. local variables inside a function body)

Built-in functions used with lists:

  • len(obj): length of the object (only for: str, list, dict objs)

  • sum(li): sum of the values in the list of numbers li

  • all(li): true if all values in the list li are true

  • any(li): true if any of the values in the list li are true

  • enumerate(li): convert the list of values li to a list of tuples (i, li[i]). Use it in a for loop as for i, item in enumerate(items): ....

  • zip(li1, li2): joint iteration over two lists

  • Low-level iterator methods: iter() and next() (out of scope for this tutorial. Just know that every time I said list-like, I meant “any object that implements the iterator and iterable protocols”).
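Here is a short sketch of the list functions above, applied to two made-up lists:

```python
prices = [3, 5, 2]
names = ["apple", "pear", "plum"]

total = sum(prices)                          # 3 + 5 + 2
all_positive = all(p > 0 for p in prices)    # every price is positive
any_big = any(p > 4 for p in prices)         # at least one price above 4

indexed = list(enumerate(names))             # pairs (index, item)
pairs = list(zip(names, prices))             # pairs (name, price)

for i, name in enumerate(names):
    print(i, name)

for name, price in zip(names, prices):
    print(name, price)

# Under the hood, for loops use iter() and next()
it = iter(names)
first = next(it)
print(total, all_positive, any_big, first)
```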

Input-output (I/O):

  • input: prompt the user for input, returns the value the user entered as a string.

  • print(arg1, arg2, ...): display str(arg1), str(arg2), etc.

  • open(filepath,mode): open the file at filepath for mode-operations. Use mode="r" for reading text from a file, and mode="w" for writing text to a file.
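A minimal sketch of writing text to a file and reading it back. The file name notes.txt is made up, and the file is placed in a temporary folder so the example doesn’t clutter your working directory.

```python
import os
import tempfile

# Build a path to a hypothetical file inside a fresh temporary folder
filepath = os.path.join(tempfile.mkdtemp(), "notes.txt")

# mode="w" opens the file for writing text
with open(filepath, mode="w") as f:
    f.write("first line\nsecond line\n")

# mode="r" opens the file for reading text
with open(filepath, mode="r") as f:
    contents = f.read()

print(contents)

# name = input("What is your name? ")   # would wait for the user to type a reply
```

The with statement closes the file automatically at the end of the indented block.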

Advanced stuff:

  • Functional programming and dynamic code execution: map(), eval(), exec()

  • Meta-programming: hasattr(), getattr(), setattr()

  • Object oriented programming: isinstance(), issubclass(), super()
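If you’re curious, here is a small taste of these advanced functions. The function square and the classes Animal and Dog are hypothetical examples invented for this sketch.

```python
def square(x):
    return x ** 2

# map applies a function to every element of a list
squares = list(map(square, [1, 2, 3]))

# eval computes the value of an expression given as a string
result = eval("2 + 3*4")

# exec runs statements given as a string (here inside a separate namespace)
namespace = {}
exec("a = 10", namespace)

class Animal:
    def __init__(self, name):
        self.name = name

class Dog(Animal):
    def __init__(self, name):
        super().__init__(name)    # call the __init__ method of the parent class
        self.sound = "woof"

rex = Dog("Rex")
setattr(rex, "age", 3)            # same as rex.age = 3
print(hasattr(rex, "sound"), getattr(rex, "name"), rex.age)
print(isinstance(rex, Animal), issubclass(Dog, Animal))
print(squares, result, namespace["a"])
```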

Python punctuation#

The most confusing part of learning Python is the use of non-word punctuation characters, which have very specific meaning that has nothing to do with English punctuation marks. Let’s review how the symbols =([{*"'#,.: are used in various Python expressions. The meaning of these symbols changes depending on the context.

Here is a complete, no-holds-barred list of the ways punctuation marks are used in the Python programming language. Like literally each of them. This list will help us close the review, since it covers the Python syntax that was used in all the sections of this tutorial.

  • Equal sign =

    • assignment

    • specify default keyword argument (in function definition)

    • pass values for keyword arguments (in function call)

  • Round brackets () are used for:

    • calling functions

    • defining tuples: (1, 2, 3)

    • enforcing operation precedence: result = (x + y) * z

    • defining functions (in combination with the def keyword, e.g. def f(x): ...)

    • defining classes (e.g. class Child(Parent): ...)

    • creating objects (e.g. obj = MyClass())

  • Curly-brackets (accolades) {}

    • define dict literals: mydict = {"k1":"v1", "k2":"v2"}

    • define sets: {1,2,3}

  • Square brackets [] are used for:

    • defining lists: mylist = [1, 2, 3]

    • list indexing: ages[3] = 29

    • dict access by key: mydict["k1"] (calls the method __getitem__ or __setitem__ behind the scenes)

    • list slice: mylist[0:2] (first two items in mylist)

  • Quotes " and '

    • define string literals

    • note raw string variant r"..." also exists

  • Triple quotes """ and '''

    • long string literals that span multiple lines (e.g. entire paragraphs)

  • Hash symbol #

    • comment (Python ignores text after the # symbol)

  • Colon :

    • syntax for the beginning of indented block.
      The colon is used at the end of statements like if, elif, else, for, etc.

    • key: value separator in dict literals

    • slice of indices 0:2 (first two items in a list)

  • Period .

    • decimal separator for floating point literals

    • access object attributes

    • access object methods

  • Comma ,

    • element separator in lists and tuples

    • separator between key:value pairs when creating a dict

    • separate function arguments in function definitions

    • separate function arguments when calling functions

  • Asterisk *

    • multiplication operator

    • (advanced) unpack elements of a list

  • Double asterisk **

    • exponent operator

    • (advanced) unpack elements of a dict

  • Semicolon ; (rarely used)

    • (advanced) put multiple Python commands on single line

Don’t worry if you didn’t understand all the use cases listed. I’ve tried to make the list complete, so I’ve included some more advanced topics, labeled (advanced), which you’ll learn about over time when you use Python.
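To make the list more concrete, here is a short sketch that uses most of these symbols in a few lines. The function power and the variable names are made up for illustration.

```python
def power(base, exponent=2):       # = sets a default value; : starts the indented block
    return base ** exponent        # ** is the exponent operator

result = (1 + 2) * power(3)        # () enforce precedence and call the function
cube = power(2, exponent=3)        # = passes a keyword argument

mydict = {"k1": "v1", "k2": "v2"}  # {} with key: value pairs separated by commas
myset = {1, 2, 3}                  # {} with plain values defines a set
mylist = [1, 2, 3]                 # [] defines a list
first_two = mylist[0:2]            # [] with a slice 0:2 gives the first two items
value = mydict["k1"]               # [] for dict access by key
shout = "hello".upper()            # . accesses a method of the string object

args = [2, 5]
kwargs = {"base": 2, "exponent": 5}
from_list = power(*args)           # * unpacks a list into positional arguments
from_dict = power(**kwargs)        # ** unpacks a dict into keyword arguments

x = 1; y = 2                       # ; puts two commands on a single line
print(result, cube, first_two, value, shout, from_list, from_dict)
```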

Discussion#

Let’s go over some of the things we skipped in the tutorial, because they were not essential for getting started. Now that you know a little bit about Python, it’s worth mentioning some of these details, since it’s useful context to see how this “Python calculator” business works. I also want to tell you about some of the cool Python applications you can look forward to if you choose to develop your Python skills further.

Applications#

Python is not just a calculator. Python can also be used for non-interactive programs and services. Python is a general-purpose programming language so it enables a lot of applications. The list below talks about some areas where Python programming is popular.

  • command line scripts: you can write command line scripts in Python, then run them on the command line (terminal on UNIX or cmd.exe on Windows). For example, you can download any video from YouTube by running the command youtube-dl <youtube_url>. If all you want is the audio, you can add some command-line options and run youtube-dl --extract-audio --audio-format mp3 <youtube_url> to extract the audio track from the YouTube video and save it as an mp3. The author uses this type of command daily to make local copies of songs to listen to them offline.

  • graphical user interface (GUI) programs: many desktop applications are written in Python. An example of a graphical, point-and-click application written in Python is Calibre, which is a powerful eBook manager, reader, and converter that supports all imaginable eBook formats.

  • web applications: the Django and Flask frameworks are often used to build web applications. Many of the websites you access every day have a server component written in Python.

  • machine learning systems: create task-specific functions by using probabilistic models instead of code. Machine learning models undergo a training stage in which the model parameters are “learned” from the training data examples, after which the model can be queried to make predictions.

I mention these examples so you’ll know the other possibilities enabled by Python, beyond the basic “use Python interactively like a calculator” code examples that we saw in this tutorial.

There is a lot of other useful stuff. We’re at the end of this tutorial, but just the beginning of your journey to discover all the interesting things you can do with Python.

Python programming#

Coding (a.k.a. programming, software engineering, or software development) is a broad topic, which is out of scope for this short tutorial. If you’re interested in learning more about coding, see the article What is code? by Paul Ford. Think mobile apps, web apps, APIs, algorithms, CPUs, GPUs, TPUs, SysOps, etc. There is a lot to learn about the applications enabled by basic coding skills, which are almost like reading and writing skills.

Learning programming usually takes several years, but you don’t need to become a professional coder to start using Python for simple tasks, the same way you don’t need to become a professional author to use writing for everyday tasks. If you reached this far in the tutorial, you know enough about basic Python to continue your journey.

In particular, you can read the other two tutorials that appear in the No Bullshit Guide to Statistics:

Learning objectives#

In this tutorial you’ll learn the following specific Python skills, which are required for probability and statistics calculations.

  • know how to define a function (e.g. fH for Example 3 in Section 2.1)

  • understand list comprehension (e.g. [fH(h) for h in range(0,5)] for Example 3 in Section 2.1)

  • know built-in functions:

    • sum

    • len

    • range

  • know the general pattern for plotting the graph of function f using numpy arrays:

    1. xs = np.linspace  OR  np.arange  OR  list

    2. ys = f(xs)

    3. Call plt.stem(ys) or sns.lineplot(x=xs, y=ys)

  • (optional) understand when we need to use vectorize(f)

  • Robyn recommends the following concepts to be covered:

    • Quotes around text (strings)

    • Assigning variables

    • Interpreting error messages

    • help(method)

    • Syntax for writing and running functions

    • definitions/explanations for the following (all used in the DATA chapter): ‘None’, ‘0-based indexing’, ‘attributes’, ‘methods’, ‘object’, ‘instance’, ‘class,’ ‘module,’ ‘accessing columns as attributes’

  • Specific requirements from PROB chapter:

    • range and summation

  • Specific requirements from STATS chapter:

  • Specific requirements from LINEAR MODELS chapter:
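The first few objectives can be sketched in a few lines of code. The definition of fH below is an assumption made for illustration (the probability of getting h heads in four tosses of a fair coin); the actual fH in Example 3 of Section 2.1 may be defined differently.

```python
from math import comb

def fH(h):
    # ASSUMPTION: probability of h heads in 4 fair coin tosses (hypothetical stand-in)
    return comb(4, h) / 2**4

# List comprehension: evaluate fH for h = 0, 1, 2, 3, 4
probs = [fH(h) for h in range(0, 5)]
print(probs)

# Built-in functions sum, len, and range
total = sum(probs)          # the probabilities add up to one
count = len(probs)          # number of possible outcomes
mean = sum([h * fH(h) for h in range(0, 5)])   # expected number of heads
print(total, count, mean)
```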