Appendix C — Python tutorial#
Click this binder button to run this notebook interactively (highly recommended: a hands-on experience is much better for learning).
Python as a fancy calculator#
In this tutorial I’ll show you the basics of the Python programming language. Don’t freak out, this is not a big deal. You’re not going to become a programmer or anything, you’re just learning to use Python as a calculator.
Calculator analogy#
Python commands are similar to the commands you give to a calculator. The same way a calculator has different buttons for the various arithmetic operations, the Python language has a number of commands you can “run” or “execute.” Python is more powerful than a calculator because you can input multiple lines of commands at once, and write more complicated expressions.
The same way knowing how to use a calculator is very helpful for doing arithmetic, learning Python is very helpful for doing arithmetic operations and also for doing more complicated multi-step calculations.
Why learn python?#
Staying with the python-is-a-calculator analogy, I’d like to give you an idea of some of the things you can do when you use Python as a calculator.
Python is really good as a “basic” calculator, which allows you to calculate any arithmetic expression involving math operations like +, -, *, /, pow, and more generally any math function.
Python programs allow you to represent procedures with multiple steps.
Python is also very useful as a graphical calculator: you can plot functions and visualize data distributions.
Python is an extensible, programmable calculator, which means you can define custom functions that are useful for any given domain.
Python provides lots of extensions (modules) for scientific computations. You can do advanced linear algebra using the numpy module, carry out optimization using scipy, and use sympy for symbolic math calculations. Very powerful stuff.
All in all, learning Python will give you all kinds of options for doing math and science.
Python for statistics#
In this tutorial,
you’ll learn all the Python functions needed for statistics.
Learning a few basic Python constructs like the for loop
will enable you to simulate probability distributions and experimentally verify how statistics procedures work.
This is a really big deal!
It’s good to know the statistical formulas and recipes,
but it’s even better when you can run your own simulations and check when the formulas work and when they fail.
Once you learn the basics of Python syntax,
you’ll have access to the best-in-class tools for
data management (Pandas, see pandas_tutorial.ipynb),
data visualization (Seaborn, see seaborn_tutorial.ipynb),
statistics (scipy and statsmodels),
and machine learning (scikit-learn, pytorch, huggingface, etc.).
Don’t worry, there won’t be any advanced math: just sums, products, exponents, logs, and square roots.
Nothing fancy, I promise.
If you’ve ever created a formula in a spreadsheet,
then you’re familiar with all the operations we’ll see.
Where a spreadsheet formula uses SUM(, in Python we write sum(.
You see, it’s nothing fancy.
Yes, there will be a lot of code (paragraphs of Python commands) in this tutorial, but you can totally handle this. If you ever start to freak out and think “OMG this is too complicated!” remember that Python is just a fancy calculator.
Overview of the material in this tutorial#
We’ll cover all essential topics required to get to know Python, including:
Getting started: where we’ll install the JupyterLab Desktop coding environment.
Expressions and variables: basic building blocks of any program.
Getting comfortable with Python: looking around and getting help.
Lists and for loops: repeating steps and procedures.
Functions: reusable code blocks.
Other data structures: sets, tuples, etc.
Boolean variables and conditional statements: conditional code execution.
Dictionaries: a versatile way to store data.
Objects and classes: creating custom objects.
Python grammar and syntax: review of all the syntax.
Python libraries and modules: learn why people say Python comes with “batteries included.”
After you’re done with this tutorial, you’ll be ready to read the other two:
Pandas (see pandas_tutorial.ipynb)
Seaborn (see seaborn_tutorial.ipynb)
It’s important for you to try solving the exercises that you’ll encounter as you read along. The exercises are a great way to practice what you’ve been learning.
Getting started#
Installing JupyterLab Desktop#
JupyterLab is a platform for interactive computing that makes it easy to run Python code. You can run JupyterLab on your local computer or on a remote server (mybinder or colab). JupyterLab is based on a notebook interface that allows you to mix text, code, and graphics, which makes it the perfect tool for learning Python.
JupyterLab Desktop is a convenient all-in-one application that you can install on your computer to take advantage of everything Python has to offer for data analysis and statistics. You can download JupyterLab Desktop from the following page on GitHub: jupyterlab/jupyterlab-desktop
Choose the download link for your operating system.
After the installation completes and you launch the JupyterLab application, you should see a launcher window similar to the one shown below:
The JupyterLab interface with the File browser shown in the left sidebar. The box labelled (1) indicates the current working directory. The box labelled (2) shows the list of files in the current directory. The Launcher tab allows us to create new notebooks by clicking the button labelled (3).
Use the Python 3 (ipykernel) button in the Launcher tab to create a new Python 3 notebook,
as shown in the above screenshot.
A notebook consists of a sequence of cells,
similar to how a text document consists of a series of paragraphs.
The notebook you created currently consists of a single empty code cell,
which is ready to accept Python commands.
Type the expression 2 + 3 into the cell,
then press SHIFT+ENTER to run the code.
You should see the result of the calculation displayed on a new line immediately below the input.
The cursor will automatically move to a new input cell,
as shown in the screenshot below.
A notebook with a code cell and its output labelled (1).
The cursor is currently in the cell labelled (2).
The label (3) tells us the notebook filename is Untitled.ipynb.
The buttons labelled (4) control the notebook execution:
run, stop, restart, and run all.
The notebook interface offers many useful features, but for now, I just want you to think of notebooks as an easy way to run Python code. Notebooks allow you to try Python commands interactively, which is the best way to learn! Try some Python commands to get a feeling for how notebooks work. Remember you can click the play button in the toolbar (the first button in the box labelled (4) in the above screenshot) or press the keyboard shortcut SHIFT+ENTER to run the code.
I encourage you to play around with the notebook interface, in particular the buttons labelled (3) and (4). Try clicking on the notebook execution control buttons (4) to see what they do. The play button is equivalent to pressing SHIFT + ENTER. The stop button can be used to interrupt a computation that takes too long. The run-all button is useful when you want to re-run all the cells from scratch. This is my favourite button! I use it often to recompute the entire sequence of Python commands in the current notebook, from top to bottom.
Alternatives: If you don’t want to install anything on your computer yet, you have two other options for playing with this notebook:
Run a JupyterLab instance in the cloud via the mybinder links. Click here to launch an interactive notebook of this tutorial.
You can also enable the “Live Code” feature while reading this tutorial online at noBSstats.com. Use the rocket button in the top right, and choose the Live Code option to make all the cells in this notebook interactive.
Code cells contain Python commands#
The Python command prompt is where you enter Python commands. Each of the code cells in this notebook is a command prompt that allows you to enter Python commands and “run” them by pressing SHIFT + ENTER, or by clicking the play button in the toolbar.
For example,
you can make Python compute the sum of two numbers by entering 2+3
in a code cell,
then pressing SHIFT + ENTER.
2 + 3
5
In the above code cell, the input is the expression 2 + 3
(the sum of two integers),
and the output 5
is printed below.
Let’s now compute a more complicated math expression \((1+4)^2 - 3\), which is equivalent to the following Python expression:
(1+4)**2 - 3
22
The Python syntax for basic math operations closely follows the notation we use in math:
addition is +, subtraction is -, multiplication is *, and division is /.
The syntax for writing exponents is a little unusual, using two asterisks: \(a^n\) is written a**n.
When you run a code cell, you’re telling the computer to evaluate the Python instructions it contains. Python then prints the result of the expression in the output cell.
Running a code cell is similar to using the EQUALS button on the calculator: whatever math expression you entered, the calculator will compute its value and display it as the output. The process is identical when you execute some Python code, but you’re allowed to input multiple lines of commands at once. The computer will execute the lines of code, one by one, in the order it sees them.
The result of the final calculation in the cell gets printed as the output of that cell. You can easily change the contents of any input cell, and re-run it to observe a new output. This interactivity makes it easy to explore the code examples.
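As a small illustration of the "multiple lines, last result displayed" behaviour, here is a sketch of a cell that contains two expressions. It assumes you run it in a notebook cell; if you ran the same lines as a plain Python script, nothing would be displayed.

```python
2 + 3           # evaluated first, but its value is not displayed
(1+4)**2 - 3    # evaluated last: in a notebook, only this value (22) is displayed
```

Try swapping the two lines and re-running the cell to see the displayed output change.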
Your turn to try this!#
Try typing a Python expression in this cell, then run it by pressing SHIFT + ENTER or using the Play button.
Expressions and variables#
Python programming involves computing Python expressions and manipulating the data stored in variables. It’s time to learn about Python expressions and variables, starting with expressions.
Python expressions#
Let’s start by showing a simple example of a Python expression, which computes the sum of two integers.
2+3
5
In a notebook, the value of the last expression in a code cell is displayed automatically.
To display the value of a variable, we would normally need to use the print function (to be discussed later),
but a notebook does this for us for the last line of each code cell.
Here are two examples that involve list expressions and function calls (to be discussed later).
[1, 2, 3]
[1, 2, 3]
len([1, 2, 3, 6])
4
sum([1, 2, 3])
6
If you’ve ever used spreadsheet software before,
you can think of the Python function sum(...)
as the equivalent of the spreadsheet function SUM(...)
.
Variables#
Similar to variables in math, a variable in Python is a convenient name we use to refer to any value: a constant, the input to a function x
, the output of a function y
, or any other intermediate value.
We use the assignment operator =
to store values into variables.
The assignment operator#
In the above code examples, we computed the values of various expressions, but we didn’t do anything with the result. The more common pattern in Python is to store the result of an expression in a variable.
To store the result of an expression in a variable,
we use the assignment operator = as follows, from left to right:
we start by writing the name of the variable,
then we add the symbol = (which stands for “assign to”),
finally, we write an expression for the value we want to store in the variable.
For example, here is the code that computes the value of the expression 2+3
and stores the result in the variable x
.
x = 2+3
This statement didn’t print any output: it just assigned the value 5 to the variable x.
Note the meaning of =
is not the same as in math:
we’re not writing an equation,
but assigning the contents of x
.
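One way to see that = is not an equation: the statement x = x + 1 has no solution if you read it as math, yet it is perfectly valid Python. The short sketch below reuses the variable x from the example above; it is only an illustration of how assignment works.

```python
x = 2 + 3    # x now contains 5
x = x + 1    # compute x + 1 using the current value of x, then store the result back in x
print(x)     # prints 6
```

Python first evaluates the expression on the right-hand side, then stores the result under the name on the left-hand side.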
Here are some other, equivalent ways to describe this assignment statement:
Store the result of 2+3 into the variable x.
Put 5 into x.
Save the value 5 into the memory location named x.
Define x to be equal to 5.
Set x to 5.
Record the result of 2+3 under the name x.
Let x be equal to 5.
To display the contents of the variable x
,
we can specify its name on the last line of a code cell.
x
5
We often combine the “assign 2+3
to x
” and the “display x
” commands into a single code cell,
as shown below.
x = 2+3
x
5
Exercise 1: Imitate the above statements, to create another variable y
that contains the value of the expression \((1+4)^2 - 3\), then display the contents of y
.
# put your answer in this code cell
#@titlesolution
y = (1+4)**2 - 3
y
22
So to summarize, the syntax of an assignment statement is as follows:
<place> = <some expression>
The assignment operator (=
) is used to store the value of the expression <some expression>
into the memory location <place>
, which is usually a variable name, but later on we’ll learn how to store values inside containers like lists and dictionaries.
Multi-line expressions (Python code blocks)#
Let’s now look at some longer code examples that show multiple steps of calculations, and intermediate values.
Numerical expressions#
Example: Number of seconds in one week#
Let’s say we need to find how many seconds there are in one week. We can do this using a multi-step Python calculation, using the fact that there are \(60\) seconds in one minute, \(60\) minutes in one hour, \(24\) hours in one day, and \(7\) days in one week.
secs_in_1min = 60
secs_in_1hour = secs_in_1min * 60
secs_in_1day = secs_in_1hour * 24
secs_in_1week = secs_in_1day * 7
secs_in_1week
604800
Note we use the underscore _
as part of the variable name, which is a common pattern in Python code. The variable name some_name
is easier to read than somename
.
Exercise 2: Johnny currently weighs 107 kg, and wants to know his weight in pounds (lbs). One kilogram is equivalent to approximately 2.2 lbs.
Write the Python expression that computes Johnny’s weight in pounds.
weight_in_kg = 107
weight_in_lb = ... # replace ... with your answer
#@titlesolution
weight_in_kg = 107
weight_in_lb = 2.2 * weight_in_kg
weight_in_lb
235.4
Exercise 3: You’re buying something that costs 57.0 dollars, and the local government imposes a 10% tax on your purchase. Calculate the total you’ll have to pay, which includes the cost and 10% taxes.
cost = 57.00
taxes = ... # replace ... with your answer
total = ... # replace ... with your answer
#@titlesolution
cost = 57.00
taxes = 0.10 * cost
total = cost + taxes
total
62.7
Exercise 4:
The formula for converting a temperature from Celsius to Fahrenheit
is given by \(F = \tfrac{9}{5} \cdot C + 32\).
Given the variable C
which specifies the current temperature in Celsius,
write the expression that calculates the current temperature in Fahrenheit
and store it in the variable F
.
C = 20
# F = ... # (un-comment and replace ... with the correct expression)
Test: when C = 100
, your answer F
should be 212
.
#@titlesolution
C = 20
F = (C * 9/5) + 32
F
68.0
Variable types#
Most common variables types#
There are multiple types of variables in Python:
int - integers, ex: 34, 65, 78, -4, etc. (roughly equivalent to \(\mathbb{Z}\))
float - ex: 4.6, 78.5, 1e-3 (full name is “floating point number”; similar to \(\mathbb{R}\) but only with finite precision)
bool - a Boolean truth value with only two choices: True or False.
str - a string of text content like "Hello", "Hello everyone". Text strings are denoted using either double quotes "Hi" or single quotes 'Hi'.
list - a sequence of values like [61, 79, 98, 72]. The beginning and the end of the list are denoted by the brackets [ and ], and its elements are separated by commas.
Other types include dictionaries, tuples, sets, functions, objects, etc. These are other useful Python building blocks which we’ll talk about in later sections.
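To get a feel for what “finite precision” means for float values, here is a small sketch. The digits shown assume the standard 64-bit floating point numbers that Python uses on essentially all computers; the variable name a is just for illustration.

```python
a = 0.1 + 0.2
print(a)             # prints 0.30000000000000004, not exactly 0.3
print(a == 0.3)      # prints False
print(round(a, 2))   # prints 0.3 (rounding hides the tiny error)
```

These tiny errors are normal and rarely matter in practice, but they explain why comparing two float values for exact equality can give surprising results.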
Let’s look at some examples with variables of different types:
an integer (int), a floating point number (float), a Boolean value (bool), a string (str), and a list.
score = 98
average = 77.5
above_the_average = True
message = "Hello everyone"
scores = [61, 79, 98, 72]
Running the above code cell doesn’t print anything,
because we have only defined the variables score, average, above_the_average, message, and scores,
but not displayed any of them.
To display the value of any of these variables,
we can type its name in a code cell.
Let’s see the contents of the score variable:
score
98
The function type
tells you the type of any variable, meaning what kind of number or object it is.
type(score)
int
In this case, the value of the variable score is 98 and it is of type int (integer).
## ALT. display both value and type on the same line (as a tuple)
# score, type(score)
Let’s now look at the value and type of the variable average
:
average
77.5
type(average)
float
Exercise: try displaying the contents of the other variables, and their type
#@titlesolution
above_the_average
type(above_the_average)
message
type(message)
scores
type(scores)
list
Getting comfortable with Python#
Python is a “civilized” language, which means it provides lots of help tools to make learning the language easy for beginners. We’ll now learn about some of these tools, including “doc strings” (help menus) and introspection tools for looking at what attributes and methods are available to use.
This combination of tools allows programmers to answer common questions about Python objects and functions without leaving the JupyterLab environment. Basically, in Python all the necessary info is accessible directly in the coding environment. For example, at the end of this section you’ll be able to answer the following questions on your own:
How many and what type of arguments does the function print expect?
What kind of optional, keyword arguments does the function print accept?
What attributes and methods does the Python object obj have?
What variables and functions are defined in the current namespace?
Programmers spend a large part of their time looking at help info and trying to understand the variables, functions, objects, and methods they are working with, so it’s important for you to learn these meta-skills.
Showing the help info#
Every Python object has a “doc string” associated with it, which provides helpful information about the object.
There are three equivalent ways to view the docstring of any Python object obj
(value, variable, function, module, etc.):
help(obj): prints the docstring of the Python object obj
obj?: shows the docstring in the notebook (a Jupyter feature)
SHIFT + TAB: shows the docstring while the cursor is on top of a Python variable or function
There are also other commands for getting more detailed information,
such as obj??, %psource obj, and %pdef obj,
but you won’t need these for now.
Example: learning about the print
function#
Let’s say you’re interested to know the options available for the function print
,
which we use to print Python expressions.
# put cursor in the middle of function and press SHIFT+TAB
print
<function print>
You know this function accepts a variable and prints it, but what other keyword arguments does it take?
Use the help() function on print:
help(print)
Help on built-in function print in module builtins:
print(...)
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file: a file-like object (stream); defaults to the current sys.stdout.
sep: string inserted between values, default a space.
end: string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Reading the doc string of the print function,
we see ... in the list of inputs accepted,
which means print accepts multiple arguments.
Here is an example that prints a string, an integer, a floating point number, and a list on the same line.
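A minimal version of such an example might look as follows. The variable values are borrowed from earlier in this tutorial and are only illustrative.

```python
message = "Hello everyone"    # a string
score = 98                    # an integer
average = 77.5                # a floating point number
scores = [61, 79, 98, 72]     # a list

# print accepts several arguments and separates them with spaces by default
print(message, score, average, scores)
# prints: Hello everyone 98 77.5 [61, 79, 98, 72]
```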
# print?
# print(print.__doc__)
Application: changing the separator when printing multiple values#
We can choose a different separator between arguments of the print
function
by specifying the value for the keyword argument sep
.
x = 3
y = 2.3
print(x, y)
3 2.3
print(x, y, sep=" --- ")
3 --- 2.3
Exercises#
Exercise 1: print the doc-string of the function len
.
#@titlesolution
help(len)
Help on built-in function len in module builtins:
len(obj, /)
Return the number of items in a container.
Exercise 2: print the doc-string of the function sum
.
#@titlesolution
help(sum)
Help on built-in function sum in module builtins:
sum(iterable, /, start=0)
Return the sum of a 'start' value (default: 0) plus an iterable of numbers
When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
Exercise 3:
print the doc strings of other functions we’ve seen so far: int
, float
, type
, etc.
#@titlesolution
# help(int)
# help(float)
# help(type)
(sidenote) Python comments#
You can write comments in Python code using the character #.
Comments can be very useful to provide additional information that explains what the code is trying to do.
# this is a comment
You can also add longer, multi-line comments using triple-quoted text.
"""
This is a longer comment,
which is written on two lines.
"""
'\nThis is a longer comment,\nwhich is written on two lines.\n'
The docstrings we talked about earlier
are exactly this kind of multi-line string included in the source code of the functions len, print, etc.
Exercise 4: replace the ... in the code block with comments that
explain the calculation “adding 10% tax to a purchase that costs $57”
that is being computed.
cost = 57.00 # ...
taxes = 0.10 * cost # ...
total = cost + taxes # ...
total # ...
62.7
#@titlesolution
cost = 57.00 # price before taxes
taxes = 0.10 * cost # 10% taxes = 0.1 times price
total = cost + taxes # add price + taxes and store the result in total
total # print the total
62.7
Inspecting Python objects#
Suppose you’re given the Python object obj
and you want to know what it is,
and learn what you can do with it.
Displaying the object#
There are several built-in functions that allow you to display information about any object obj.
type(obj): tells you what type of object it is
print(obj): converts the object to str and prints it
repr(obj): similar to print, but returns the complete string representation (including quotes). The output of repr(obj) usually contains all the information needed to reconstruct the object obj.
We’ve already used both type
and print
, so there is nothing new here.
I just wanted to remind you that you can always use these functions as a first line of inspection.
obj = 3
type(obj)
int
print(obj)
3
repr(obj)
'3'
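The difference between print and repr is easier to see with a string than with a number. Here is a small sketch; the variable name greeting is made up for this illustration.

```python
greeting = "Hi"
print(greeting)         # prints Hi (no quotes)
print(repr(greeting))   # prints 'Hi' (with quotes, as you would type it in code)
```

The quotes in the second output tell you the object is a string, which is why notebooks use the repr form when they display the last value in a cell.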
Auto-complete object attributes and methods#
The JupyterLab notebook environment provides a very useful “autocomplete” functionality that helps us look around at the attributes and methods of any object.
TAB button: type obj. then press the TAB button.
dir(obj): shows the “directory” of all the attributes and methods of the object obj
message = "Hello everyone"
# message. <TAB>
# message.upper()
# message.lower()
# message.split()
# message.replace("everyone", "world")
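If you are not in a notebook, or prefer not to use TAB, the sketch below shows roughly the same exploration done with dir, followed by calls to a few of the string methods listed in the comments above. The variable name names is an assumption made for this example.

```python
message = "Hello everyone"
names = dir(message)       # a list of attribute and method names, stored as strings
print("upper" in names)    # prints True, so message has a method called upper

print(message.upper())                        # prints HELLO EVERYONE
print(message.split())                        # prints ['Hello', 'everyone']
print(message.replace("everyone", "world"))   # prints Hello world
```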
(bonus) See what’s in the global namespace#
In a Jupyter notebook,
you can run the command %whos
to print all the variables and functions that are defined in the current namespace.
# %whos
Python error messages#
Sometimes the command you evaluate will cause an error, and Python will print an error message describing the problem it encountered. You need to be mentally prepared for these errors, since they can be very discouraging to see. The computer doesn’t like what you entered. The output is a big red box, that tells you your input is REJECTED!
Examples of errors include SyntaxError
, ValueError
, etc.
The error messages look scary,
but really they are there to help you: if you read what the error message is telling you,
you’ll know exactly what you need to fix in your input.
The error message literally describes the problem!
Let’s look at an example expression that Python cannot compute, so it raises an exception. The code cell below shows the error that occurs when you ask Python to compute a math expression that doesn’t make sense:
3/0
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[45], line 1
----> 1 3/0
ZeroDivisionError: division by zero
You’ll see these threatening-looking messages on a red background any time Python encounters an error when trying to run the commands you specified.
This is nothing to be alarmed by.
It usually means you made a typo (symbol not defined error),
forgot a syntax element (e.g. (, ,, [, :, etc.),
or tried to compute something impossible,
like dividing a number by zero as in the above example.
You get an error since you’re trying to compute an expression that contains a division by zero. Python tells you a ZeroDivisionError: division by zero has occurred.
Indeed, it’s not possible to divide a number by zero.
The way to read these red messages is to focus on the error name and the explanation message that get printed on the last line. The error message tells you what you need to fix. The solution will be obvious for typos and syntax errors, but for more complicated situations, you may need to search online to find what the problem is.
In our example, we can fix the 3/0
by replacing it with 3/1
.
3/1
3.0
Here is a list of the most common error messages you are likely to encounter:
SyntaxError: you typed in something wrong (usually a missing ", ], or some other punctuation).
NameError: Raised when a variable is not found in local or global scope.
KeyError: Raised when a key is not found in a dictionary.
TypeError: Raised when a function or operation is applied to an object of incorrect type.
ValueError: Raised when a function gets an argument of correct type but improper value.
ZeroDivisionError: Raised when the second operand of division or modulo operation is zero.
Later on we’ll also run into ImportError and AttributeError.
There are many more error types. You can see a complete list by typing in the command *Error?
.
Exercise: let’s break Python! 🔨🔨🔨🔨🔨🔨
Try typing in Python commands that would cause one of the above errors.
#@titlesolution
# SyntaxError
# [1
# NameError
# zz + 3
# TypeError
# sum(3)
# ValueError
# int("zz")
# ZeroDivisionError
# 5/0
# KeyError
# d = {}
# d["zz"]
# ImportError
# from math import zz
# AttributeError
# "hello".zz
Python documentation#
The official Python documentation website https://docs.python.org provides loads and loads of excellent information for learning Python.
Here are some useful links:
main page https://docs.python.org/3/
example (built in functions and types from Lessons 1 through 3)
print https://docs.python.org/3/library/functions.html#print
float https://docs.python.org/3/library/functions.html#float
str https://docs.python.org/3/library/functions.html#func-str
list https://docs.python.org/3/library/functions.html#func-list
dict https://docs.python.org/3/library/functions.html#func-dict
I encourage you to browse the site to familiarize yourself with the information that is available.
Usually when you do a Google search, the official Python docs will show up on the first page of results. Make sure to prefer reading the official documentation instead of other “learn Python” websites (currently the first few search results that show up point to SEO-optimized, spammy, advertisement-filled websites which are inferior to the official documentation). Always prefer the official docs (even if they appear lower in the list of results on the page). Stack Overflow discussions are also a good place to find answers to common Python questions.
Lists and for loops#
Lists#
To create a list:
start with an opening square bracket [,
then put the elements of the list separated by commas ,
finally close the square bracket ]
For example, scores is a list of ints that we’ve seen in several examples before.
scores = [61, 79, 98, 72]
scores
[61, 79, 98, 72]
A list container has a length,
which you can obtain by calling the len
function.
len(scores)
4
You can check if a list contains a certain element using the in operator:
98 in scores
True
List access syntax#
Elements of a list are accessed using the square brackets [<index>],
where <index> is the 0-based index of the element we want to access:
The first element has index 0.
The second element has index 1.
The last element has index equal to the length of the list minus one.
# first element in the list scores
scores[0]
61
# second element in the list scores
scores[1]
79
# last element in the list scores
scores[3]
72
Another way to access the last element in the list is to use the negative index -1
:
scores[-1]
72
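The two ways of reaching the last element can be compared side by side. The sketch below redefines scores so it can run on its own, and uses n as a made-up name for the length of the list.

```python
scores = [61, 79, 98, 72]
n = len(scores)         # the list has 4 elements, with indices 0, 1, 2, 3

print(scores[n - 1])    # prints 72: last element, using "length minus one"
print(scores[-1])       # prints 72: same element, using a negative index
print(scores[-2])       # prints 98: negative indices keep counting from the end
```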
List slicing#
We can access a subset of the list using the “slice” syntax a:b
,
which corresponds to the range of indices a
, a+1
, …, b-1
.
For example,
if we want to extract the first three elements in the list scores
,
we can use the slice 0:3
,
which is equivalent to requesting the range of indices 0
, 1
, and 2
.
scores[0:3]
[61, 79, 98]
Note the result of selecting a slice from a list is another list (a list that contains the subset of the original list that consists of elements whose index is included in the slice).
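Here is a short sketch that checks the slice rule “a:b means indices a, a+1, …, b-1” on the list scores. The variable name first_three is chosen just for this example.

```python
scores = [61, 79, 98, 72]

first_three = scores[0:3]                   # indices 0, 1, and 2 (index 3 is not included)
print(first_three)                          # prints [61, 79, 98]
print([scores[0], scores[1], scores[2]])    # prints [61, 79, 98], the same elements

print(scores[1:3])                          # prints [79, 98]: indices 1 and 2
print(scores)                               # prints [61, 79, 98, 72]: the original list is unchanged
```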
List methods#
List objects can be modified using their methods. Every list has the following useful methods:
.sort(): sort the list (in increasing order by default)
.append(): add one element to the end of a list
.pop(): extract the last element from a list
.reverse(): reverse the order of elements in the list
Let’s look at some examples of these methods in action.
To sort the list of scores
,
you can call its .sort()
method:
scores.sort()
scores
[61, 72, 79, 98]
To add a new element el
to the list (at the end),
use the method .append(el)
:
scores.append(22)
scores
[61, 72, 79, 98, 22]
The method .pop()
extracts the last element of the list:
scores.pop()
22
You can think of .pop()
as the “undo method” of the append operation.
To reverse the order of elements in the list,
call its .reverse()
method:
scores.reverse()
scores
[98, 79, 72, 61]
Other useful list methods include .insert(index,obj) and .remove(obj),
plus a handful more that might be useful once in a while.
Recall you can see a complete list of all the methods on list objects by typing scores.
then pressing the TAB button to trigger the auto-complete suggestions.
Uncomment the following code block,
place your cursor after the dot,
and try pressing TAB to see what happens.
# scores.
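The methods .insert(index,obj) and .remove(obj) mentioned above were not shown in action, so here is a brief sketch. It uses a separate list with the made-up name marks, so the contents of scores are not affected.

```python
marks = [98, 79, 72, 61]

marks.insert(1, 85)    # insert the value 85 at index 1, shifting later elements to the right
print(marks)           # prints [98, 85, 79, 72, 61]

marks.remove(72)       # remove the first element whose value equals 72
print(marks)           # prints [98, 85, 79, 61]
```

Note that .insert takes a position and a value, while .remove takes only a value and searches the list for it.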
Exercise: The default behaviour of the method .sort()
is to
sort the elements in increasing order.
Suppose you want to sort the elements in decreasing order instead.
You can pass a keyword argument to the method .sort()
to request the sorting be done in “reverse” order (decreasing instead of increasing).
Consult the docstring of the .sort()
method to find the name of the keyword argument
that does this,
then modify the code below to sort the elements of the list scores
in decreasing order.
scores.sort()
scores
[61, 72, 79, 98]
#@titlesolution
# help(scores.sort)
scores.sort(reverse=True)
scores
[98, 79, 72, 61]
For loops#
We often want to repeat some operation (or several operations) once for each element in a list.
This is what the for
-loop is for.
The syntax of a for loop in Python looks like this:
for <element> in <container>:
    operation 1 using <element>
    operation 2 using <element>
    etc.
This syntax allows us to repeat a block of operations for each element <element>
in the list <container>. Note that the operations inside the loop must be indented.
Example 1: print all the scores#
scores = [61, 79, 98, 72]
for score in scores:
    print(score)
61
79
98
72
Example 2: compute the average score#
If \(\mathbf{x}\) is a list of values \([x_0,x_1,x_2,\ldots,x_{n-1}]\), the average of the list is defined as:
\[ \text{average}(\mathbf{x}) = \frac{x_0 + x_1 + x_2 + \cdots + x_{n-1}}{n} = \frac{1}{n}\sum_{i=0}^{n-1} x_i. \]
In words, the average value of a list of values is the sum of the values divided by the length of the list \(n\), which is \(\texttt{len}(\mathbf{x})\) in Python.
Let’s write a for
loop
that computes the sum (the total
) of the values in the list scores
.
We can then compute the average avg
by dividing the total
by \(n\),
which is the length of the list.
total = 0
for score in scores:
    total = total + score
avg = total / len(scores)
avg
77.5
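As a cross-check, here is a small sketch that compares the for-loop calculation with the built-in function sum. The variable names avg_loop and avg_builtin are chosen just for this comparison.

```python
scores = [61, 79, 98, 72]

# for-loop version, same steps as in the example above
total = 0
for score in scores:
    total = total + score
avg_loop = total / len(scores)

# same calculation using the built-in function sum
avg_builtin = sum(scores) / len(scores)

avg_loop, avg_builtin
```

Both approaches give the same number. The for-loop version is useful for seeing what happens step by step.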
Sidenote#
The name of the variable used for the for loop is totally up to you, but in general you should choose logical names for elements of the list. Below is an example of a for loop that uses the single-letter variable s
as the loop variable:
for s in scores:
    print(s)
61
79
98
72
By convention, we usually call a list of obj-items objs, and use the name obj for the for-loop variable.
Here are some examples:
given a list of profiles called profiles, use the for-loop for profile in profiles: ...
given a list of graph nodes called nodes, use the for-loop for node in nodes: ...
etc.
Anyone reading these code examples will immediately know that the Python objects profiles and nodes are list-like, since their names end in “s” and they are used in for loops.
List comprehension (bonus topic)#
Very often when programming, we need to transform a list of values, by applying the same operation to each value in a list of inputs, and collecting the results in a new list of outputs.
Using the standard for
-loop syntax,
this operation requires four lines of code:
newlist = []
for value in values:
    newvalue = <some operation on `value`>
    newlist.append(newvalue)
This code starts by creating an empty list called newlist, then uses a for-loop to apply <some operation> to each element in the list values, accumulating the results in newlist.
Python provides a shorthand syntax for writing this operation as
newlist = [<some operation on `value`> for value in values]
This is called the “list comprehension” syntax,
and is used often in Python code.
Note the code using list comprehension takes one line to express the entire transformation
from values
to newlist
.
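To make the comparison concrete, here is a sketch that runs both forms on the same inputs. The list values and the "multiply by two" operation are assumptions picked for illustration.

```python
values = [1.5, 2.0, 3.5]       # hypothetical input values

# four-line version using a for loop
newlist_loop = []
for value in values:
    newvalue = 2 * value
    newlist_loop.append(newvalue)

# one-line version using a list comprehension
newlist_comp = [2 * value for value in values]

newlist_loop == newlist_comp
```

The two lists are identical, so which form you use is mostly a matter of readability.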
Example: compute the squares of the first five integers#
numbers = [1, 2, 3, 4, 5]
squares = [n**2 for n in numbers]
squares
[1, 4, 9, 16, 25]
# # ALT. using the `range` function
# numbers = range(1,6)
# squares = [n**2 for n in numbers]
# squares
Functions#
Functions! Finally we get to the good stuff! Functions are an important building block in programming, since they allow you to encapsulate any multi-step program or procedure as a reusable piece of functionality. You can think of functions as reusable chunks of code that can be defined once, and used multiple times by “calling” them with different arguments.
Let’s do a quick review of the concept of a function in mathematics, since the Python syntax for functions is inspired by math functions. The convention in math is to call the function \(f\), denote the inputs of the function \(x\), and its outputs \(y\): $\( y = f(x). \)$
Note we defined the function \(f\) by simply writing some expression we need to compute, that depends on the input \(x\). For example \(f(x) = 2x+3\).
Python functions#
Functions in Python are similar to functions in math:
a Python function takes certain inputs and produces certain outputs.
We define the function called f
using the following syntax:
def f(x):
    <steps to compute y from x>
    return y
There are a lot of things going on in this code example, so let’s go over the code line-by-line and explain all the new elements of syntax:
The Python keyword def is used to declare a function definition.
Next we see the name of the function, f.
Next we must specify the inputs of the function (a.k.a. arguments) inside parentheses. In this example, the function f takes only a single argument x.
The colon : indicates the beginning of a code block, which contains the function body. The function body consists of one or more lines of code, indented by four spaces, just like the other types of code blocks we have seen.
On the final line of the function f, we use the return statement to indicate the output of the function. The return statement is usually the last line in the function body.
Example 1#
A first example of a simple math-like function. The function is called f
,
takes numbers as inputs, and produces numbers as outputs:
def f(x):
    return 2*x + 3
To call the function f
,
we use the function name,
then pass in the argument(s) of the function in parentheses.
f(10)
23
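The function body can also contain several steps before the return statement. Here is a sketch: the function g and the variable doubled are hypothetical names introduced only for this illustration, and g computes the same thing as f in two steps.

```python
def f(x):
    return 2*x + 3

def g(x):
    # same calculation as f, written as a multi-step function body
    doubled = 2 * x
    y = doubled + 3
    return y

inputs = [0, 1, 10]
outputs_f = [f(x) for x in inputs]   # call f once for each input
outputs_g = [g(x) for x in inputs]

outputs_f, outputs_g
```

Because a function is defined once and can be called many times, we can reuse it on a whole list of inputs.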
Example 2#
# TODO
Statistics functions#
Your turn to play with lists now!
Complete the code required to implement the functions mean
and std
below.
Question 1: Mean#
The formula for the mean of a list of numbers \([x_1, x_2, \ldots, x_n]\) is: $\( \text{mean} = \overline{x} = \frac{1}{n}\sum_{i=1}^n x_i = \tfrac{1}{n} \left[ x_1 + x_2 + \cdots + x_n \right]. \)$
Write the function mean(numbers)
: a function that computes the mean of a list of numbers
def mean(numbers):
    """
    Computes the mean of the `numbers` list using a for loop.
    """
    total = 0
    for number in numbers:
        total = total + number
    return total / len(numbers)
mean([100,101])
100.5
# TEST CODE (run this code to test your solution)
from test_helpers import test_mean
# RUN TESTS
test_mean(mean)
All tests passed. Good job!
Question 2: Sample standard deviation#
The formula for the sample standard deviation of a list of numbers is: $\( \text{std}(\textbf{x}) = s = \sqrt{ \tfrac{1}{n-1}\sum_{i=1}^n (x_i-\overline{x})^2 } = \sqrt{ \tfrac{1}{n-1}\left[ (x_1-\overline{x})^2 + (x_2-\overline{x})^2 + \cdots + (x_n-\overline{x})^2\right]}. \)$
Note the division is by \((n-1)\) and not \(n\). Strange, no? You’ll have to wait until stats to see why this is the case.
Write the function std(numbers)
: a function that computes the sample standard deviation of a list of numbers
import math
def std(numbers):
    """
    Computes the sample standard deviation (square root of the sample variance)
    using a for loop.
    """
    avg = mean(numbers)
    total = 0
    for number in numbers:
        total = total + (number-avg)**2
    var = total/(len(numbers)-1)
    return math.sqrt(var)
numbers = list(range(0,100))
std(numbers)
29.011491975882016
# compare to known good function...
import statistics
statistics.stdev(numbers)
29.011491975882016
# TEST CODE (run this code to test your solution)
from test_helpers import test_std
# RUN TESTS
test_std(std)
All tests passed. Good job!
Exercise 2#
Write a Python function called temp_C_to_F
that converts a temperature from degrees Celsius to degrees Fahrenheit.
def temp_C_to_F(temp_C):
    """
    Convert the temperature `temp_C` to Fahrenheit.
    """
    pass  # replace `pass` with your code
#@titlesolution
def temp_C_to_F(temp_C):
    """
    Convert the temperature `temp_C` to Fahrenheit.
    """
    temp_F = (temp_C * 9/5) + 32
    return temp_F
temp_C_to_F(37.8)
100.03999999999999
Other data structures#
We already discussed lists, which are the most important data structure (container for data) in Python.
In this section we’ll briefly introduce some other data structures you might encounter.
Strings#
String expressions#
Let’s look at some expressions that involve strings.
name = "julie"
message = "Hello " + name # for strings, + means concatenate
message
'Hello julie'
first_name = "Julie"
last_name = "Tremblay"
full_name = first_name + " " + last_name
message = "Hi " + full_name + "!"
message
'Hi Julie Tremblay!'
Strings are lists of characters#
You can think of the string "abc"
as being equivalent to a list of three characters ["a", "b", "c"]
,
and use the usual list syntax to access the individual characters in the list.
To illustrate this list-like behaviour of strings, let’s define a string of length 26 that contains all the lowercase Latin letters.
letters = "abcdefghijklmnopqrstuvwxyz"
letters
'abcdefghijklmnopqrstuvwxyz'
len(letters)
26
We can access the individual characters within the string using square brackets.
For example, the index of the letter "a"
in the string letters
is 0
:
letters[0]
'a'
The index of the letter "b"
in the string letters
is 1
:
letters[1]
'b'
The last element in the list of 26 letters has index 25:
letters[25]
'z'
Alternatively,
we can access the last letter using the negative index -1
:
letters[-1]
'z'
We can use slicing to get any substring that spans a particular range of indices. For example, the first four letters of the alphabet are:
letters[0:4]
'abcd'
The syntax 0:4
is a shorthand for the expression slice(0,4)
,
which corresponds to the range of indices from 0
(inclusive) to 4
(non-inclusive): [0,1,2,3]
.
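Here are a few more slices to try, as a quick sketch. The variable names on the left are made up for this illustration.

```python
letters = "abcdefghijklmnopqrstuvwxyz"

first_four = letters[0:4]           # indices 0, 1, 2, 3
same_four = letters[slice(0, 4)]    # the long form of the 0:4 shorthand
middle = letters[10:13]             # indices 10, 11, 12
last_three = letters[-3:]           # from the third-to-last letter to the end

first_four, same_four, middle, last_three
```

Leaving out the number after the colon means "go all the way to the end."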
Type conversions#
We sometimes need to convert between variables of different types. The functions for converting types have the same name as the type of object they produce:
int: convert any expression into an int
float: convert any expression into a float
str: convert an expression to its text representation.
Example: converting str
to float
#
type("42.5")
str
f = float("42.5")
f
42.5
type(f)
float
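The other conversion functions work the same way. Here is a short sketch; the variable names are chosen only for this example.

```python
n = int("42")            # str -> int
x = float(n)             # int -> float
label = str(3.14)        # float -> str
truncated = int(42.9)    # float -> int drops the fractional part (no rounding)

n, x, label, truncated
```

Note that int(42.9) does not round to the nearest integer. It simply throws away everything after the decimal point.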
Exercise 6: compute the sum of two strings#
Suppose we’re given two numbers \(m\) and \(n\) and we want to compute their sum \(m+n\). The two numbers are given to us expressed as strings.
mstr = "2.1"
nstr = "3.4"
print("The variable mstr has value", mstr, "and type", type(mstr))
print("The variable nstr has value", nstr, "and type", type(nstr))
The variable mstr has value 2.1 and type <class 'str'>
The variable nstr has value 3.4 and type <class 'str'>
Let’s try adding the two numbers together to see what happens…
mstr + nstr
'2.13.4'
This is because the addition operator +
for strings means concatenate, not add.
Python doesn’t know automatically that the two text strings are meant to be numbers.
We have to manually convert the strings to a Python numerical type (float
) first,
then we can add them together.
Write the Python code that converts the variables mstr
and nstr
to floating point numbers and adds them together.
#@titlesolution
mfloat = float(mstr)
nfloat = float(nstr)
print("The variable mfloat has value", mfloat, "and type", type(mfloat))
print("The variable nfloat has value", nfloat, "and type", type(nfloat))
# compute the sum
mfloat + nfloat
The variable mfloat has value 2.1 and type <class 'float'>
The variable nfloat has value 3.4 and type <class 'float'>
5.5
Exercise: write the Python code that converts a list of string variables prices_str
to floating point numbers and adds them together.
prices_str = ["22.2", "10.1", "33.3"]
# write here the code that computes the total price
#@titlesolution
prices_str = ["22.2", "10.1", "33.3"]
prices_float = [float(price) for price in prices_str]
sum(prices_float)
65.6
Boolean variables and conditional statements#
Boolean variables can have one of two possible values, either True
or False
.
We obtain boolean values when we perform numerical comparisons.
x = 3
x > 2 # Is x greater than 2?
True
Other arithmetic comparisons include <
, >=
, <=
, ==
(equal to), !=
(not equal to).
The in
operator can be used to check if an object is part of a list (or another kind of collection).
x = 3
x in [1,2,3,4] # Is x in the list [1,2,3,4] ?
True
Boolean expressions are used in conditional statements, which are blocks of Python code that may or may not be executed depending on the value of a boolean expression.
Conditional statements#
A conditional statement uses the keyword if to decide whether a block of code runs: the indented block is executed only when the condition is True.
if True:
    print("This code will run")
if False:
    print("This code will not run")
This code will run
x = 3
if x > 2:
    print("x is greater than 2")
else:
    print("x is less than or equal to 2")
x is greater than 2
We can do multiple checks using elif
statements.
temp = 25
if temp > 22:
    print("It's hot!")
elif temp < 10:
    print("It's cold!")
else:
    print("It's OK.")
It's hot!
Exercise: add another condition to the above code to print It's very hot
if the temperature is above 30.
Boolean expressions#
You can use bool
variables and the logical operations and
, or
, not
, etc. to build more complicated boolean expressions (logical conjunctions, disjunctions, and negations).
True and True, True and False, False and True, False and False
(True, False, False, False)
True or True, True or False, False or True, False or False
(True, True, True, False)
x = 3
x >= 0 and x <= 10
True
x < 0 or x > 10
False
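The examples above cover and and or. Here is a sketch of the not operator, along with the chained comparison syntax that Python also supports. The variable names are made up for this illustration.

```python
x = 3

in_range = x >= 0 and x <= 10
out_of_range = not in_range      # `not` flips True to False and vice versa
chained = 0 <= x <= 10           # chained comparison, same meaning as in_range

in_range, out_of_range, chained
```

The chained form 0 <= x <= 10 reads just like the math inequality, which makes it handy for range checks.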
Exercise 1
The phase of water (at sea-level pressure = 1 atm = 101.3 kPa = 14.7 psi)
depends on its temperature temp
.
The three possible phases of water are "gas"
(water vapour), "liquid"
(water), and "solid"
(ice).
The table below shows the phase of water depending on the temperature temp
,
expressed as math inequalities.
temp range         phase
---------------    ------
temp >= 100        gas
0 <= temp < 100    liquid
temp < 0           solid
Your task is to fill in the if-elif-else statement in the code cell below, in order to print the correct phase string, depending on the value of the variable temp.
# temperature in Celsius (int or float)
temp = 90
# if ...:
#     print(....)
# elif ...:
#     print(....)
# else:
#     print(....)
# uncomment the if-elif-else statement above and replace:
# ... with conditions (translate math inequality into Python code),
# .... with the appropriate phase string (one of "gas", "liquid", or "solid")
#@titlesolution
temp = 90
if temp >= 100:
    print("gas")
elif temp >= 0:
    print("liquid")
else:
    print("solid")
liquid
Exercise 2
Teacher Joelle has computed the final scores of the students as a percentage (a score
out of 100). The final grade was computed as a weighted combination of the student’s average grade on the assignments, one midterm exam, and a final exam (more on this later).
The school where she teaches requires her to convert each student’s score
to a letter grade, according to the following grading scale:
Grade    Numerical score interval
A        85% – 100%
A-       80% – 84.999…%
B+       75% – 79.999…%
B        70% – 74.999…%
B-       65% – 69.999…%
C+       60% – 64.999…%
C        55% – 59.999…%
D        50% – 54.999…%
F        0% – 49.999…%
Write the if-elif-elif-…..-else statement that takes the score variable (an int between 0 and 100), and prints the appropriate letter grade for that score.
# student score as a percentage
score = 90
# if ...:
#     print(....)
# elif ...:
#     print(....)
# .....
# uncomment the if-elif-.. statement above and replace:
# ... with the appropriate conditions,
# .... with the appropriate letter grades (strings), and
# ..... with additional elif blocks to cover all the cases.
#@titlesolution
score = 73
if score >= 85:
    print("A")
elif score >= 80:
    print("A-")
elif score >= 75:
    print("B+")
elif score >= 70:
    print("B")
elif score >= 65:
    print("B-")
elif score >= 60:
    print("C+")
elif score >= 55:
    print("C")
elif score >= 50:
    print("D")
else:
    print("F")
B
Inline if statements (bonus topic)#
We can also use if-else keywords to compute conditional expressions. The general syntax for these is:
<value1> if <condition> else <value2>
This expression evaluates to <value1> if <condition> is True, and to <value2> if <condition> is False.
temp = 25
msg = "It's hot!" if temp > 22 else "It's OK."
msg
"It's hot!"
Dictionaries#
When programming in Python, one of the most commonly used data structures is the dictionary (dict), along with other dict-like data structures.
A dictionary is an associative array between a set of keys and a set of values.
For example, the code below defines the dictionary d that consists of three key-value pairs:
d = {"key1":"value1", "key2":"value2", "key3":"value3"}
d
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
d.keys()
dict_keys(['key1', 'key2', 'key3'])
d.values()
dict_values(['value1', 'value2', 'value3'])
You access the value in the dictionary using the square brackets syntax.
For example,
to see the value associated with key key1
in the dictionary d
,
we call:
d["key1"]
'value1'
You can change the value associated with any key by assigning a new value to it:
d["key2"] = "newval2"
d
{'key1': 'value1', 'key2': 'newval2', 'key3': 'value3'}
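The same square brackets syntax can be used to add a new key-value pair. Here is a sketch; the key "key4" and the variable names are made up for this illustration.

```python
d = {"key1": "value1", "key2": "value2", "key3": "value3"}

d["key4"] = "value4"         # assigning to a new key adds a key-value pair
has_key4 = "key4" in d       # for dictionaries, `in` checks the keys
has_key9 = "key9" in d
n_pairs = len(d)             # number of key-value pairs

has_key4, has_key9, n_pairs
```

Checking with in before accessing a key is a simple way to avoid the error you get when you ask for a key that doesn't exist.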
Exercise: creating a new profile dictionary#
Recall the profile
dictionary we created earlier:
profile = {"first_name":"Julie",
"last_name":"Tremblay",
"score":98}
Create a dictionary called profile2
with the same structure,
but for the user "Justin Trudeau"
with score 31
.
profile2 = {}
profile2["first_name"] = "Justin"
profile2["last_name"] = "Trudeau"
profile2["score"] = 31
profile2
{'first_name': 'Justin', 'last_name': 'Trudeau', 'score': 31}
Sets#
s = set()
s
set()
s.add(3)
s.add(5)
s
{3, 5}
3 in s
True
print("The set s contains the elements:")
for el in s:
    print(el)
The set s contains the elements:
3
5
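A set stores each element only once, so duplicates are dropped automatically. Here is a minimal sketch of that behaviour; the set name s2 and its contents are hypothetical.

```python
s2 = set([3, 5, 3, 5, 5])    # build a set from a list that contains duplicates
s2.add(3)                    # adding an element that is already present changes nothing

n_elements = len(s2)
unique_sorted = sorted(s2)   # sorted() returns an ordinary list

n_elements, unique_sorted
```

This makes sets a convenient way to find the distinct values in a list.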
Tuples#
Tuples are similar to lists but with fewer features: in particular, a tuple can’t be modified after it is created.
2,3
(2, 3)
(2,3)
(2, 3)
We can use the tuples syntax to assign to multiple variables on a single line:
x, y = 3, 4
We can also use tuples to “swap” two values.
# Swap the contents of the variables x and y
tmp = x
x = y
y = tmp
# Equivalent operation on one line
x, y = y, x
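Here is a short sketch showing what you can and cannot do with a tuple. The tuple point is a hypothetical example, and the try/except block is only there to catch the error so the code runs to the end.

```python
point = (2, 3)

first = point[0]        # indexing works just like for lists
a, b = point            # unpack the tuple into two variables

try:
    point[0] = 10       # tuples are immutable, so this raises a TypeError
    was_modified = True
except TypeError:
    was_modified = False

first, a, b, was_modified
```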
Objects and classes#
All the Python variables we’ve been using until now are different kinds of “objects.” An object is the most general-purpose “container” for data, and it also provides methods for manipulating that data.
In particular:
attributes: data properties of the object
methods: functions attached to the object
Example 1: string objects#
msg = "Hello"
type(msg)
str
# Uncomment the next line and press TAB after the dot
# msg.
# Attributes
# Methods:
msg.upper()
msg.lower()
msg.__len__()
msg.isascii()
msg.startswith("He")
msg.endswith("lo")
True
Example 2: file objects#
filename = "message.txt"
file = open(filename, "w")
type(file)
_io.TextIOWrapper
# Uncomment the next line and press TAB after the dot
# file.
# Attributes
file.name, file.mode, file.encoding
('message.txt', 'w', 'UTF-8')
# Methods:
file.write("Hello world\n")
file.writelines(["line2", "and line3."])
file.flush()
file.close()
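To read the contents back, we can open the file in "r" mode. The sketch below uses Python's with statement, which closes the file automatically at the end of the indented block. It writes to a temporary folder (an assumption made here so the example doesn't touch your own files).

```python
import os
import tempfile

# create a temporary folder and build the path to message.txt inside it
tmpdir = tempfile.mkdtemp()
filepath = os.path.join(tmpdir, "message.txt")

# write the same three pieces of text as in the example above
with open(filepath, "w") as file:
    file.write("Hello world\n")
    file.writelines(["line2", "and line3."])

# open the file again for reading and load all the text
with open(filepath, "r") as file:
    contents = file.read()

contents
```

Notice that .writelines does not add line breaks between the strings, so "line2" and "and line3." end up on the same line.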
Defining new types of objects#
The Python keyword class
can be used to define new kinds of objects.
Exercise: create a custom class of objects Interval
that represents intervals of real numbers like \([a,b] \subset \mathbb{R}\).
We want to be able to use the new interval objects in if
statements to check if a number \(x\) is in the interval \([a,b]\) or not.
Recall the in
operator that we can use to check if an element is part of a list:
>>> 3 in [1,2,3,4]
True
We want the new objects of type Interval
to support the same kind of membership test.
Example usage:
>>> 3 in Interval(2,4)
True
>>> 5 in Interval(2,4)
False
The expression x in Y
corresponds to calling the method __contains__
on the container object Y
:
Y.__contains__(x)
and it will return a boolean value (True
or False
).
If we want to support checks like 3 in Interval(2,4)
we therefore have to implement
the method __contains__
on the Interval
class.
class Interval:
    """
    Object that embodies the mathematical concept of an interval.
    `Interval(a,b)` is equivalent to math interval [a,b] = {𝑥 | 𝑎 ≤ 𝑥 ≤ 𝑏}.
    """

    def __init__(self, lowerbound, upperbound):
        """
        This method is called when the object is created, and is used to
        set the object attributes from the arguments passed in.
        """
        self.lowerbound = lowerbound
        self.upperbound = upperbound

    def __str__(self):
        """
        Return a representation of the interval as a string like "[a,b]".
        """
        return "[" + str(self.lowerbound) + "," + str(self.upperbound) + "]"

    def __contains__(self, x):
        """
        This method is called to check membership using the `in` keyword.
        """
        return self.lowerbound <= x and x <= self.upperbound

    def __len__(self):
        """
        This method will get called when you call `len` on the object.
        """
        return self.upperbound - self.lowerbound
Create an object that corresponds to the interval \([2,4]\).
interval2to4 = Interval(2,4)
interval2to4
<__main__.Interval at 0x7fb558297160>
type(interval2to4)
__main__.Interval
str(interval2to4)
'[2,4]'
3.3 in interval2to4
True
1 in interval2to4
False
len(interval2to4)
2
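The original motivation was to use interval objects in if statements, so here is a sketch of that. It uses a compact copy of the Interval class so the code runs on its own; the variable passing_range and the 50-to-100 bounds are assumptions made for this illustration.

```python
class Interval:
    """Compact copy of the Interval class, with only the methods needed here."""

    def __init__(self, lowerbound, upperbound):
        self.lowerbound = lowerbound
        self.upperbound = upperbound

    def __contains__(self, x):
        return self.lowerbound <= x and x <= self.upperbound


passing_range = Interval(50, 100)    # hypothetical interval of passing scores
score = 73

if score in passing_range:           # calls passing_range.__contains__(score)
    result = "pass"
else:
    result = "fail"

result
```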
Python libraries and modules#
Everything we discussed so far was using the Python built-in functions and data types,
but that is only a small subset of all the functionality available when using Python.
There are hundreds of Python libraries and modules that provide additional functions and data types
for all kinds of applications.
There are Python modules for processing different data files, making web requests, performing computations, etc.
The list is almost endless,
and the vast number of libraries and frameworks is all available to you behind a simple import
statement.
The three golden rules of software development:
Don’t write code because someone has already solved the problem you’re trying to solve.
Don’t write code because you can glue together one or more libraries to do what you want.
Don’t write code because you can solve your problem by using some subset of the functionality in an existing framework.
The import
statement#
We use the import
statement to load a Python module and make it available in the current context.
The code below shows how to import the module <mod>
in the current notebook.
import <mod>
After this statement, we can use the functions in the module <mod> by calling them using the prefix <mod>., which is called the “dot notation” for accessing names within the namespace <mod>.
For example, let’s import the statistics module and use the function statistics.mean
to compute the mean of three numbers.
import statistics
statistics.mean([1,2,6])
3
A very common trick you’ll see in Python notebooks is to import Python modules under an “alias” name, which is usually a shorter name that is faster to type.
The alias-import statement looks like this:
import <mod> as <alias>
For example, let’s import the statistics module under the alias stats
and repeat the mean calculation we saw above.
import statistics as stats
stats.mean([1,2,6])
3
As you can imagine,
if you’re writing some Python code that requires calling a lot of statistics calculations,
you’ll appreciate the alias-import statement,
since you can call stats.mean
and stats.median
instead of having to type the
full module name each time, statistics.mean
and statistics.median
.
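There is also a third form of the import statement, from <mod> import <name>, which imports specific functions so you can call them without any prefix. Here is a sketch using the same statistics module; the variable names m and med are chosen just for this example.

```python
from statistics import mean, median

m = mean([1, 2, 6])       # no `statistics.` prefix needed
med = median([1, 2, 6])

m, med
```

The trade-off is that without the prefix it is less obvious where a function comes from, so this form works best for a few well-known names.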
The standard library#
The Python standard library consists of several dozen Python modules that come bundled with every Python installation.
Here are some modules that come in handy.
math: math functions like sqrt, sin, cos, etc.
random: random number generation
statistics: descriptive statistics computed from lists of values.
re: regular expressions (useful for matching patterns in strings)
datetime: manipulate dates and times.
urllib.parse: manipulate URLs (used for web programming).
json: read and write JSON files.
csv: read and write CSV files (see also Pandas, which can do this too)
os and os.path: manipulate file system paths.
sys: access information about the current process and the operating system.
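To give a feel for how these modules are used, here is a small sketch that touches three of them. The specific date and the dictionary contents are made up for illustration.

```python
import math
import json
import datetime

# math: square root of a number
root = math.sqrt(16)

# datetime: the day after January 31st, 2024
next_day = datetime.date(2024, 1, 31) + datetime.timedelta(days=1)

# json: convert a dictionary to JSON text and back again
text = json.dumps({"score": 98})
data = json.loads(text)

root, next_day, text, data
```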
There are also a few libraries that are not part of the standard library, but almost as important:
requests
: make HTTP requests and download files from the internet.
Installing Python packages with pip
#
We use the command pip
or %pip
to install Python packages.
Scientific computing libraries#
NumPy#
Numerical Python (NumPy) is a library that provides high-performance arrays and matrices. NumPy arrays allow mathematical operations to run very fast, which is important when working with medium and large datasets.
Example: linspace
and other numerical calculations#
SciPy#
Scientific Python (SciPy) is a library that provides most common algorithms and special functions used by scientists and engineers. See https://scipy-lectures.org/
SymPy#
Symbolic math expressions. See sympy_tutorial.pdf.
Matplotlib#
Powerful library for plotting points, lines, and other graphs.
Examples: how to create reusable functions for plotting probability distributions#
plot_pdf_and_cdf
calc_prob_and_plot
calc_prob_and_plot_tails
Data science libraries#
pandas: library for tabular data (see the pandas_tutorial.ipynb notebook)
statsmodels: models for linear regression and others
seaborn: high-level library for statistical plots (see the seaborn_tutorial.ipynb notebook).
plotnine: another high-level library for data visualization based on the grammar of graphics principles
scikit-learn: tools and algorithms for machine learning
Bonus topics#
writing standalone scripts (argparse, example: turn head into script)
functools.partial for currying functions (e.g. sample-generator callables)???
generic functions: *args and **kwargs
Reading and writing files https://python-textbok.readthedocs.io/en/1.0/Python_Basics.html#files
Writing standalone scripts#
For loop tricks#
Tricks:
enumerate: provides an index when iterating through a list.
zip: allows you to iterate over multiple lists in parallel.
Using enumerate
to get for
-loop with index#
Use enumerate(somelist) to iterate over tuples (index, value), one for each value in the list somelist.
In each iteration, index tells you the position of value in the list.
list(enumerate(scores))
[(0, 61), (1, 79), (2, 98), (3, 72)]
# example usage
for idx, score in enumerate(scores):
    # this for loop has two variables: idx and score
    print("Processing score", score, "which is at index", idx, "in the list")
Processing score 61 which is at index 0 in the list
Processing score 79 which is at index 1 in the list
Processing score 98 which is at index 2 in the list
Processing score 72 which is at index 3 in the list
Using zip
#
Use zip(list1,list2)
to get an iterator over tuples (value1, value2)
,
where value1
and value2
are elements taken from list1
and list2
,
in parallel, one at a time.
The name “zip” is a reference to the way a zipper joins together the teeth of the two sides of the zipper when it is closing.
# example 1
list( zip([1,2,3], ["a","b","c"]) )
[(1, 'a'), (2, 'b'), (3, 'c')]
# example 2
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list(zip(list1, list2))
[(1, 4), (2, 5), (3, 6)]
# compute the sum of the matching values in two lists
for value1, value2 in zip(list1, list2):
    print("The sum of", value1, "and", value2, "is", value1+value2)
The sum of 1 and 4 is 5
The sum of 2 and 5 is 7
The sum of 3 and 6 is 9
Functional programming helpers#
functools.partial
for currying functions (e.g sample-generator callables)
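Here is a minimal sketch of what functools.partial does: it takes an existing function and "pre-fills" some of its arguments, giving you a new function with fewer inputs. The function power and the names square and cube are hypothetical examples.

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# create new functions with the `exponent` argument already filled in
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

square(5), cube(2)
```

This is handy when you need to pass a one-input function somewhere, but the function you have takes extra options.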
List-like objects = iterables#
The term “iterable” is used in Python to refer to any list-like object that can be used in a for
-loop.
Examples of iterables:
strings
dictionary keys, dictionary values, or dictionary (key,value) items
sets
range
(lazy generator for lists of integers)
range(0, 4)
range(0, 4)
list(range(0, 4))
[0, 1, 2, 3]
Iterating over dictionaries#
profile = {"first_name":"Julie", "last_name":"Tremblay", "score":98}
list(profile.keys())
['first_name', 'last_name', 'score']
# ALT.
list(profile)
['first_name', 'last_name', 'score']
list(profile.values())
['Julie', 'Tremblay', 98]
list(profile.items())
[('first_name', 'Julie'), ('last_name', 'Tremblay'), ('score', 98)]
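Since .items() gives us (key, value) tuples, we can use it in a for loop with two loop variables. Here is a sketch; the list lines is a made-up name used to collect the results.

```python
profile = {"first_name": "Julie", "last_name": "Tremblay", "score": 98}

lines = []
for key, value in profile.items():
    # each iteration gets one key and its matching value
    lines.append(key + " = " + str(value))

lines
```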
We’ll talk more about dictionaries later on.
Converting iterables to lists#
Under the hood, Python uses all kinds of list-like data structures called “iterables”. We don’t need to talk about these or understand how they work; all you need to know is that they behave like lists.
In the code examples above,
we converted several fancy list-like data structures into ordinary lists,
by wrapping them in a call to the function list
,
in order to display the results.
Let’s look at why we need to use list(iterable) when printing, instead of just iterable.
For example, the set of keys for a dictionary is a dict_keys iterable object:
profile.keys()
dict_keys(['first_name', 'last_name', 'score'])
type(profile.keys())
dict_keys
I know, right? What the hell is dict_keys
?
I certainly don’t want to have to explain that…
… so instead, you’ll see this in the code:
list(profile.keys())
['first_name', 'last_name', 'score']
type(list(profile.keys()))
list
Generic function arguments#
functions with *args
and **kwargs
arguments
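Here is a minimal sketch of a function that accepts any number of inputs. The function describe is hypothetical: inside the function, args is a tuple holding the positional arguments, and kwargs is a dictionary holding the keyword arguments.

```python
def describe(*args, **kwargs):
    """
    Hypothetical function that reports how many positional arguments
    it received and the names of the keyword arguments.
    """
    return len(args), sorted(kwargs.keys())

result = describe(1, 2, 3, color="red", size=10)
result
```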
Final review#
Okay, we’ve reached the end of this tutorial, so let’s review the new Python concepts we introduced, in condensed form.
Python grammar and syntax review#
Learning Python is like learning a new language:
nouns: values of different types, usually referred to by name (named variables containing values)
verbs: functions and methods, including basic operators like +, -, etc.
grammar: rules about how to use nouns and verbs together
adverbs: keyword arguments (options) used to modify what a function does
These parts become easy to understand with time, since the new concepts correspond to English words, so you’ll get used to it all very quickly.
Python keywords#
Here is a list of keywords that make up the Python language:
False      class      finally    is         return
None       continue   for        lambda     try
True       def        from       nonlocal   while
and        del        global     not        with
as         elif       if         or         yield
assert     else       import     pass
break      except     in         raise
You’ve seen most of them, but not all of them. The ones you need to remember are:
if, elif, else: used in conditional statements
def: used to define a new function, and the return statement that defines the output of the function
the boolean values True and False
None: special value that corresponds to no value
class: for defining new object types
for: used in for loops and list comprehensions
import ... and from ... import ...: statements to import Python modules
or, and, and not: used to build boolean expressions
in: to check if an element is part of a container
Python data types#
int: naturals and integers
float: rational and real numbers
list: list of objects [obj1, obj2, obj3, ...]
bool: True or False
str: text strings
dict: associative array between keys and values {key1:value1, key2:value2, key3:value3, ...}.
tuple: just like a list, but immutable (can’t modify it)
set: list-like object that doesn’t care about ordering
NoneType: denotes the type of the None value, which describes the absence of a value (e.g. the output of a function that doesn’t return any value).
complex: complex numbers \(z=x+iy\)
Python built-in functions#
Essential functions:
print(arg1, arg2, ...): display str(arg1), str(arg2), etc.
type(obj): tells you what kind of object obj is
len(obj): length of the object (for containers like str, list, and dict objs)
range(a,b): equivalent to the list [a,a+1,...,b-1]
help(obj): display info about the object, function, or method
Looking around, learning, and debugging Python code:
str(obj): display the string representation of the object.
repr(obj): display the Python representation of the object. Usually, you can copy-paste the output of repr(obj) into a Python shell to re-create the object.
help(obj): display info about the object, function, or method. This is similar to printing the object’s docstring obj.__doc__.
dir(obj): show complete list of attributes and methods of the object obj
globals(): display all variables in the Python global namespace
locals(): display local variables (within current scope, e.g. local variables inside a function body)
Built-in functions used for lists:
len(obj): length of the object (for containers like str, list, and dict objs)
sum(li): sum of the values in the list of numbers li
all(li): true if all values in the list li are true
any(li): true if any of the values in the list li are true
enumerate(li): convert the list of values li to a list of tuples (i, li[i]) (use in a for loop as for i, item in enumerate(items): ...).
zip(li1, li2): joint iteration over two lists
Low-level iterator methods: iter() and next() (out of scope for this tutorial. Just know that every time I said list-like, I meant “any object that implements the iterator and iterable protocols”).
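Here is a quick sketch of sum, all, and any working together with list comprehensions. The pass mark of 50 and the variable names are assumptions made for this illustration.

```python
scores = [61, 79, 98, 72]

total = sum(scores)
all_passed = all([score >= 50 for score in scores])     # is every score at least 50?
any_perfect = any([score == 100 for score in scores])   # is at least one score 100?

total, all_passed, any_perfect
```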
Input-output (I/O):
input: prompt user for input, returns the value the user entered as a string.
print(arg1, arg2, ...): display str(arg1), str(arg2), etc.
open(filepath,mode): open the file at filepath for mode-operations. Use mode="r" for reading text from a file, and mode="w" for writing text to a file.
Advanced stuff:
Functional programming: map(), eval(), exec()
Meta-programming: hasattr(), getattr(), setattr()
Object oriented programming: isinstance(), issubclass(), super()
Python punctuation#
The most confusing part of learning Python is the use of non-word punctuation characters, which have very specific meaning that has nothing to do with English punctuation marks.
Let’s review how the symbols =([{*"'#,.:
are used in various Python expressions.
The meaning of these symbols changes depending on the context.
Here is a complete, no-holds-barred list of how punctuation marks are used in the Python programming language. Like literally each of them. This list of symbol uses will help us close the review, since it covers the Python syntax that was used in all the sections of this tutorial.
Equal sign = is used for:
assignment
specifying default values for keyword arguments (in function definitions)
passing values for keyword arguments (in function calls)
Round brackets () are used for:
calling functions
defining tuples: (1, 2, 3)
enforcing operation precedence: result = (x + y) * z
defining functions (in combination with the def keyword, e.g. def f(x): ...)
defining classes
creating objects
Curly-brackets (accolades) {} are used for:
defining dict literals: mydict = {"k1":"v1", "k2":"v2"}
defining sets: {1,2,3}
Square brackets [] are used for:
defining lists: mylist = [1, 2, 3]
list indexing: ages[3] = 29
dict access by key: mydict["k1"] (calls the __getitem__ or __setitem__ method behind the scenes)
list slice: mylist[0:2] (first two items in mylist)
Quotes " and ' are used for:
defining string literals (note that a raw string variant r"..." also exists)
Triple quotes """ and ''' are used for:
long string literals that span multiple lines, e.g. entire paragraphs
Hash symbol # is used for:
comments (Python ignores the text after the # symbol until the end of the line)
Colon : is used for:
marking the beginning of an indented block. The colon is used at the end of statements like if, elif, else, for, etc.
key: value separator in dict literals
slice of indices 0:2 (first two items in a list)
Period . is used for:
decimal separator for floating point literals
accessing object attributes
accessing object methods
Comma , is used for:
element separator in lists and tuples
separator between key: value pairs when creating a dict
separating function arguments in function definitions
separating function arguments when calling functions
Asterisk * is used for:
multiplication operator
(advanced) unpacking the elements of a list
Double asterisk ** is used for:
exponent operator
(advanced) unpacking the elements of a dict
Semicolon ; (rarely used) is used for:
(advanced) putting multiple Python commands on a single line
Don’t worry if you didn’t understand all the use cases listed. I’ve tried to make the list complete, so I’ve included some more advanced topics, labeled (advanced), which you’ll learn about over time when you use Python.
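To see many of these punctuation marks working together, here is a short sketch that uses most of the symbols from the list. The function power and the variables are made up for this illustration, and the comments point out which symbol is doing what on each line.

```python
def power(base, exponent=2):        # () , : and = for a default value
    """Return base raised to the exponent."""   # triple-quoted string
    return base ** exponent         # ** is the exponent operator

result = (1 + 2) * 3                # () enforce precedence, * multiplies
ages = [25, 31, 47, 29]             # [] define a list, commas separate items
first_two = ages[0:2]               # [] and : take a slice
info = {"name": "Sam", "age": 31}   # {} define a dict, : separates key and value
name = info["name"]                 # [] access a dict value by key
shout = name.upper()                # . accesses a method of the string
cubed = power(2, exponent=3)        # = passes a keyword argument

# (advanced) * unpacks a list and ** unpacks a dict into function arguments
args = [2, 5]
kwargs = {"base": 3, "exponent": 2}
print(power(*args))      # 32
print(power(**kwargs))   # 9

x = 1; y = 2             # ; puts two commands on one line
print(result, first_two, shout, cubed, x + y)
```

Try changing the values and the brackets in this example to check your understanding of each symbol.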
Discussion#
Let’s go over some of the things we skipped in the tutorial, because they were not essential for getting started. Now that you know a little bit about Python, it’s worth mentioning some of these details, since it’s useful context to see how this “Python calculator” business works. I also want to tell you about some of the cool Python applications you can look forward to if you choose to develop your Python skills further.
Applications#
Python is not just a calculator. Python can also be used for non-interactive programs and services. Python is a general-purpose programming language so it enables a lot of applications. The list below talks about some areas where Python programming is popular.
command line scripts: you can write scripts in Python, then run them on the command line (the terminal on UNIX or cmd.exe on Windows). For example, you can download a video from YouTube by running the command youtube-dl <youtube_url>. If all you want is the audio, you can use some command-line options and run youtube-dl --extract-audio --audio-format mp3 <youtube_url> to extract the audio track from the YouTube video and save it as an mp3. The author uses this type of command daily to make local copies of songs to listen to them offline.
graphical user interface (GUI) programs: many desktop applications are written in Python. An example of a graphical, point-and-click application written in Python is Calibre, which is a powerful eBook management library, eBook reader, and eBook converter that supports most common eBook formats.
web applications: the Django and Flask frameworks are often used to build web applications. Many of the websites you access every day have a server component written in Python.
machine learning systems: create task-specific functions by using probabilistic models instead of code. Machine learning models undergo a training stage in which the model parameters are “learned” from the training data examples, after which the model can be queried to make predictions.
I mention these examples so you’ll know the other possibilities enabled by Python, beyond the basic “use Python interactively like a calculator” code examples that we saw in this tutorial.
There is a lot of other useful stuff. We’re at the end of this tutorial, but just at the beginning of your journey to discover all the interesting things you can do with Python.
Python programming#
Coding (a.k.a. programming, software engineering, or software development) is a broad topic, which is out of scope for this short tutorial. If you’re interested in learning more about coding, see the article What is code? by Paul Ford. Think mobile apps, web apps, APIs, algorithms, CPUs, GPUs, TPUs, SysOps, etc. There is a lot to learn about the applications enabled by basic coding skills, which are almost as broadly useful as reading and writing skills.
Learning programming usually takes several years, but you don’t need to become a professional coder to start using Python for simple tasks, the same way you don’t need to become a professional author to use writing for everyday tasks. If you’ve reached this far in the tutorial, you know enough about basic Python to continue your journey.
In particular, you can read the other two tutorials that appear in the No Bullshit Guide to Statistics:
Pandas (see pandas_tutorial.ipynb)
Seaborn (see seaborn_tutorial.ipynb)
Learning objectives#
In this tutorial you’ll learn the following specific Python skills, which are required for probability and statistics calculations.
know how to define a function (e.g. fH for Example 3 in Section 2.1)
understand list comprehensions (e.g. [fH(h) for h in range(0,5)] for Example 3 in Section 2.1)
know the built-in functions sum, len, and range
know the general pattern for plotting the graph of a function f using numpy arrays:
create the inputs xs using np.linspace, np.arange, or a list
compute the outputs ys = f(xs)
call plt.stem(ys) or sns.lineplot(x=xs, y=ys)
(optional) understand when we need to use vectorize(f)
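The first three skills in this list can be sketched in a few lines of code. Note the function fH below is a hypothetical stand-in that I made up for this illustration (the probability of getting h heads in four tosses of a fair coin). It is not necessarily the function from Example 3 in Section 2.1. I've left out the plotting step, since it requires numpy and a plotting library.

```python
from math import comb

def fH(h):
    """Hypothetical stand-in: probability of h heads in 4 fair coin tosses."""
    return comb(4, h) / 16

# list comprehension: evaluate fH for each h in 0, 1, 2, 3, 4
values = [fH(h) for h in range(0, 5)]

print(values)        # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(len(values))   # 5
print(sum(values))   # 1.0
```

The combination of range, a list comprehension, and sum is the Python equivalent of the summation notation from math.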
Robyn recommends the following concepts to be covered:
Quotes around text (strings)
Assigning variables
Interpreting error messages
help(method)
Syntax for writing and running functions
definitions/explanations for the following (all used in the DATA chapter): ‘None’, ‘0-based indexing’, ‘attributes’, ‘methods’, ‘object’, ‘instance’, ‘class’, ‘module’, ‘accessing columns as attributes’
Specific requirements from PROB chapter:
range and summation
Specific requirements from STATS chapter:
Specific requirements from LINEAR MODELS chapter:
Links#
I’ve collected the best resources I know of for learning more about Python.
Python cheatsheets#
Introductions and tutorials#
Python tutorial by Russell A. Poldrack: https://statsthinking21.github.io/statsthinking21-python/01-IntroductionToPython.html
Programming with Python by Software Carpentry team: https://swcarpentry.github.io/python-novice-inflammation/
Official Python tutorial: https://docs.python.org/3.10/tutorial/
Python glossary: https://docs.python.org/3.10/glossary.html#glossary
Nice tutorial: https://www.pythonlikeyoumeanit.com/
Python data structures: https://devopedia.org/python-data-structures
Further reading: rasbt/python_reference
https://walkintheforest.com/Content/Introduction+to+Python/🐍+Introduction+to+Python
Online tutorial: https://www.kaggle.com/learn/python
Complete list of all the Python builtins: https://treyhunner.com/2019/05/python-builtins-worth-learning/ (via https://news.ycombinator.com/item?id=30621552)
Video lectures from Python course by Chelsea Parlett Pelleriti: https://www.youtube.com/playlist?list=PLmxpwhh4FDm460ztGwXmIUcGmfNzf4NMW
Special topics#
Stats-related Python functions: https://www.statology.org/python-guides/
Python types (ints, floats, and bools): anthony-agbay/introduction-to-python
Python string operations: anthony-agbay/introduction-to-python
Scientific computing: https://devopedia.org/python-for-scientific-computing
Books#
Python book for beginners (discussed here): https://learnpythontherightway.com/
Object-Oriented Programming in Python: https://python-textbok.readthedocs.io/en/1.0/index.html
Incoming: