# Jupyter Notebook, `numpy` and `matplotlib`

In this notebook, we will begin to explore the tools that make up the Scientific Python (SciPy) "stack" or "ecosystem". The fundamental ones are the **IPython shell** or **Jupyter Notebook**, the **numpy** package for arrays, and the **matplotlib** scientific plotting package.

# Jupyter Notebook (formerly IPython Notebook)

This document is a Jupyter notebook: an interactive computational document that can be modified and executed in the Jupyter Notebook web application. It is a convenient way to produce computational workflows that mix rich text, code in Python (and other languages), graphics, and other multimedia.

The structure of a notebook document is based on **cells**, which are executed using `Shift-Enter` or `Ctrl-Enter`. Cells have different types, e.g. `markdown` for text, images and LaTeX formulas, and `code` for lines of code (and their graphical output) which can be executed within the cell. Each cell is in one of two modes. **Edit mode** allows you to type code or text into a cell and is indicated by a green cell border (press `Enter` to enter edit mode). **Command mode** binds the keyboard to notebook-level commands (type `h` for help on keyboard shortcuts) and is indicated by a grey cell border with a blue left margin (press `Esc` to enter command mode).

# Mathematical functions and data-structures: `numpy`

Let's start with a simple scientific problem: just visualizing some data, e.g. the function $\sin: \mathbb{R}\rightarrow[-1,1]$, $y=\sin(x)$, for $x\in[-2\pi,2\pi]$.

In [None]:
import numpy as np

Let's create some data by evaluating a function, `sin`.

To generate the $x$ values, we replace Python's built-in `range` with `np.arange` from `numpy`:

In [None]:
range?

In [None]:
help(np.arange)

In [None]:
np.arange(-2.*np.pi, 2.*np.pi, 0.1)  # pi lives in the numpy namespace, so it is written np.pi

In [None]:
x = np.arange(-2.*np.pi, 2.*np.pi, 0.1)

NB: We could import the functions that we will use with `from numpy import sin` etc., but for the purposes, at least, of this tutorial, it is more useful to keep track of where they live.

## What kind of object is `x`?

In [None]:
type(x)

This is a new type (class), defined in `numpy`, that represents an $n$-dimensional array (hence the name).

Like any object in Python, `x` has properties, i.e. internal variables that belong to it, and methods, i.e. functions that act on it. IPython allows us to discover these (*introspection*) using the `TAB` key, by typing `x.<TAB>`.

--------

**Exercise**: Before reading on, play around with `x`; try to find out information about it and do things with it by investigating its properties and methods.

---------

In [None]:
x.

Some examples of useful things we can do with `x`:

In [None]:
x.shape  # size in each of the dimensions of x; returns a tuple

In [None]:
x.sum()  # sum of the elements of x, perhaps a surprising result!

An alternative way of generating ranges of numbers is using `np.linspace`:

In [None]:
x = np.linspace(-2.*np.pi, 2.*np.pi, 31)

In [None]:
x.sum()  

This is more in line with our symmetry expectations!
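
The difference between the two sums comes from how the grids are built. The following is a minimal sketch comparing the two constructions used above: `np.arange` never includes the stop value, whereas `np.linspace` includes both endpoints by default. The variable names `xa` and `xl` are chosen here only for illustration.

```python
import numpy as np

# arange steps from the start value and never includes the stop value,
# so the grid is not symmetric about zero.
xa = np.arange(-2.0 * np.pi, 2.0 * np.pi, 0.1)

# linspace includes both endpoints, so the points pair up as (-x, +x).
xl = np.linspace(-2.0 * np.pi, 2.0 * np.pi, 31)

print(len(xa), xa[0], xa[-1])   # the last point falls short of 2*pi
print(len(xl), xl[0], xl[-1])   # the last point is exactly 2*pi
print(xa.sum(), xl.sum())       # clearly non-zero vs. zero up to rounding
```

With `linspace` you choose the number of points; with `arange` you choose the step, and the number of points follows from it.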

In [None]:
x

We now wish to evaluate the `sin` function at each of these values of $x$. Standard Python provides us with several ways of doing this. Two of the neatest are "functional" approaches: list comprehensions and the `map` function. [In Python 3, `map` no longer returns a list.]

In [None]:
s1 = [np.sin(xx) for xx in x]
s2 = map(np.sin, x)

However, we have now left the `numpy` world:

In [None]:
type(s1), type(s2)

Of course, it's easy to get back: we can pass a list to the `np.array` function to convert it into a `numpy` array:

In [None]:
s3 = np.array(s1);

However, this is inefficient, since we have created an intermediate copy of all the data.

Instead, `numpy` (and `scipy` etc.) are based around *vectorized* versions of functions. These operate on a `numpy` array and return a new array of the same type:

In [None]:
y = np.sin(x)
y

What is `np.sin`?

In [None]:
np.sin

It is a "ufunc", i.e. a "universal function", that knows how to operate both on floating-point numbers and on complete arrays.

The idea of `numpy` is that we should think in a vectorized way, always trying to operate in this way on complete arrays as a unit. In this way, we can forget about the internal implementation details, which, in fact, could be rather complicated these days, when multicore machines and GPUs are involved. We just specify the desired result, and let somebody else do the hard work of figuring out what's going on underneath to make it all happen!
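
To make the vectorized idea concrete, here is a small sketch showing that the same ufunc accepts a single number or a whole array, and that the vectorized result agrees with the element-by-element list comprehension. The names `s_scalar`, `s_vector` and `s_loop` are illustrative only.

```python
import numpy as np

x = np.linspace(-2.0 * np.pi, 2.0 * np.pi, 31)

# The same ufunc accepts a single number ...
s_scalar = np.sin(0.5)

# ... or a whole array, returning an array of the same shape.
s_vector = np.sin(x)

# The element-by-element version gives the same values, via a Python loop.
s_loop = np.array([np.sin(xx) for xx in x])

print(type(np.sin))                    # the ufunc type
print(s_vector.shape == x.shape)       # True
print(np.allclose(s_vector, s_loop))   # True
```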

## Arithmetic on vectors

`numpy` also, of course, provides the missing vector, matrix etc. functionality that does not come in standard Python.

At first glance, Python lists seem like they should work fine as vectors:

In [None]:
v = [3., 4., 5.]
v

But we quickly find out that that's not the case:

In [None]:
v + v

In [None]:
2 * v

In [None]:
3.5 * v

`numpy` is designed to provide this missing functionality via its `ndarray` type. The *constructor*, i.e. the function which creates objects of this type, is the function `np.array`. It accepts lists (and other iterables) and creates a version as an `ndarray`:

In [None]:
x = np.array( [1., 2., 3.] )  # careful with the order of the parentheses and the square brackets!
y = np.array( [3., 4, 5] )

In [None]:
print(x)

Arithmetic operations now just work:

In [None]:
x + y

In [None]:
(x + y).dtype

In [None]:
-3.5 * x
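
As a side-by-side summary of the cells above, the sketch below applies the same operators to a list and to an array built from it. The names `v` and `a` are used here only for the comparison.

```python
import numpy as np

v = [3., 4., 5.]
a = np.array(v)

print(v + v)     # list concatenation: six elements
print(a + a)     # elementwise sum: three elements
print(2 * v)     # list repetition
print(2 * a)     # elementwise scaling
print(3.5 * a)   # fine for arrays; 3.5 * v raises a TypeError for lists
```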

We can, for example, use complex numbers:

In [None]:
z = 3j * x
z

In [None]:
z.dtype  # "datatype"

We see that `numpy` arrays have an associated data type: all elements must be of the same type (unlike in standard Python lists). This is for efficiency purposes.
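
A short sketch of how the single data type is chosen when an array is created or combined with other values. The arrays below are made up for illustration; the exact integer type depends on the platform, so only its kind is checked.

```python
import numpy as np

a = np.array([1., 2., 3.])
b = np.array([3., 4, 5])      # the integers 4 and 5 are stored as floats
i = np.array([1, 2, 3])       # all integers: an integer dtype is chosen
m = np.array([1, 2.5])        # mixed input is promoted to float
z = 3j * a                    # a complex scalar promotes the result to complex

print(a.dtype, b.dtype, m.dtype)           # float64 float64 float64
print(np.issubdtype(i.dtype, np.integer))  # True
print(z.dtype)                             # complex128
```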

-----------

**Exercise**: Play around with arithmetic and mathematical operations on `numpy` arrays. See the Cheat sheet for numpy.

-----------

## Conclusion: What is `numpy`?

`numpy` thus can be thought of as providing two things:

- a numerical array type, suitable for scientific data (vectors, matrices, etc.)
- vectorized operations for operating on arrays

-----------

# Visualizing it: `matplotlib`

Now we have some data, that could have come from an experiment, or a social network, etc., stored in a `numpy` array. The next thing to do is - obviously - to visualize it to see what's going on.

The standard scientific graphing package, which forms part of the core SciPy stack, is `matplotlib`, so called because it was originally designed to have a MATLAB compatibility layer. This easy-to-use layer is called `pyplot`, and it is standard to import it with the name `plt`:

In [None]:
import matplotlib.pyplot as plt

-------

**Exercise**: Investigate the contents of the `pyplot` submodule. Type `plt.<tab>`.

-------

In [None]:
plt.

In order to include the graphical output in our Jupyter Notebook, issue one of the following IPython magic commands ("magics"):
`%matplotlib inline` (includes static images), or `%matplotlib notebook` (includes interactive images):

In [None]:
%matplotlib notebook

Generate some data for visualisation (recap) and make the plot:

In [None]:
import numpy as np
x = np.linspace(-2.*np.pi, 2.*np.pi, 101)
y = np.sin(x)

In [None]:
plt.figure(1)
figobj=plt.plot(x, y)

Several plot commands can be included in the same graph. Plot styles may be changed; axis labels and legends may be added.

In [None]:
from numpy import sin, cos

In [None]:
plt.figure(2)
plt.clf()
pltobj1=plt.plot(x, sin(x))
pltobj2=plt.plot(x, cos(x))

In [None]:
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), "ro--", label=r"$\cos(x)$")  # red points and dashed lines; raw string keeps the backslash
plt.xlabel("$x$", size=20)
plt.ylabel(r"$\sin(x), \cos(x)$", size=18)
lgnd1=plt.legend()

`matplotlib` includes a subset of LaTeX notation. The labels may also be processed by LaTeX if required.
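
Because LaTeX commands start with a backslash, labels containing them are safest written as Python raw strings (`r"..."`). The sketch below uses a hypothetical label, `$\theta$`, to show what can go wrong otherwise: in an ordinary string, `\t` is read as a TAB character.

```python
# In an ordinary string, "\t" is an escape sequence for a TAB character,
# so the LaTeX command is silently corrupted.
plain = "$\theta$"

# A raw string keeps the backslash, which is what the LaTeX parser needs.
raw = r"$\theta$"

print("\t" in plain)        # True: the label contains a TAB
print(raw == "$\\theta$")   # True: backslash preserved
```

Sequences such as `\s` or `\c` are not valid escapes, so they happen to survive in ordinary strings, but recent Python versions warn about them; raw strings avoid both problems.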

The graph may be saved using

In [None]:
plt.savefig("sincos.pdf")

In [None]:
%pwd

Help may be obtained in IPython using `?`. The documentation for `plot` includes full details of the options for point and line types, colours, etc.

In [None]:
plt.plot?

`plt.plot` accepts iterables (`numpy` arrays, Python lists, etc.) as its first two arguments, giving the $x$- and $y$-coordinates of the points to plot.
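
As a quick check that plain Python lists are accepted as coordinates, here is a minimal sketch with made-up data. The lists `xs` and `ys` and the label are illustrative only.

```python
import matplotlib.pyplot as plt

# Plain Python lists work as coordinates, just like numpy arrays.
xs = [0, 1, 2, 3]
ys = [0, 1, 4, 9]

plt.figure()
lines = plt.plot(xs, ys, "o-", label="squares")  # returns a list of line objects
plt.legend()

# The line object stores the data that were plotted.
print(list(lines[0].get_ydata()))
```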

-----
**Exercise**: Plot the exponential and logarithm functions, together with a dashed line showing $y=x$ to show the symmetry.

Use `plt.axis("equal")` to get the correct aspect ratio.

-----

## Conclusion: What is `matplotlib`?

`matplotlib`, then, provides
- high-quality 2D (and some 3D) graphs
- inline plotting in the Jupyter notebook
- publication-quality exported PDFs, SVGs, PNGs

-----------


# Adding interactivity with the Jupyter Notebook

Using interactive widgets brings a whole new dimension to scientific exploration and visualization.

We first import the widgets functionality:

In [None]:
import ipywidgets as widgets

The easiest way to use the new functionality is via the powerful `interact` command. This takes a Python function and provides widgets (sliders, buttons, etc.) to modify the arguments of the function:

In [None]:
def f(a):
    print(a**2)

In [None]:
wi1=widgets.interact(f, a=(-5, 5, 0.1))

As the slider is moved, the output changes accordingly.

By including a plot inside the function, we obtain interactive plots controlled by sliders:

In [None]:
def sincos(a, b):

    plt.figure(3)
    plt.cla()
    x = np.linspace(-2.*np.pi, 2.*np.pi, 101)
    plt.plot(x, a*np.sin(x+b), label=r"$a \sin(x + b)$")  # label matches the plotted function
    plt.plot(x, np.cos(x), "ro--", label=r"$\cos(x)$")  # red points and dashed lines
    plt.xlabel("$x$", size=20)
    plt.ylabel(r"$\sin(x), \cos(x)$", size=20)
    plt.legend()
    
    plt.xlim(-2.*np.pi, 2.*np.pi)
    plt.ylim(-2, 2)



In [None]:
sincos(1, 2)

In [None]:
wi3=widgets.interact(sincos, a=(-5, 5, 0.1), b=(-5, 5, 0.1))

--------

**Exercise**: Explore interactivity using your favourite mathematical functions.

--------

# Using data with `numpy`

Naturally, many applications of scientific computing do not manufacture data from functions or obtain it from numerical simulations, but rather use pre-existing data.

`numpy` has functions to save arrays to files and to load data from files into arrays. More extensive functionality is available in the `pandas` package.

Let's start by creating some artificial data with random noise from the `numpy.random` submodule of `numpy`:

In [None]:
import numpy.random as random

`random.rand()` generates a (pseudo-)random number between 0 and 1:

In [None]:
random.rand()

In [None]:
x = random.rand(5)
x

Giving it an argument generates a vector of that length.
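
Since the numbers are pseudo-random, they can be made reproducible by fixing the seed of the generator. The sketch below assumes the same legacy `numpy.random` interface used above; the seed value 42 is arbitrary.

```python
import numpy as np
import numpy.random as random

random.seed(42)           # fix the state of the global generator
first = random.rand(5)

random.seed(42)           # the same seed again ...
second = random.rand(5)   # ... gives the same five numbers

print(np.array_equal(first, second))           # True
print(first.min() >= 0.0, first.max() < 1.0)   # values lie in [0, 1)
```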

In [None]:
x = np.linspace(-5, 5, 15)
x += 0.5*random.rand(len(x))
y = x**2 + 5*random.rand(len(x))

One can also create matrices:

In [None]:
random.rand(4,4)

In [None]:
plt.figure(4)
plt.clf()
pltobj4=plt.plot(x, y, 'o')

## Matrices in `numpy`

To export the data, we need it in tabular form as a matrix, which is represented in `numpy` as a two-dimensional array. Let's combine the two arrays into a matrix by stacking them as columns. There's a special syntax to do this:

In [None]:
# np.concatenate( (x, y), axis=2 )

In [None]:
z = np.c_[x, y]
z.shape
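
For comparison, the sketch below builds the same two-column matrix in three ways, using small made-up arrays: the index-style `np.c_` syntax used above, and the ordinary functions `np.column_stack` and `np.stack`.

```python
import numpy as np

x = np.array([1., 2., 3.])
y = np.array([10., 20., 30.])

z = np.c_[x, y]                  # index-style syntax: square brackets, not a call
z2 = np.column_stack((x, y))     # ordinary function giving the same result
z3 = np.stack((x, y), axis=1)    # general form: join along a new second axis

print(z.shape)                   # (3, 2)
print(np.array_equal(z, z2), np.array_equal(z, z3))   # True True
```

`np.concatenate` alone is not enough here, because it joins arrays along an axis that already exists, and one-dimensional arrays have only one axis.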

### Saving a matrix into a text file

In [None]:
np.savetxt?

In [None]:
np.savetxt("noisy_quadratic.dat", z)

In [None]:
%cat noisy_quadratic.dat  # %cat is an IPython command to show the contents of a file

### Better formatting of the text file

`np.savetxt` enables the specification of a format string for output:

In [None]:
np.savetxt?

In [None]:
np.savetxt("noisy_quadratic.dat", z, fmt="%20.5f %20.5f")

In [None]:
%cat noisy_quadratic.dat
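
Beyond `fmt`, `np.savetxt` also takes `delimiter` and `header` arguments. The following is a sketch with a small made-up array, written to a temporary directory so that no existing file is touched; the file name `example.csv` is hypothetical.

```python
import os
import tempfile
import numpy as np

z = np.array([[1.0, 2.5], [2.0, 4.25], [3.0, 9.125]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")

    # One format applied to every column, a comma delimiter, and a header line.
    # savetxt prefixes the header with "# ", so loadtxt skips it when reading.
    np.savetxt(path, z, fmt="%.3f", delimiter=",", header="x,y")

    with open(path) as f:
        text = f.read()

    back = np.loadtxt(path, delimiter=",")

print(text)
print(np.allclose(back, z))   # True
```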

### Loading data from a text file

Similarly, `numpy` has a relatively simple function `loadtxt` for loading tabulated data:

In [None]:
np.loadtxt?

In [None]:
zz = np.loadtxt("noisy_quadratic.dat")

In [None]:
zz

Now we need to split the array `zz` up into columns. The syntax for accessing elements of an array is like that of a list in Python: the first element of an array `x` is `x[0]`, the second `x[1]`, etc.

In [None]:
x

In [None]:
x[3]

In [None]:
x[3:10]

In [None]:
z

In [None]:
z[3]

In [None]:
z[3:5]

In [None]:
z[3:5, 1]

In [None]:
z[:, 0]

----

**Exercise**: What do the elements of the array `zz` look like? Play with the standard colon (`:`) notation for ranges to access elements of the matrix. How can you extract the first column of the matrix?

----

Individual elements of the matrix are accessed via

In [None]:
zz[1, 0]

Thus `zz` behaves as if it were a mathematical matrix: the first entry specifies the row and the second the column, except that indexing always starts at 0.

Ranges of elements are specified as with Python lists, using `:`, available for specifying ranges in both directions:

In [None]:
zz[1:3, 0:2]

Ranges that span the whole direction may be specified just with the colon:

In [None]:
zz[1:3, :]
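
The indexing rules are easier to verify on a matrix whose entries are known in advance. The sketch below uses a small made-up integer matrix `m` rather than the noisy data.

```python
import numpy as np

m = np.arange(12).reshape(4, 3)   # 4 rows, 3 columns
# [[ 0  1  2]
#  [ 3  4  5]
#  [ 6  7  8]
#  [ 9 10 11]]

print(m[1, 0])        # row 1, column 0 -> 3
print(m[1:3, 0:2])    # rows 1 and 2, columns 0 and 1 -> [[3 4] [6 7]]
print(m[1:3, :])      # rows 1 and 2, all columns -> [[3 4 5] [6 7 8]]
```

As with Python lists, the end of a range is excluded: `1:3` selects indices 1 and 2.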

Thus, the first column is

In [None]:
zz[:, 0]

Thus finally, we can set

In [None]:
xx = zz[:, 0]
yy = zz[:, 1]

In fact, there is a simple trick to obtain this: we may transpose the matrix `zz`:

In [None]:
zz.T

In [None]:
zz.strides

In [None]:
zz.T.strides

and then use tuple unpacking:

In [None]:
xx, yy = zz.T
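
The `strides` cells above hint at why transposing is cheap. Here is a sketch on a made-up array of the same shape as the data (15 rows, 2 columns), assuming the default row-major layout and 8-byte floats: the transpose reuses the same memory and only swaps the strides.

```python
import numpy as np

a = np.zeros((15, 2), dtype=np.float64)   # 15 rows, 2 columns, 8 bytes per element

# strides = (bytes to the next row, bytes to the next column)
print(a.strides)      # (16, 8)
print(a.T.strides)    # (8, 16): the same memory, walked in the other order

# The transpose is a view: no data are copied.
print(np.shares_memory(a, a.T))   # True

# Unpacking iterates over the first axis, so a.T yields one column at a time.
col0, col1 = a.T
print(col0.shape, col1.shape)     # (15,) (15,)
```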


Furthermore, `np.loadtxt` has a keyword option that allows us to do all of this in one step:

In [None]:
xxx, yyy = np.loadtxt("noisy_quadratic.dat", unpack=True)
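
The sketch below illustrates `unpack=True` together with the related `usecols` option. To keep it self-contained, it reads from an in-memory text buffer with made-up numbers instead of the file on disk.

```python
import io
import numpy as np

# A small in-memory "file" with two columns (made-up numbers).
text = "0.0 1.0\n1.0 2.0\n2.0 5.0\n"

# unpack=True returns the transposed array, so each column can be unpacked.
xs, ys = np.loadtxt(io.StringIO(text), unpack=True)

# usecols selects specific columns; a single index gives a 1-D array.
only_y = np.loadtxt(io.StringIO(text), usecols=1)

print(xs)                            # [0. 1. 2.]
print(ys)                            # [1. 2. 5.]
print(np.array_equal(only_y, ys))    # True
```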

In [None]:
np.loadtxt?

----
**Exercise**: Load and plot some of your favourite data.

----

# Literate programming in the Jupyter Notebook

We can do **literate programming** by mixing textual commentary with our code and calculations. To do so, we use a lightweight **markup language**, called *Markdown*.

Markdown allows us to write *italics* using `*italics*` and **bold text** using `**bold text**`.

It allows
- bulleted lists, using `-` at the start of each line
- images, using HTML tags `<img src="image.png">`
- quoted passages, by using `>`: 


1. Numbered lists
2. A quote follows:

> All the world is a stage.  

Significantly for scientific computing, within Markdown cells, we may *use equations, written using (a subset of) LaTeX*, by enclosing the equations in `$` for inline maths, or `$$` for displayed equations:

The equation $y = x^2$ is very interesting. A more complicated one is
$$\frac{dx}{dt} = C e^{-x^2}.$$

In order to render these equations, the MathJax library is used. In older IPython releases, an online version was used by default, and a local copy could be installed by typing the following in a Python session (this helper may not be available in current installations):

    from IPython.external import mathjax
    mathjax.install_mathjax()

-----
**Exercise**: Write and illustrate a short segment on your favourite function(s).

-----

### Exporting your notebook document

When you save a Jupyter notebook document, it is saved in the current working directory (available with `%pwd`) with the extension `.ipynb`. This file is a plain text file, in a format called JSON ("JavaScript Object Notation"), which contains a *complete* record of *exactly* the contents of the notebook in your environment. This file may be saved in a version control system, sent to a colleague, etc.
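
Because the file is plain JSON, it can be inspected with Python's standard `json` module. The sketch below builds a hand-written, minimal notebook structure in the version 4 format and reads it back; the cell contents are invented for illustration, and the structure is not validated against the official schema.

```python
import json

# A hand-written minimal notebook (illustrative content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# A title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print(1 + 1)"]},
    ],
}

text = json.dumps(notebook, indent=1)   # the kind of text stored in an .ipynb file
loaded = json.loads(text)

# Each cell records its type and its source text.
for cell in loaded["cells"]:
    print(cell["cell_type"], "".join(cell["source"]))
```

The same approach works on a real notebook by replacing the `json.loads(text)` step with `json.load` on the opened `.ipynb` file.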

Another important feature of the Jupyter Notebook is the ability to export, or convert, the notebook to other formats, including LaTeX and PDF documents, HTML, reStructuredText, and even slide shows.

Several of these options are available from the `File | Download as` option from within the Notebook itself. For the other formats, and more control, we use the `jupyter nbconvert` command-line tool.

IPython provides access to the command line with the `!` operator. We can convert to a static HTML, e.g. suitable for a blog post:

In [None]:
!jupyter nbconvert practiceJupyterNumpyMatplotlib.ipynb --to html

In [None]:
!firefox practiceJupyterNumpyMatplotlib.html

There is also an option to generate a slideshow presentation using the `RISE` plugin based on `Reveal.js`. To use this option, choose the `Slideshow` option from the `Cell Toolbar`, and specify the types for each cell using the resulting popup boxes. Then press `Alt-R` to enter the presentation mode. Alternatively, the notebook can be converted to slides with `nbconvert`:

In [None]:
!jupyter nbconvert practiceJupyterNumpyMatplotlib.ipynb --to slides --post serve

## Conclusion: What is the Jupyter Notebook?

The Jupyter Notebook provides the necessary tools for a highly interactive dialogue with code, with mathematical commentary about the operations being performed.

In this way, it is a wonderful tool for documenting computational workflows.

-----
**Exercise**: Log your numerical experiments with `numpy` in a Jupyter notebook.

-----

# Summary

In this notebook, we have seen the basic usage of the fundamental components of the Scientific Python stack:

- The Jupyter Notebook
    - an environment for facilitating computational workflows
    - provides interactivity via widgets
    - allows us to export notebook documents in different formats
    
- `numpy`
    - provides vectors, matrices, and $n$-dimensional arrays
    - provides fast vectorized mathematical operations on arrays
    - easy input and output of data
    
- `matplotlib`
    - 2D (and 3D) interactive plots
    - many formatting options
    - publication-quality output
