{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "(sec:scientific:libraries)=\n", "# Scientific Libraries with Python\n", "\n", "Python is a versatile language with many advanced features that are useful when manipulating massive data (a common task in science). It is still a general-purpose language, however, which implies that scientific routines and functions cannot (and should not) be supported within its basic core. Nevertheless, there are many scientific libraries that extend the capabilities of Python to scientific applications in a natural way. One of the most used is NumPy. It introduces the array object as a generalization of Python nested lists. This new object has many linear algebra operations implemented as methods or attributes. These operations are implemented at a low level through fast, highly optimized algorithms. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[](sec:scientific:libraries)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Python+NumPy\n", "In this way, Python+NumPy can be seen as a framework to implement numerical code as linear algebra abstractions which replace the slow Python loops. The contrary is also true. In this framework, each algorithm should be designed to avoid Python loops like `for` or `while`.\n", "\n", "An ideal program implemented in Python+NumPy does not have explicit Python loops, but only linear algebra abstractions. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Extended Python+NumPy framework\n", "All the other high-level packages for scientific computing are designed to work around the linear algebra abstractions of the Python+NumPy framework:\n", "* Pandas adds labels to the NumPy arrays.\n", "* Matplotlib plotting library. 
It is used to visualize arrays in 1, 2 and 3 dimensions\n", "* SciPy, intended for manipulating NumPy arrays more efficiently and for providing additional numerical methods. \n", "\n", "\n", "Other, less used libraries like SymPy are intended for manipulating analytical expressions, i.e. they provide a CAS (Computer Algebra System).\n", "\n", "\n", "Installation of these libraries is usually easy. In most Linux distros you should find them in the official repositories.\n", "\n", "Avoid loops. Use abstractions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Official Pages\n", "\n", "See the official pages of the libraries for new versions, news and manuals.\n", "\n", "**NumPy:**\n", "\n", "[http://www.numpy.org/](http://www.numpy.org/)\n", "\n", "**SciPy**\n", "\n", "[http://www.scipy.org/](http://www.scipy.org/)\n", "\n", "**SymPy**\n", "\n", "[http://www.sympy.org/](http://www.sympy.org/)\n", "\n", "**Anaconda**\n", "\n", "[https://www.anaconda.com/](https://www.anaconda.com/)\n", "\n", "Anaconda is a self-contained Python distribution that integrates many standard scientific libraries with Python, along with some generic libraries like MongoDB.\n", "\n", "\n", "There are many different scientific libraries for Python with many different uses, even for very specific tasks. However, as we are interested in general numerical methods, we will focus only on NumPy and SciPy.\n", "\n", "The recommended way to import the different modules and to use their methods and attributes is usually summarized in _Cheat Sheets_. For scientific Python we recommend the ones prepared by [Data Camp](https://learn.datacamp.com/), which can be found [here](https://drive.google.com/drive/folders/1jt_fDBA8GneCVVH874Th5_491Jc6GYXJ?usp=sharing)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```{contents}\n", ":depth: 2\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- - - " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
\n", "\n", "## NumPy\n", "\n", "NumPy is the fundamental package for scientific computing with Python. It contains among other things:\n", "\n", "* a powerful N-dimensional array object\n", "* sophisticated (broadcasting) functions\n", "* tools for integrating C/C++ and Fortran code\n", "* useful linear algebra, Fourier transform, and random number capabilities\n", "\n", "Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.\n", "\n", "[NumPy Cheat Sheet](https://drive.google.com/file/d/19ISIYIR_0j9LAEUan23ircGDU0ygtQKq/view?usp=sharing) [PDF] " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Basic Use" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "tags": [] }, "source": [ "#### Importing and basic math\n", "\n", "NumPy can be imported in several different ways. 
Importing NumPy with the alias `np` is the recommended way" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The standard aliases (namespaces) for other Python modules are\n", "```python\n", "import numpy as np\n", "import pandas as pd\n", "import scipy as sp\n", "import matplotlib.pyplot as plt\n", "import numpy.linalg as la\n", "import math as m # real \n", "import cmath as cm # complex\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "If imported under the alias `np`, NumPy functions and attributes are accessed through that namespace.\n", "\n", "The basic functions of NumPy include the usual mathematical functions" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "36315.502674246636 3.9569963710708773 1.8055008581584002 3.1622776601683795\n" ] } ], "source": [ "print( np.exp(10.5), np.log(52.3), np.log10(63.9), np.sqrt(10.0) )" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-0.9589242746631385 -0.984687855794127 0.5235987755982989 1.373400766945016\n" ] } ], "source": [ "#Trigonometric functions\n", "print (np.sin(5.0), np.cos(9.6), np.arcsin(0.5), np.arctan(5))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The basic attributes include some important constants" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'The value of PI is 3.14'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"The value of PI is {:.2f}\".format( np.pi )" ] 
}, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The value of PI is 3.141593\n" ] } ], "source": [ "print ( \"The value of PI is {:.6f}\".format( np.pi ) )" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.141592653589793" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.pi" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The value of e is 2.718282\n" ] } ], "source": [ "print (f\"The value of e is {np.e:.6f}\" )" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "∞=inf\n" ] } ], "source": [ "print ( \"∞={}\".format( np.inf ) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which is greater than any other number:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "200000000000000000000000000000000000000000000000 < np.inf" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import sin # assumed import: the error below is the one raised by math.sin" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "must be real number, not list", "output_type": "error", "traceback": [ "----> 1 sin([2,3])\n", "TypeError: must be real number, not list" ] } ], "source": [ "sin([2,3])" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Lists vs NumPy arrays" ] }, { "cell_type": 
"markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Supported methods for lists**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "`x.append x.count x.extend x.index x.insert x.pop x.remove x.reverse x.sort`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Supported methods for NumPy arrays**\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "x.T x.clip x.dot x.item x.prod x.setfield x.take\n", "x.all x.compress x.dtype x.itemset x.ptp x.setflags x.tofile\n", "x.any x.conj x.dump x.itemsize x.put x.shape x.tolist\n", "x.argmax x.conjugate x.dumps x.max x.ravel x.size x.tostring\n", "x.argmin x.copy x.fill x.mean x.real x.sort x.trace\n", "x.argsort x.ctypes x.flags x.min x.repeat x.squeeze x.transpose\n", "x.astype x.cumprod x.flat x.nbytes x.reshape x.std x.var\n", "x.base x.cumsum x.flatten x.ndim x.resize x.strides x.view\n", "x.byteswap x.data x.getfield x.newbyteorder x.round x.sum \n", "x.choose x.diagonal x.imag x.nonzero x.searchsorted x.swapaxes\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "55" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array([1,2,3,4,5,6,7,8,9,10]).sum()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**In common**\n", "\n", "Lists and numpy arrays can both store any type of data" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1.2, 3.5, 1.9] [ 1.6 -2.6 6.9]\n" ] } ], "source": [ "x1 = [1.2, 3.5, 1.9]\n", "x2 = np.array([1.6, -2.6, 6.9])\n", "print( x1, x2 )" ] }, { "cell_type": "code", 
"execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "numpy.ndarray" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(x2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For lists, new elements can be added using append method" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[1.2, 3.5, 1.9, 5.9]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x1 = [1.2, 3.5, 1.9]\n", "x2=x1.copy() # To compare later with numpy\n", "x1.append(5.9)\n", "x1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For arrays, new elements can be added using append function of NumPy" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1.2, 3.5, 1.9, 5.9])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.append(x2,5.9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A list can be converted into a numpy array, but the internal data type is homogenized. In the following example to float " ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([1. , 3.4, 1. 
])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = [1,3.4,1.0]\n", "x = np.array(x)\n", "x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A NumPy array can also be converted back to a (homogenized) list" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.0, 3.4, 1.0]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(x)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Differences**\n", "\n", "Operator + for lists is overloaded for concatenating" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "+: [1, 2, 3, 3, 2, 1], or np.concatenate: [1 2 3 3 2 1]\n" ] } ], "source": [ "x1 = [1,2,3]\n", "x2 = [3,2,1]\n", "print (f'+: {x1+x2}, or np.concatenate: {np.concatenate((x1,x2))}' )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Operator + for numpy arrays is overloaded for adding" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4 4 4]\n" ] } ], "source": [ "x1 = np.array([1,2,3])\n", "x2 = np.array([3,2,1])\n", "print (x1+x2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the main arithmetic operators are overloaded for NumPy arrays as element-wise operations (lists do not support them in this way)\n", "\n", "* Multiplication" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# Float arrays used in the examples that follow\n", "x1 = np.array([1.2,4.8,6.9])\n", "x2 = np.array([2.6,2.8,1.1])" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 3.12 13.44 7.59]\n" ] } ], "source": [ "print(x1*x2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Division" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.46153846 1.71428571 6.27272727]\n" ] } ], "source": [ "print(x1/x2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Subtraction" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-1.4 2. 5.8]\n" ] } ], "source": [ "print(x1-x2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Power" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1.6064649 80.81192733 8.37016462]\n" ] } ], "source": [ "print(x1**x2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**NumPy arrays**\n", "\n", "_Summary of element-wise operations_: NumPy arrays support the basic arithmetic operations element by element" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Adding [3.8 7.6 8. ]\n", "Multiplication [ 3.12 13.44 7.59]\n", "Division [0.46153846 1.71428571 6.27272727]\n", "Subtraction [-1.4 2. 5.8]\n", "Power [ 1.6064649 80.81192733 8.37016462]\n" ] } ], "source": [ "x1 = np.array([1.2,4.8,6.9])\n", "x2 = np.array([2.6,2.8,1.1])\n", "\n", "print (\"Adding\", x1+x2)\n", "print (\"Multiplication\", x1*x2)\n", "print (\"Division\", x1/x2)\n", "print (\"Subtraction\", x1-x2)\n", "print (\"Power\", x1**x2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Note:** Matrices can be represented as NumPy arrays where each element is a row vector. Nevertheless, be careful when multiplying arrays: the operator `*` is overloaded so that corresponding elements are multiplied one by one, which is quite different from matrix multiplication.\n", "\n", "A detailed explanation will be given in the [chapter about linear algebra](https://restrepo.github.io/ComputationalMethods/material/linear-algebra.html)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A=\n", "[[1 2]\n", " [3 4]]\n", "B=\n", "[[4 3]\n", " [2 1]]\n", "A*B=\n", "[[4 6]\n", " [6 4]]\n" ] } ], "source": [ "A = np.array([[1,2],[3,4]])\n", "B = np.array([[4,3],[2,1]])\n", "print (f'A=\\n{A}') \n", "print (f'B=\\n{B}')\n", "print (f'A*B=\\n{A*B}') # Element-wise product, not matrix multiplication (that is A @ B)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "A matrix element can be accessed directly" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A[0,1]" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A[0][1]" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Scientific Python abstractions\n", "\n", "Abstractions in scientific Python refer to the use of high-level constructs or interfaces that simplify complex operations and allow users to focus on the concepts and results rather than the underlying implementation details. 
Examples of abstractions in scientific Python include NumPy arrays for efficient numerical operations, Pandas dataframes for tabular data manipulation, and Matplotlib for data visualization. Abstractions help to make scientific computing more accessible and efficient for users.\n", "\n", "All of them provide methods whose internal loops run in compiled languages that are faster than Python." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Examples of abstractions in NumPy that can be used to avoid for loops are:\n", "\n", "1. Vectorized operations: NumPy supports many element-wise mathematical operations that can be performed directly on entire arrays, without the need for explicit for loops. For example, to multiply every element of an array by a constant, you can simply use the `*` operator, like this: `my_array * 5`.\n", "1. Broadcasting: Broadcasting is a powerful feature in NumPy that allows arrays with different shapes to be used in arithmetic operations. Broadcasting can often be used as a more efficient alternative to for loops. For example, to add a scalar value to every row of a 2D array, you can simply add the scalar to the entire array, like this: `my_array + 5`.\n", "1. Masking and boolean indexing: NumPy provides powerful indexing capabilities that can be used to select subsets of an array based on some condition. For example, to select all elements of an array that are greater than 5, you can use a boolean mask, like this: `my_array[my_array > 5]`. This is often more efficient than using an explicit for loop to iterate over the array.\n", "1. Aggregation functions: NumPy provides many functions that can be used to compute summary statistics over an entire array, such as `np.sum()`, `np.mean()`, and `np.std()`. These functions abstract away the need for explicit for loops over the array.\n", "\n", "A more detailed explanation of the last two is given below." 
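, "\n", "\n", "The first two items can be sketched in a few lines. This is only an illustration: the name `my_array`, its values and the row `[10, 20, 30]` are assumptions made for the example.\n", "\n", "```python\n", "import numpy as np\n", "\n", "my_array = np.array([[1, 2, 3], [4, 5, 6]]) # hypothetical 2x3 array\n", "print(my_array * 5) # vectorized: every element is multiplied by 5\n", "print(my_array + 5) # broadcasting a scalar over the whole array\n", "print(my_array + np.array([10, 20, 30])) # broadcasting a row over each row\n", "```"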
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Masks\n", "A _mask_ is a boolean array with the same shape as the original array. It is used as an index in order to select only the elements that satisfy the condition represented by the mask.\n", "The mask is applied as a superposition of a logical array upon the original one. The output keeps only the elements where the mask is `True` " ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 4])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#It is possible to access elements of a numpy array using booleans\n", "x = np.array([1,2,3,4])\n", "y = np.array([False, False, True, True])\n", "x[y]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Automatic creation of masks" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([False, False, True, True])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x>2" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 4])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[x>2] # Automatic mask implementation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This is more concise and much faster than an explicit loop:" ] 
}, { "cell_type": "code", "execution_count": 31, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 4])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xfin=[]\n", "for i in x:\n", "    if i>2:\n", "        xfin.append(i)\n", "\n", "xfin=np.array(xfin)\n", "xfin" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[False False True False]\n", "[False True False True]\n", "[ True False False False]\n", "[False True True True]\n", "[False True True False]\n" ] } ], "source": [ "#Operators >, <, >=, <= and ==, != are also overloaded for numpy arrays\n", "x = np.array([0,5,8,0])\n", "y = np.array([0,6,5,1])\n", "print(x>y) \n", "print(x<y) \n", "print(x==y) \n", "print(x!=y) \n", "print(x>4) " ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([4, 6, 8, 4, 9, 6, 7])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Combining these features, we can perform searches and comparisons far more efficiently\n", "x = np.array([1,4,2,6,8,4,3,0,9,1,3,6,7])\n", "#A new array with the numbers greater than or equal to 4\n", "x[x>=4]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "NumPy has logical functions (`np.logical_and`, `np.logical_or`, `np.logical_not`), and the operators `&`, `|` and `~` act element-wise on boolean arrays.\n", "\n", "For `array([1,4,2,6,8,4,3,0,9,1,3,6,7])`:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 8, 0, 9, 1, 7])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[ (x>6) | (x<2) ]" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([4, 4, 3, 3])" ] }, "execution_count": 34, "metadata": 
{}, "output_type": "execute_result" } ], "source": [ "x[ np.logical_and(x>2, x<6) ]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "or" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([4, 4, 3, 3])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[ (x>2) & (x<6) ] # use | instead of & for a logical or" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The full mask can also be negated with `~`.\n", "\n", "For `array([1,4,2,6,8,4,3,0,9,1,3,6,7])`:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 6, 8, 0, 9, 1, 6, 7])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[~((x>2) & (x<6)) ]" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Aggregation functions \n", "Spreadsheet-like operations" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Maximum element 9\n", "Minimum element 0\n" ] }, { "data": { "text/plain": [ "('Mean value', 4.153846153846154)" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Native methods of numpy arrays allow basic quantities to be calculated\n", "x = np.array([1,4,2,6,8,4,3,0,9,1,3,6,7])\n", "#Maximum element\n", "print( \"Maximum element\", x.max() )\n", "#Minimum element\n", "print( \"Minimum element\", x.min())\n", "#Mean value\n", "\"Mean value\", x.mean()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Operations on the indices" ] }, { 
"cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 4 2 6 8 4 3 0 9 1 3 6 7]\n" ] } ], "source": [ "print(x)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sorted arguments [ 0 1 2 3 4 5 6 7 8 9 10 11 12]\n", "Sorted array [0 1 1 2 3 3 4 4 6 6 7 8 9]\n" ] } ], "source": [ "#Sorted arguments of the array\n", "print (\"Sorted arguments\", x.argsort() )\n", "#Sorted array\n", "print ( \"Sorted array\", x[x.argsort()])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Direct sorted" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 1, 2, 3, 3, 4, 4, 6, 6, 7, 8, 9])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.array([1,4,2,6,8,4,3,0,9,1,3,6,7])\n", "x.sort()\n", "x" ] }, { "cell_type": "raw", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "or" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 1, 2, 3, 3, 4, 4, 6, 6, 7, 8, 9])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.array([1,4,2,6,8,4,3,0,9,1,3,6,7])\n", "np.sort( x )" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Miscellaneous methods" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "NumPy includes many general purpose functions that complement the capabilities of python. 
We are especially interested here in functions for creating ordered arrays, for storing and loading data, and for building histograms, tasks that will be required continuously in the activities of the course." ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([1., 1., 1., 1., 1.])" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Create an array of 1's with a given size (even 2D sizes)\n", "x = np.ones(5)\n", "x" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0., 0., 0.],\n", " [0., 0., 0., 0., 0.]])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Create an array of zeros with a given size (even 2D sizes)\n", "x = np.zeros( (2,5) )\n", "x" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "30" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "x=np.array([4,7,4,6,9])\n", "x.sum()" ] }, { "cell_type": "markdown", "metadata": { "jp-MarkdownHeadingCollapsed": true, "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### Miscellaneous functions" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-3.14159265 -2.44346095 -1.74532925 -1.04719755 -0.34906585 0.34906585\n", " 1.04719755 1.74532925 2.44346095 3.14159265]\n" ] } ], "source": [ "#Create an array with a given range and a given number of points (both ends included)\n", "x = np.linspace( -np.pi, np.pi, 10 ) \n", "print (x)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For arrays that span more than one order of magnitude, use `np.logspace`, whose points are evenly spaced on a logarithmic scale" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([2.00000000e+00, 6.16155028e+00, 1.89823509e+01, 5.84803548e+01,\n", " 1.80164823e+02, 5.55047308e+02, 1.70997595e+03, 5.26805138e+03,\n", " 1.62296817e+04, 5.00000000e+04])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.logspace( np.log10(2), np.log10(50000),10 )" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([2.00000000e+00, 5.55733333e+03, 1.11126667e+04, 1.66680000e+04,\n", " 2.22233333e+04, 2.77786667e+04, 3.33340000e+04, 3.88893333e+04,\n", " 4.44446667e+04, 5.00000000e+04])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linspace( 2, 50000,10 )" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1. 1.2 1.4 1.6 1.8 2. 2.2 2.4 2.6 2.8 3. 3.2 3.4 3.6 3.8 4. 
4.2 4.4\n", " 4.6 4.8]\n" ] } ], "source": [ "#Create an array with a given range and a given step\n", "x = np.arange( 1, 5, 0.2 )\n", "print( x )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "#Using the function savetxt, it is possible to store data from a numpy array\n", "data = np.array([[3.2, 2.1],[3.1, 4.1]])\n", "np.savetxt( \"file.dat\", data, fmt=\"%1.5e %1.5e\" )" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "cat file.dat" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "#In the same way, using the function loadtxt it is possible to load external data files\n", "data = np.loadtxt(\"file.dat\")\n", "#Data is then a multidimensional array with the loaded data\n", "print( data )" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Other useful functions will be covered when needed during the course." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "source": [ "### List Slices\n", "Slices work on list-like objects like numpy arrays, and can also be used to change sub-parts of the list. 
" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([1., 2., 2., 1., 1., 1., 1., 6., 6., 8.])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x=np.ones(10)\n", "x[1:3]=2\n", "x[-1]=8\n", "x[-3:-1]=6\n", "x" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Reverse order (arrays do not have the `reverse()` method)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([8., 6., 6., 1., 1., 1., 1., 2., 2., 1.])" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[::-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pandas" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
\n", "\n", "See the [Pandas Chapter](https://restrepo.github.io/ComputationalMethods/material/Pandas.html) for details" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
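<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>even</th>\n", " <th>odd</th>\n", " </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n", " <th>0</th>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n",
" <tr>\n", " <th>1</th>\n", " <td>2</td>\n", " <td>3</td>\n", " </tr>\n",
" <tr>\n", " <th>2</th>\n", " <td>4</td>\n", " <td>5</td>\n", " </tr>\n",
" <tr>\n", " <th>3</th>\n", " <td>6</td>\n", " <td>7</td>\n", " </tr>\n",
" <tr>\n", " <th>4</th>\n", " <td>8</td>\n", " <td>9</td>\n", " </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>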
" ], "text/plain": [ " even odd\n", "0 0 1\n", "1 2 3\n", "2 4 5\n", "3 6 7\n", "4 8 9" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "numbers={\"even\": [0,2,4,6,8], # First key-list\n", " \"odd\" : [1,3,5,7,9] } # Second key-list\n", "\n", "pd.set_option('display.max_colwidth',200)\n", "df=pd.DataFrame(numbers)\n", "df" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In the previous DataFrame, the values of each column are stored as NumPy arrays, the basic NumPy object, which generalizes nested lists. A column is returned as a pandas Series, and its `values` attribute gives the underlying array:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 2\n", "2 4\n", "3 6\n", "4 8\n", "Name: even, dtype: int64" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.even" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([0, 2, 4, 6, 8])" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.even.values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SciPy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. It adds significant power to the interactive Python session by providing the user with high-level commands and classes for manipulating and visualizing data. With SciPy an interactive Python session becomes a data-processing and system-prototyping environment rivaling systems such as MATLAB, IDL, Octave, R-Lab, and SciLab.\n", "\n", "Some of the packages included with SciPy are:\n", "\n", "* Special functions (**scipy.special**)\n", "* Integration (**scipy.integrate**)\n", "* Optimization (**scipy.optimize**)\n", "* Interpolation (**scipy.interpolate**)\n", "* Fourier Transforms (**scipy.fft**; **scipy.fftpack** is the legacy interface)\n", "* Signal Processing (**scipy.signal**)\n", "* Linear Algebra (**scipy.linalg**)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Each of these packages must be imported separately.\n", "Almost all of the numerical methods that will be covered during the course can be found in SciPy. 
For example, to import the `integrate` package, use" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "import scipy.integrate as integ" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The `integrate` package then provides the following functions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# List the public names in scipy.integrate (the result depends on the installed SciPy version)\n", "print( [name for name in dir(integ) if not name.startswith('_')] )\n", "# Example tab-completion listing (names vary across SciPy versions):\n", "# integ.Tester integ.fixed_quad integ.odepack integ.quadrature integ.test\n", "# integ.complex_ode integ.newton_cotes integ.quad integ.romb integ.tplquad\n", "# integ.cumtrapz integ.ode integ.quad_explain integ.romberg integ.trapz\n", "# integ.dblquad integ.odeint integ.quadpack integ.simps integ.vode" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "In the following classes we will explore the options that SciPy offers for each of the methods covered." ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "217px", "width": "252px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "threshold": 4, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }