NumPy

Python library for numerical programming



Name: NumPy
Figure: Plot of y=sin(x) function, created with NumPy and Matplotlib libraries
Original author: Travis Oliphant
Developer: Community project
Initial release: As Numeric, later as NumPy
Programming language: Python, C
Operating system: Cross-platform
Genre: Numerical analysis
License: BSD

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. The predecessor of NumPy, Numeric, was originally created by Jim Hugunin with contributions from several other developers. In 2005, Travis Oliphant created NumPy by incorporating features of the competing Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many contributors. NumPy is fiscally sponsored by NumFOCUS.

History

matrix-sig

The Python programming language was not originally designed for numerical computing, but attracted the attention of the scientific and engineering community early on. In 1995 the special interest group (SIG) matrix-sig was founded with the aim of defining an array computing package; among its members was Python designer and maintainer Guido van Rossum, who extended Python's syntax (in particular the indexing syntax) to make array computing easier.

Numeric

An implementation of a matrix package was completed by Jim Fulton, then generalized by Jim Hugunin and called Numeric. Hugunin, a graduate student at the Massachusetts Institute of Technology (MIT), joined the Corporation for National Research Initiatives (CNRI) in 1997 to work on JPython, leaving Paul Dubois of Lawrence Livermore National Laboratory (LLNL) to take over as maintainer. Other early contributors include David Ascher, Konrad Hinsen and Travis Oliphant.

Numarray

A new package called Numarray was written as a more flexible replacement for Numeric. Like Numeric, it too is now deprecated.

There was a desire to get Numeric into the Python standard library, but Guido van Rossum decided that the code was not maintainable in its state at the time.

NumPy

In early 2005, NumPy developer Travis Oliphant wanted to unify the community around a single array package and ported Numarray's features to Numeric, releasing the result as NumPy 1.0 in 2006. This new project was part of SciPy. To avoid installing the large SciPy package just to get an array object, this new package was separated and called NumPy. Support for Python 3 was added in 2010 with NumPy version 1.5.0.

In 2011, PyPy started development on an implementation of the NumPy API for PyPy. As of 2023, it is not yet fully compatible with NumPy.

Features

NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents due to the absence of compiler optimization. NumPy addresses the slowness problem partly by providing multidimensional arrays and functions and operators that operate efficiently on arrays; using these requires rewriting some code, mostly inner loops, using NumPy.
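As a rough illustration of rewriting an inner loop with NumPy, the sketch below computes a sum of squares twice: once with an explicit Python loop and once with a single array operation. The function names are hypothetical and chosen only for this example; how much faster the array version is depends on the array size and the machine.

```python
import numpy as np


def sum_of_squares_loop(values: list[float]) -> float:
    """Pure-Python inner loop: every iteration runs in the bytecode interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values: np.ndarray) -> float:
    """Same computation expressed as one whole-array operation."""
    return float(np.dot(values, values))


data = np.arange(1, 6, dtype=np.float64)  # 1.0, 2.0, 3.0, 4.0, 5.0
loop_result = sum_of_squares_loop(data.tolist())
numpy_result = sum_of_squares_numpy(data)
print(loop_result, numpy_result)  # prints 55.0 55.0
```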

Using NumPy in Python gives functionality comparable to MATLAB, since both are interpreted.

Python bindings of the widely used computer vision library OpenCV use NumPy arrays to store and operate on data. Since images with multiple channels are simply represented as three-dimensional arrays, indexing, slicing or masking with other arrays are very efficient ways to access specific pixels of an image. Using the NumPy array as the universal data structure in OpenCV for images, extracted feature points, filter kernels and more greatly simplifies programming and debugging.
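The indexing, slicing and masking mentioned above can be sketched with NumPy alone, without OpenCV. The tiny synthetic image, its size, and the threshold of 100 are assumptions made for this example only.

```python
import numpy as np

# A synthetic 4x6 image with 3 channels (height, width, channel), 8-bit values.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, :, 2] = 200            # fill the last channel (red in OpenCV's BGR order)
image[1:3, 2:5] = (10, 20, 30)  # slicing: overwrite a 2x3 rectangular region

pixel = image[1, 2]             # indexing: one pixel, all three channels
bright = image[:, :, 2] > 100   # boolean mask, one entry per pixel
image[bright] = 255             # masking: set every selected pixel to white

print(pixel.tolist())           # prints [10, 20, 30]
print(int(bright.sum()))        # prints 18
```

The mask selects the 18 pixels outside the overwritten rectangle, so the pixel read at row 1, column 2 keeps its assigned values.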

Importantly, many NumPy operations release the global interpreter lock, which allows for multithreaded processing.
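A minimal sketch of using threads with NumPy follows. It assumes that the matrix product used here is one of the operations that releases the global interpreter lock; whether the threaded version is actually faster depends on the NumPy build, the underlying linear algebra library, and the hardware. The helper name gram_trace is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

rng = np.random.default_rng(0)
matrices = [rng.random((200, 200)) for _ in range(4)]


def gram_trace(m: np.ndarray) -> float:
    # The heavy work happens inside compiled code, during which
    # other Python threads may be allowed to run.
    return float(np.trace(m @ m.T))


# Run the same function on each matrix in a pool of worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(gram_trace, matrices))

# Reference results computed one after another in the main thread.
serial = [gram_trace(m) for m in matrices]
print(np.allclose(threaded, serial))  # prints True
```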

NumPy also provides a C API, which allows Python code to interoperate with external libraries written in low-level languages.

The ndarray data structure

The core of NumPy is its "ndarray" (n-dimensional array) data structure. These arrays are strided views on memory. In contrast to Python's built-in list data structure, these arrays are homogeneously typed: all elements of a single array must be of the same type.
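The following sketch inspects the element type and strides of a small array. The dtype is fixed to int64 here so that the stride values are the same on every platform; that choice is an assumption of the example, not a NumPy requirement.

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)
print(a.dtype, a.strides)      # prints int64 (32, 8): 8-byte items, 4 per row

t = a.T                        # the transpose is a view with swapped strides
print(t.strides)               # prints (8, 32)
print(np.shares_memory(a, t))  # prints True: no data was copied

mixed_list = [1, 2.5, 3]       # a list may mix element types
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)       # prints float64: one common type for all elements
```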

Such arrays can also be views into memory buffers allocated by C/C++, Python, and Fortran extensions to the CPython interpreter without the need to copy data around, giving a degree of compatibility with existing numerical libraries. This functionality is exploited by the SciPy package, which wraps a number of such libraries (notably BLAS and LAPACK). NumPy has built-in support for memory-mapped ndarrays.
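As a small sketch of both ideas, the code below first views a Python bytearray as an array without copying, then writes an array through a memory-mapped file. The file name data.bin and the temporary directory are placeholders chosen for this example.

```python
import os
import tempfile

import numpy as np

# View an existing Python buffer as an array; no data is copied.
buf = bytearray(8)
view = np.frombuffer(buf, dtype=np.uint8)
view[0] = 255                   # writing through the array changes the buffer
print(buf[0])                   # prints 255

# Memory-mapped array backed by a file on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    mm = np.memmap(path, dtype=np.float64, mode="w+", shape=(2, 3))
    mm[:] = 1.5
    mm.flush()                  # make sure the data reaches the file
    del mm                      # release the mapping before the file is removed
    reloaded = np.fromfile(path, dtype=np.float64).reshape(2, 3)

print(float(reloaded.sum()))    # prints 9.0
```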

Limitations

Inserting or appending entries to an array is not as trivial as it is with Python's lists. The np.pad routine for extending arrays actually creates a new array of the desired shape and padding values, copies the given array into it and returns it. Likewise, NumPy's np.concatenate operation does not link the two arrays but returns a new one, filled with the entries from both given arrays in sequence. Reshaping an array with np.reshape is only possible as long as the number of elements in the array does not change. These circumstances originate from the fact that NumPy's arrays must be views on contiguous memory buffers.
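These limitations can be sketched as follows, assuming the routines meant are np.pad, np.concatenate and reshape. Each call returns a new array rather than growing the original in place.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5])

padded = np.pad(a, (1, 2), constant_values=0)  # new array: [0 1 2 3 0 0]
joined = np.concatenate([a, b])                # new array: [1 2 3 4 5]
joined[0] = 99                                 # changes the copy only
print(a[0])                                    # prints 1

grid = np.arange(6).reshape(2, 3)              # 6 elements fit a 2x3 shape

reshape_error = None
try:
    np.arange(6).reshape(4, 2)                 # 6 elements cannot fill 4x2
except ValueError as exc:
    reshape_error = str(exc)
print(reshape_error is not None)               # prints True
```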

Algorithms that are not expressible as a vectorized operation will typically run slowly because they must be implemented in "pure Python", while vectorization may increase memory complexity of some operations from constant to linear, because temporary arrays must be created that are as large as the inputs. Runtime compilation of numerical code has been implemented by several groups to avoid these problems; open source solutions that interoperate with NumPy include numexpr and Numba. Cython and Pythran are static-compiling alternatives to these.
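The memory cost of temporaries can be illustrated with a small sketch. The expression x * y + 1.0 first builds an intermediate array for x * y; the second variant reuses one preallocated buffer through the out parameter of the ufuncs. This is only one possible way to limit temporaries, not the approach taken by numexpr or Numba.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
y = np.full(5, 2.0)

# Vectorized expression: x * y allocates a temporary array as large as the
# inputs before the addition produces the final result.
result = x * y + 1.0

# Same arithmetic with an explicit output buffer that is reused.
out = np.empty_like(x)
np.multiply(x, y, out=out)   # out now holds x * y
np.add(out, 1.0, out=out)    # add in place, no further allocation

print(np.array_equal(result, out))  # prints True
```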

Many modern large-scale scientific computing applications have requirements that exceed the capabilities of NumPy arrays. For example, NumPy arrays are usually loaded into a computer's memory, which might have insufficient capacity for the analysis of large datasets. Further, NumPy operations are executed on a single CPU, whereas many linear algebra operations can be accelerated by executing them on clusters of CPUs or on specialized hardware, such as GPUs and TPUs, which many deep learning applications rely on. As a result, several alternative array implementations have arisen in the scientific Python ecosystem in recent years, such as Dask for distributed arrays and TensorFlow or JAX for computations on GPUs. Because of NumPy's popularity, these often implement a subset of its API or mimic it, so that users can change their array implementation with minimal changes to their code. CuPy, a library accelerated by Nvidia's CUDA framework, has also shown potential for faster computing as a "drop-in replacement" for NumPy.

Examples

NumPy is conventionally imported as np.

import numpy as np
from numpy.typing import NDArray

a: NDArray[np.int_] = np.array([[1, 2, 3, 4], [3, 4, 6, 7], [5, 9, 0, 5]])
print(a.transpose())  # the 3x4 array viewed as a 4x3 array

Basic operations

import numpy as np
from numpy.typing import NDArray

a: NDArray[np.int_] = np.array([1, 2, 3, 6])
b: NDArray[np.float64] = np.linspace(0, 2, 4)  # four equally spaced points starting with 0 and ending with 2
c: NDArray[np.float64] = a - b
print(c)
# prints [1.         1.33333333 1.66666667 4.        ]
print(a ** 2)
# prints [ 1  4  9 36]

Universal functions

import numpy as np
from numpy.typing import NDArray

a: NDArray[np.float64] = np.linspace(-np.pi, np.pi, 100)
b: NDArray[np.float64] = np.sin(a)
c: NDArray[np.float64] = np.cos(a)

# Functions can take both numbers and arrays as parameters.
print(np.sin(1))
# prints 0.8414709848078965
print(np.sin(np.array([1, 2, 3])))
# prints [0.84147098 0.90929743 0.14112001]

Linear algebra

import numpy as np
from numpy.linalg import solve, inv
from numpy.random import rand
from numpy.typing import NDArray

a: NDArray[np.float64] = np.array([[1, 2, 3], [3, 4, 6.7], [5, 9.0, 5]])
print(a.transpose())
# prints:
# [[1.  3.  5. ]
#  [2.  4.  9. ]
#  [3.  6.7 5. ]]
print(inv(a))
# prints:
# [[-2.27683616  0.96045198  0.07909605]
#  [ 1.04519774 -0.56497175  0.1299435 ]
#  [ 0.39548023  0.05649718 -0.11299435]]
b: NDArray[np.int_] = np.array([3, 2, 1])
print(solve(a, b))  # solve the equation ax = b
# prints [-4.83050847  2.13559322  1.18644068]
c: NDArray[np.float64] = rand(3, 3) * 20  # create a 3x3 random matrix of values within [0, 1) scaled by 20
print(c)
# prints random values that differ on every run, for example:
# [[ 3.98732789  2.47702609  4.71167924]
#  [ 9.24410671  5.5240412  10.6468792 ]
#  [10.38136661  8.44968437 15.17639591]]
print(np.dot(a, c))  # matrix multiplication
# prints, for the example values of c shown above:
# [[ 53.61964114  38.8741616   71.53462537]
#  [118.4935668   86.14012835 158.40440712]
#  [155.04043289 104.3499231  195.26228855]]
print(a @ c)  # same product with the @ operator, available starting with Python 3.5 and NumPy 1.10
# prints the same matrix as np.dot(a, c)

Multidimensional arrays

import numpy as np
from numpy.typing import NDArray

M: NDArray[np.float64] = np.zeros(shape=(2, 3, 5, 7, 11))
T: NDArray[np.float64] = np.transpose(M, (4, 2, 1, 3, 0))  # reorder the five axes
print(T.shape)
# prints (11, 5, 3, 7, 2)

Incorporation with OpenCV

import cv2
import numpy as np
from numpy.typing import NDArray

# 256x256 pixel array with a horizontal gradient from 0 to 255 for the red color channel,
# stored as 8-bit unsigned integers so that it can be written as an 8-bit image
r: NDArray[np.uint8] = np.reshape(np.arange(256 * 256) % 256, (256, 256)).astype(np.uint8)
# array of the same size and type as r but filled with 0s for the green color channel
g: NDArray[np.uint8] = np.zeros_like(r)
# transposed r gives a vertical gradient for the blue color channel
b: NDArray[np.uint8] = r.T
# OpenCV images are interpreted as BGR; the depth-stacked array is written to an 8-bit RGB PNG file called "gradients.png"
print(cv2.imwrite("gradients.png", np.dstack([b, g, r])))
# prints True

Functional Python and vectorized NumPy version.

### Functional Python ###
from typing import Callable

points: list[list[int]] = [[9,2,8],[4,7,2],[3,4,4],[5,6,9],[5,0,7],[8,2,7],[0,3,2],[7,3,0],[6,1,1],[2,9,6]]
qPoint: list[int] = [4,5,3]
# Lambda function for calculating the Euclidean distance of two vectors
edistance: Callable[[list[float], list[float]], float] = lambda a, b: sum((a1 - b1) ** 2 for a1, b1 in zip(a, b)) ** 0.5 
# Compute all Euclidean distances and return the nearest point
nearest: list[int] = min((edistance(i, qPoint), i) for i in points)[1]
print(f"Nearest point to q: {nearest}")
# prints Nearest point to q: [3, 4, 4]

### Equivalent NumPy vectorization ###
import numpy as np
from numpy.typing import NDArray

points: NDArray[int] = np.array([[9,2,8],[4,7,2],[3,4,4],[5,6,9],[5,0,7],[8,2,7],[0,3,2],[7,3,0],[6,1,1],[2,9,6]])
qPoint: NDArray[int] = np.array([4,5,3])
# Compute all Euclidean distances at once and return the index of the smallest one
minIdx: int = int(np.argmin(np.linalg.norm(points - qPoint, axis=1)))
print(f"Nearest point to q: {points[minIdx]}")
# prints Nearest point to q: [3 4 4]

F2PY

F2PY quickly wraps native Fortran code so that it can be called from Python for faster scripts.

! Example of calling native Fortran code from Python
! Build with: f2py -c -m foo *.f90
! This compiles the Fortran source into a Python module named foo, using the intent statements
! to decide which arguments are inputs and which are returned
! This example wraps a subroutine rather than a function; easier than JNI with a C wrapper
! Requires gfortran and make
subroutine ftest(a, b, n, c, d)
  implicit none
  integer, intent(in)  :: a, b, n
  integer, intent(out) :: c, d
  integer :: i
  c = 0
  do i = 1, n
    c = a + b + c
  end do
  d = (c * n) * (-1)
end subroutine ftest

# Python side: import the compiled module and call the wrapped subroutine
import foo

a: tuple[int, int] = foo.ftest(1, 2, 3)  # the intent(out) values are returned as a tuple; c, d = foo.ftest(1, 2, 3) also works
print(a)
# prints (9, -27)
help("foo.ftest")
# prints the foo.ftest.__doc__

Info: Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
