.. _arrays.indexing:
Indexing
========
.. sectionauthor:: adapted from "Guide to Numpy" by Travis E. Oliphant
.. currentmodule:: numpy
.. index:: indexing, slicing
:class:`ndarrays ` can be indexed using the standard Python
``x[obj]`` syntax, where *x* is the array and *obj* the selection.
There are three kinds of indexing available: record access, basic
slicing, advanced indexing. Which one occurs depends on *obj*.
.. note::
In Python, ``x[(exp1, exp2, ..., expN)]`` is equivalent to
``x[exp1, exp2, ..., expN]``; the latter is just syntactic sugar
for the former.
Basic Slicing
-------------
Basic slicing extends Python's basic concept of slicing to N
dimensions. Basic slicing occurs when *obj* is a :class:`slice` object
(constructed by ``start:stop:step`` notation inside of brackets), an
integer, or a tuple of slice objects and integers. :const:`Ellipsis`
and :const:`newaxis` objects can be interspersed with these as
well. In order to remain backward compatible with a common usage in
Numeric, basic slicing is also initiated if the selection object is
any sequence (such as a :class:`list`) containing :class:`slice`
objects, the :const:`Ellipsis` object, or the :const:`newaxis` object,
but no integer arrays or other embedded sequences.
.. index::
triple: ndarray; special methods; getslice
triple: ndarray; special methods; setslice
single: ellipsis
single: newaxis
The simplest case of indexing with *N* integers returns an :ref:`array
scalar ` representing the corresponding item. As in
Python, all indices are zero-based: for the *i*-th index :math:`n_i`,
the valid range is :math:`0 \le n_i < d_i` where :math:`d_i` is the
*i*-th element of the shape of the array. Negative indices are
interpreted as counting from the end of the array (*i.e.*, if *i < 0*,
it means :math:`n_i + i`).
All arrays generated by basic slicing are always :term:`views `
of the original array.
The standard rules of sequence slicing apply to basic slicing on a
per-dimension basis (including using a step index). Some useful
concepts to remember include:
- The basic slice syntax is ``i:j:k`` where *i* is the starting index,
*j* is the stopping index, and *k* is the step (:math:`k\neq0`).
This selects the *m* elements (in the corresponding dimension) with
index values *i*, *i + k*, ..., *i + (m - 1) k* where
:math:`m = q + (r\neq0)` and *q* and *r* are the quotient and remainder
obtained by dividing *j - i* by *k*: *j - i = q k + r*, so that
*i + (m - 1) k < j*.
.. admonition:: Example
>>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> x[1:7:2]
array([1, 3, 5])
- Negative *i* and *j* are interpreted as *n + i* and *n + j* where
*n* is the number of elements in the corresponding dimension.
Negative *k* makes stepping go towards smaller indices.
.. admonition:: Example
>>> x[-2:10]
array([8, 9])
>>> x[-3:3:-1]
array([7, 6, 5, 4])
- Assume *n* is the number of elements in the dimension being
sliced. Then, if *i* is not given it defaults to 0 for *k > 0* and
*n* for *k < 0* . If *j* is not given it defaults to *n* for *k > 0*
and -1 for *k < 0* . If *k* is not given it defaults to 1. Note that
``::`` is the same as ``:`` and means select all indices along this
axis.
.. admonition:: Example
>>> x[5:]
array([5, 6, 7, 8, 9])
- If the number of objects in the selection tuple is less than
*N* , then ``:`` is assumed for any subsequent dimensions.
.. admonition:: Example
>>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
>>> x.shape
(2, 3, 1)
>>> x[1:2]
array([[[4],
[5],
[6]]])
- :const:`Ellipsis` expand to the number of ``:`` objects needed to
make a selection tuple of the same length as ``x.ndim``. Only the
first ellipsis is expanded, any others are interpreted as ``:``.
.. admonition:: Example
>>> x[...,0]
array([[1, 2, 3],
[4, 5, 6]])
- Each :const:`newaxis` object in the selection tuple serves to expand
the dimensions of the resulting selection by one unit-length
dimension. The added dimension is the position of the :const:`newaxis`
object in the selection tuple.
.. admonition:: Example
>>> x[:,np.newaxis,:,:].shape
(2, 1, 3, 1)
- An integer, *i*, returns the same values as ``i:i+1``
**except** the dimensionality of the returned object is reduced by
1. In particular, a selection tuple with the *p*-th
element an integer (and all other entries ``:``) returns the
corresponding sub-array with dimension *N - 1*. If *N = 1*
then the returned object is an array scalar. These objects are
explained in :ref:`arrays.scalars`.
- If the selection tuple has all entries ``:`` except the
*p*-th entry which is a slice object ``i:j:k``,
then the returned array has dimension *N* formed by
concatenating the sub-arrays returned by integer indexing of
elements *i*, *i+k*, ..., *i + (m - 1) k < j*,
- Basic slicing with more than one non-``:`` entry in the slicing
tuple, acts like repeated application of slicing using a single
non-``:`` entry, where the non-``:`` entries are successively taken
(with all other non-``:`` entries replaced by ``:``). Thus,
``x[ind1,...,ind2,:]`` acts like ``x[ind1][...,ind2,:]`` under basic
slicing.
.. warning:: The above is **not** true for advanced slicing.
- You may use slicing to set values in the array, but (unlike lists) you
can never grow the array. The size of the value to be set in
``x[obj] = value`` must be (broadcastable) to the same shape as
``x[obj]``.
.. index::
pair: ndarray; view
.. note::
Remember that a slicing tuple can always be constructed as *obj*
and used in the ``x[obj]`` notation. Slice objects can be used in
the construction in place of the ``[start:stop:step]``
notation. For example, ``x[1:10:5,::-1]`` can also be implemented
as ``obj = (slice(1,10,5), slice(None,None,-1)); x[obj]`` . This
can be useful for constructing generic code that works on arrays
of arbitrary dimension.
.. data:: newaxis
The :const:`newaxis` object can be used in the basic slicing syntax
discussed above. :const:`None` can also be used instead of
:const:`newaxis`.
Advanced indexing
-----------------
Advanced indexing is triggered when the selection object, *obj*, is a
non-tuple sequence object, an :class:`ndarray` (of data type integer or bool),
or a tuple with at least one sequence object or ndarray (of data type
integer or bool). There are two types of advanced indexing: integer
and Boolean.
Advanced indexing always returns a *copy* of the data (contrast with
basic slicing that returns a :term:`view`).
Integer
^^^^^^^
Integer indexing allows selection of arbitrary items in the array
based on their *N*-dimensional index. This kind of selection occurs
when advanced indexing is triggered and the selection object is not
an array of data type bool. For the discussion below, when the
selection object is not a tuple, it will be referred to as if it had
been promoted to a 1-tuple, which will be called the selection
tuple. The rules of advanced integer-style indexing are:
- If the length of the selection tuple is larger than *N* an error is raised.
- All sequences and scalars in the selection tuple are converted to
:class:`intp` indexing arrays.
- All selection tuple objects must be convertible to :class:`intp`
arrays, :class:`slice` objects, or the :const:`Ellipsis` object.
- The first :const:`Ellipsis` object will be expanded, and any other
:const:`Ellipsis` objects will be treated as full slice (``:``)
objects. The expanded :const:`Ellipsis` object is replaced with as
many full slice (``:``) objects as needed to make the length of the
selection tuple :math:`N`.
- If the selection tuple is smaller than *N*, then as many ``:``
objects as needed are added to the end of the selection tuple so
that the modified selection tuple has length *N*.
- All the integer indexing arrays must be :ref:`broadcastable
` to the same shape.
- The shape of the output (or the needed shape of the object to be used
for setting) is the broadcasted shape.
- After expanding any ellipses and filling out any missing ``:``
objects in the selection tuple, then let :math:`N_t` be the number
of indexing arrays, and let :math:`N_s = N - N_t` be the number of
slice objects. Note that :math:`N_t > 0` (or we wouldn't be doing
advanced integer indexing).
- If :math:`N_s = 0` then the *M*-dimensional result is constructed by
varying the index tuple ``(i_1, ..., i_M)`` over the range
of the result shape and for each value of the index tuple
``(ind_1, ..., ind_M)``::
result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M],
..., ind_N[i_1, ..., i_M]]
.. admonition:: Example
Suppose the shape of the broadcasted indexing arrays is 3-dimensional
and *N* is 2. Then the result is found by letting *i, j, k* run over
the shape found by broadcasting ``ind_1`` and ``ind_2``, and each
*i, j, k* yields::
result[i,j,k] = x[ind_1[i,j,k], ind_2[i,j,k]]
- If :math:`N_s > 0`, then partial indexing is done. This can be
somewhat mind-boggling to understand, but if you think in terms of
the shapes of the arrays involved, it can be easier to grasp what
happens. In simple cases (*i.e.* one indexing array and *N - 1* slice
objects) it does exactly what you would expect (concatenation of
repeated application of basic slicing). The rule for partial
indexing is that the shape of the result (or the interpreted shape
of the object to be used in setting) is the shape of *x* with the
indexed subspace replaced with the broadcasted indexing subspace. If
the index subspaces are right next to each other, then the
broadcasted indexing space directly replaces all of the indexed
subspaces in *x*. If the indexing subspaces are separated (by slice
objects), then the broadcasted indexing space is first, followed by
the sliced subspace of *x*.
.. admonition:: Example
Suppose ``x.shape`` is (10,20,30) and ``ind`` is a (2,3,4)-shaped
indexing :class:`intp` array, then ``result = x[...,ind,:]`` has
shape (10,2,3,4,30) because the (20,)-shaped subspace has been
replaced with a (2,3,4)-shaped broadcasted indexing subspace. If
we let *i, j, k* loop over the (2,3,4)-shaped subspace then
``result[...,i,j,k,:] = x[...,ind[i,j,k],:]``. This example
produces the same result as :meth:`x.take(ind, axis=-2) `.
.. admonition:: Example
Now let ``x.shape`` be (10,20,30,40,50) and suppose ``ind_1``
and ``ind_2`` are broadcastable to the shape (2,3,4). Then
``x[:,ind_1,ind_2]`` has shape (10,2,3,4,40,50) because the
(20,30)-shaped subspace from X has been replaced with the
(2,3,4) subspace from the indices. However,
``x[:,ind_1,:,ind_2]`` has shape (2,3,4,10,30,50) because there
is no unambiguous place to drop in the indexing subspace, thus
it is tacked-on to the beginning. It is always possible to use
:meth:`.transpose() ` to move the subspace
anywhere desired. (Note that this example cannot be replicated
using :func:`take`.)
Boolean
^^^^^^^
This advanced indexing occurs when obj is an array object of Boolean
type (such as may be returned from comparison operators). It is always
equivalent to (but faster than) ``x[obj.nonzero()]`` where, as
described above, :meth:`obj.nonzero() ` returns a
tuple (of length :attr:`obj.ndim `) of integer index
arrays showing the :const:`True` elements of *obj*.
The special case when ``obj.ndim == x.ndim`` is worth mentioning. In
this case ``x[obj]`` returns a 1-dimensional array filled with the
elements of *x* corresponding to the :const:`True` values of *obj*.
The search order will be C-style (last index varies the fastest). If
*obj* has :const:`True` values at entries that are outside of the
bounds of *x*, then an index error will be raised.
You can also use Boolean arrays as element of the selection tuple. In
such instances, they will always be interpreted as :meth:`nonzero(obj)
` and the equivalent integer indexing will be
done.
.. warning::
The definition of advanced indexing means that ``x[(1,2,3),]`` is
fundamentally different than ``x[(1,2,3)]``. The latter is
equivalent to ``x[1,2,3]`` which will trigger basic selection while
the former will trigger advanced indexing. Be sure to understand
why this is occurs.
Also recognize that ``x[[1,2,3]]`` will trigger advanced indexing,
whereas ``x[[1,2,slice(None)]]`` will trigger basic slicing.
.. note::
XXX: this section may need some tuning...
Also the above warning needs explanation as the last part is at odds
with the definition of basic indexing.
.. _arrays.indexing.rec:
Record Access
-------------
.. seealso:: :ref:`arrays.dtypes`, :ref:`arrays.scalars`
If the :class:`ndarray` object is a record array, *i.e.* its data type
is a :term:`record` data type, the :term:`fields ` of the array
can be accessed by indexing the array with strings, dictionary-like.
Indexing ``x['field-name']`` returns a new :term:`view` to the array,
which is of the same shape as *x* (except when the field is a
sub-array) but of data type ``x.dtype['field-name']`` and contains
only the part of the data in the specified field. Also record array
scalars can be "indexed" this way.
If the accessed field is a sub-array, the dimensions of the sub-array
are appended to the shape of the result.
.. admonition:: Example
>>> x = np.zeros((2,2), dtype=[('a', np.int32), ('b', np.float64, (3,3))])
>>> x['a'].shape
(2, 2)
>>> x['a'].dtype
dtype('int32')
>>> x['b'].shape
(2, 2, 3, 3)
>>> x['b'].dtype
dtype('float64')
Flat Iterator indexing
----------------------
:attr:`x.flat ` returns an iterator that will iterate
over the entire array (in C-contiguous style with the last index
varying the fastest). This iterator object can also be indexed using
basic slicing or advanced indexing as long as the selection object is
not a tuple. This should be clear from the fact that :attr:`x.flat
` is a 1-dimensional view. It can be used for integer
indexing with 1-dimensional C-style-flat indices. The shape of any
returned array is therefore the shape of the integer indexing object.
.. index::
single: indexing
single: ndarray