.. _arrays.indexing: Indexing ======== .. sectionauthor:: adapted from "Guide to Numpy" by Travis E. Oliphant .. currentmodule:: numpy .. index:: indexing, slicing :class:`ndarrays ` can be indexed using the standard Python ``x[obj]`` syntax, where *x* is the array and *obj* the selection. There are three kinds of indexing available: record access, basic slicing, advanced indexing. Which one occurs depends on *obj*. .. note:: In Python, ``x[(exp1, exp2, ..., expN)]`` is equivalent to ``x[exp1, exp2, ..., expN]``; the latter is just syntactic sugar for the former. Basic Slicing ------------- Basic slicing extends Python's basic concept of slicing to N dimensions. Basic slicing occurs when *obj* is a :class:`slice` object (constructed by ``start:stop:step`` notation inside of brackets), an integer, or a tuple of slice objects and integers. :const:`Ellipsis` and :const:`newaxis` objects can be interspersed with these as well. In order to remain backward compatible with a common usage in Numeric, basic slicing is also initiated if the selection object is any sequence (such as a :class:`list`) containing :class:`slice` objects, the :const:`Ellipsis` object, or the :const:`newaxis` object, but no integer arrays or other embedded sequences. .. index:: triple: ndarray; special methods; getslice triple: ndarray; special methods; setslice single: ellipsis single: newaxis The simplest case of indexing with *N* integers returns an :ref:`array scalar ` representing the corresponding item. As in Python, all indices are zero-based: for the *i*-th index :math:`n_i`, the valid range is :math:`0 \le n_i < d_i` where :math:`d_i` is the *i*-th element of the shape of the array. Negative indices are interpreted as counting from the end of the array (*i.e.*, if *i < 0*, it means :math:`n_i + i`). All arrays generated by basic slicing are always :term:`views ` of the original array. The standard rules of sequence slicing apply to basic slicing on a per-dimension basis (including using a step index). Some useful concepts to remember include: - The basic slice syntax is ``i:j:k`` where *i* is the starting index, *j* is the stopping index, and *k* is the step (:math:`k\neq0`). This selects the *m* elements (in the corresponding dimension) with index values *i*, *i + k*, ..., *i + (m - 1) k* where :math:`m = q + (r\neq0)` and *q* and *r* are the quotient and remainder obtained by dividing *j - i* by *k*: *j - i = q k + r*, so that *i + (m - 1) k < j*. .. admonition:: Example >>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) >>> x[1:7:2] array([1, 3, 5]) - Negative *i* and *j* are interpreted as *n + i* and *n + j* where *n* is the number of elements in the corresponding dimension. Negative *k* makes stepping go towards smaller indices. .. admonition:: Example >>> x[-2:10] array([8, 9]) >>> x[-3:3:-1] array([7, 6, 5, 4]) - Assume *n* is the number of elements in the dimension being sliced. Then, if *i* is not given it defaults to 0 for *k > 0* and *n* for *k < 0* . If *j* is not given it defaults to *n* for *k > 0* and -1 for *k < 0* . If *k* is not given it defaults to 1. Note that ``::`` is the same as ``:`` and means select all indices along this axis. .. admonition:: Example >>> x[5:] array([5, 6, 7, 8, 9]) - If the number of objects in the selection tuple is less than *N* , then ``:`` is assumed for any subsequent dimensions. .. admonition:: Example >>> x = np.array([[[1],[2],[3]], [[4],[5],[6]]]) >>> x.shape (2, 3, 1) >>> x[1:2] array([[[4], [5], [6]]]) - :const:`Ellipsis` expand to the number of ``:`` objects needed to make a selection tuple of the same length as ``x.ndim``. Only the first ellipsis is expanded, any others are interpreted as ``:``. .. admonition:: Example >>> x[...,0] array([[1, 2, 3], [4, 5, 6]]) - Each :const:`newaxis` object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension. The added dimension is the position of the :const:`newaxis` object in the selection tuple. .. admonition:: Example >>> x[:,np.newaxis,:,:].shape (2, 1, 3, 1) - An integer, *i*, returns the same values as ``i:i+1`` **except** the dimensionality of the returned object is reduced by 1. In particular, a selection tuple with the *p*-th element an integer (and all other entries ``:``) returns the corresponding sub-array with dimension *N - 1*. If *N = 1* then the returned object is an array scalar. These objects are explained in :ref:`arrays.scalars`. - If the selection tuple has all entries ``:`` except the *p*-th entry which is a slice object ``i:j:k``, then the returned array has dimension *N* formed by concatenating the sub-arrays returned by integer indexing of elements *i*, *i+k*, ..., *i + (m - 1) k < j*, - Basic slicing with more than one non-``:`` entry in the slicing tuple, acts like repeated application of slicing using a single non-``:`` entry, where the non-``:`` entries are successively taken (with all other non-``:`` entries replaced by ``:``). Thus, ``x[ind1,...,ind2,:]`` acts like ``x[ind1][...,ind2,:]`` under basic slicing. .. warning:: The above is **not** true for advanced slicing. - You may use slicing to set values in the array, but (unlike lists) you can never grow the array. The size of the value to be set in ``x[obj] = value`` must be (broadcastable) to the same shape as ``x[obj]``. .. index:: pair: ndarray; view .. note:: Remember that a slicing tuple can always be constructed as *obj* and used in the ``x[obj]`` notation. Slice objects can be used in the construction in place of the ``[start:stop:step]`` notation. For example, ``x[1:10:5,::-1]`` can also be implemented as ``obj = (slice(1,10,5), slice(None,None,-1)); x[obj]`` . This can be useful for constructing generic code that works on arrays of arbitrary dimension. .. data:: newaxis The :const:`newaxis` object can be used in the basic slicing syntax discussed above. :const:`None` can also be used instead of :const:`newaxis`. Advanced indexing ----------------- Advanced indexing is triggered when the selection object, *obj*, is a non-tuple sequence object, an :class:`ndarray` (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean. Advanced indexing always returns a *copy* of the data (contrast with basic slicing that returns a :term:`view`). Integer ^^^^^^^ Integer indexing allows selection of arbitrary items in the array based on their *N*-dimensional index. This kind of selection occurs when advanced indexing is triggered and the selection object is not an array of data type bool. For the discussion below, when the selection object is not a tuple, it will be referred to as if it had been promoted to a 1-tuple, which will be called the selection tuple. The rules of advanced integer-style indexing are: - If the length of the selection tuple is larger than *N* an error is raised. - All sequences and scalars in the selection tuple are converted to :class:`intp` indexing arrays. - All selection tuple objects must be convertible to :class:`intp` arrays, :class:`slice` objects, or the :const:`Ellipsis` object. - The first :const:`Ellipsis` object will be expanded, and any other :const:`Ellipsis` objects will be treated as full slice (``:``) objects. The expanded :const:`Ellipsis` object is replaced with as many full slice (``:``) objects as needed to make the length of the selection tuple :math:`N`. - If the selection tuple is smaller than *N*, then as many ``:`` objects as needed are added to the end of the selection tuple so that the modified selection tuple has length *N*. - All the integer indexing arrays must be :ref:`broadcastable ` to the same shape. - The shape of the output (or the needed shape of the object to be used for setting) is the broadcasted shape. - After expanding any ellipses and filling out any missing ``:`` objects in the selection tuple, then let :math:`N_t` be the number of indexing arrays, and let :math:`N_s = N - N_t` be the number of slice objects. Note that :math:`N_t > 0` (or we wouldn't be doing advanced integer indexing). - If :math:`N_s = 0` then the *M*-dimensional result is constructed by varying the index tuple ``(i_1, ..., i_M)`` over the range of the result shape and for each value of the index tuple ``(ind_1, ..., ind_M)``:: result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M], ..., ind_N[i_1, ..., i_M]] .. admonition:: Example Suppose the shape of the broadcasted indexing arrays is 3-dimensional and *N* is 2. Then the result is found by letting *i, j, k* run over the shape found by broadcasting ``ind_1`` and ``ind_2``, and each *i, j, k* yields:: result[i,j,k] = x[ind_1[i,j,k], ind_2[i,j,k]] - If :math:`N_s > 0`, then partial indexing is done. This can be somewhat mind-boggling to understand, but if you think in terms of the shapes of the arrays involved, it can be easier to grasp what happens. In simple cases (*i.e.* one indexing array and *N - 1* slice objects) it does exactly what you would expect (concatenation of repeated application of basic slicing). The rule for partial indexing is that the shape of the result (or the interpreted shape of the object to be used in setting) is the shape of *x* with the indexed subspace replaced with the broadcasted indexing subspace. If the index subspaces are right next to each other, then the broadcasted indexing space directly replaces all of the indexed subspaces in *x*. If the indexing subspaces are separated (by slice objects), then the broadcasted indexing space is first, followed by the sliced subspace of *x*. .. admonition:: Example Suppose ``x.shape`` is (10,20,30) and ``ind`` is a (2,3,4)-shaped indexing :class:`intp` array, then ``result = x[...,ind,:]`` has shape (10,2,3,4,30) because the (20,)-shaped subspace has been replaced with a (2,3,4)-shaped broadcasted indexing subspace. If we let *i, j, k* loop over the (2,3,4)-shaped subspace then ``result[...,i,j,k,:] = x[...,ind[i,j,k],:]``. This example produces the same result as :meth:`x.take(ind, axis=-2) `. .. admonition:: Example Now let ``x.shape`` be (10,20,30,40,50) and suppose ``ind_1`` and ``ind_2`` are broadcastable to the shape (2,3,4). Then ``x[:,ind_1,ind_2]`` has shape (10,2,3,4,40,50) because the (20,30)-shaped subspace from X has been replaced with the (2,3,4) subspace from the indices. However, ``x[:,ind_1,:,ind_2]`` has shape (2,3,4,10,30,50) because there is no unambiguous place to drop in the indexing subspace, thus it is tacked-on to the beginning. It is always possible to use :meth:`.transpose() ` to move the subspace anywhere desired. (Note that this example cannot be replicated using :func:`take`.) Boolean ^^^^^^^ This advanced indexing occurs when obj is an array object of Boolean type (such as may be returned from comparison operators). It is always equivalent to (but faster than) ``x[obj.nonzero()]`` where, as described above, :meth:`obj.nonzero() ` returns a tuple (of length :attr:`obj.ndim `) of integer index arrays showing the :const:`True` elements of *obj*. The special case when ``obj.ndim == x.ndim`` is worth mentioning. In this case ``x[obj]`` returns a 1-dimensional array filled with the elements of *x* corresponding to the :const:`True` values of *obj*. The search order will be C-style (last index varies the fastest). If *obj* has :const:`True` values at entries that are outside of the bounds of *x*, then an index error will be raised. You can also use Boolean arrays as element of the selection tuple. In such instances, they will always be interpreted as :meth:`nonzero(obj) ` and the equivalent integer indexing will be done. .. warning:: The definition of advanced indexing means that ``x[(1,2,3),]`` is fundamentally different than ``x[(1,2,3)]``. The latter is equivalent to ``x[1,2,3]`` which will trigger basic selection while the former will trigger advanced indexing. Be sure to understand why this is occurs. Also recognize that ``x[[1,2,3]]`` will trigger advanced indexing, whereas ``x[[1,2,slice(None)]]`` will trigger basic slicing. .. _arrays.indexing.rec: Record Access ------------- .. seealso:: :ref:`arrays.dtypes`, :ref:`arrays.scalars` If the :class:`ndarray` object is a record array, *i.e.* its data type is a :term:`record` data type, the :term:`fields ` of the array can be accessed by indexing the array with strings, dictionary-like. Indexing ``x['field-name']`` returns a new :term:`view` to the array, which is of the same shape as *x* (except when the field is a sub-array) but of data type ``x.dtype['field-name']`` and contains only the part of the data in the specified field. Also record array scalars can be "indexed" this way. If the accessed field is a sub-array, the dimensions of the sub-array are appended to the shape of the result. .. admonition:: Example >>> x = np.zeros((2,2), dtype=[('a', np.int32), ('b', np.float64, (3,3))]) >>> x['a'].shape (2, 2) >>> x['a'].dtype dtype('int32') >>> x['b'].shape (2, 2, 3, 3) >>> x['b'].dtype dtype('float64') Flat Iterator indexing ---------------------- :attr:`x.flat ` returns an iterator that will iterate over the entire array (in C-contiguous style with the last index varying the fastest). This iterator object can also be indexed using basic slicing or advanced indexing as long as the selection object is not a tuple. This should be clear from the fact that :attr:`x.flat ` is a 1-dimensional view. It can be used for integer indexing with 1-dimensional C-style-flat indices. The shape of any returned array is therefore the shape of the integer indexing object. .. index:: single: indexing single: ndarray