csr_matrix#
- class scipy.sparse.csr_matrix(arg1, shape=None, dtype=None, copy=False)[source]#
Compressed Sparse Row matrix.
- This can be instantiated in several ways:
- csr_matrix(D)
where D is a 2-D ndarray
- csr_matrix(S)
with another sparse array or matrix S (equivalent to S.tocsr())
- csr_matrix((M, N), [dtype])
to construct an empty matrix with shape (M, N) dtype is optional, defaulting to dtype=’d’.
- csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])
where
data
,row_ind
andcol_ind
satisfy the relationshipa[row_ind[k], col_ind[k]] = data[k]
.- csr_matrix((data, indices, indptr), [shape=(M, N)])
is the standard CSR representation where the column indices for row i are stored in
indices[indptr[i]:indptr[i+1]]
and their corresponding values are stored indata[indptr[i]:indptr[i+1]]
. If the shape parameter is not supplied, the matrix dimensions are inferred from the index arrays.
- Attributes:
- dtypedtype
Data type of the matrix
shape
2-tupleShape of the matrix
- ndimint
Number of dimensions (this is always 2)
nnz
Number of stored values, including explicit zeros.
size
Number of stored values.
- data
CSR format data array of the matrix
- indices
CSR format index array of the matrix
- indptr
CSR format index pointer array of the matrix
has_sorted_indices
Whether the indices are sorted
has_canonical_format
Whether the array/matrix has sorted indices and no duplicates
T
Transpose.
Methods
__getitem__
(key)__len__
()__mul__
(other)arcsin
()Element-wise arcsin.
arcsinh
()Element-wise arcsinh.
arctan
()Element-wise arctan.
arctanh
()Element-wise arctanh.
argmax
([axis, out])Return indices of maximum elements along an axis.
argmin
([axis, out])Return indices of minimum elements along an axis.
asformat
(format[, copy])Return this array/matrix in the passed format.
asfptype
()Upcast matrix to a floating point format (if necessary)
astype
(dtype[, casting, copy])Cast the array/matrix elements to a specified type.
ceil
()Element-wise ceil.
check_format
([full_check])Check whether the array/matrix respects the CSR or CSC format.
conj
([copy])Element-wise complex conjugation.
conjugate
([copy])Element-wise complex conjugation.
copy
()Returns a copy of this array/matrix.
Number of non-zero entries, equivalent to
deg2rad
()Element-wise deg2rad.
diagonal
([k])Returns the kth diagonal of the array/matrix.
dot
(other)Ordinary dot product
Remove zero entries from the array/matrix
expm1
()Element-wise expm1.
floor
()Element-wise floor.
getH
()Return the Hermitian transpose of this matrix.
Get the shape of the matrix
getcol
(j)Returns a copy of column j of the matrix, as an (m x 1) sparse matrix (column vector).
Matrix storage format
Maximum number of elements to display when printed.
getnnz
([axis])Number of stored values, including explicit zeros.
getrow
(i)Returns a copy of row i of the matrix, as a (1 x n) sparse matrix (row vector).
log1p
()Element-wise log1p.
max
([axis, out])Return the maximum of the array/matrix or maximum along an axis.
maximum
(other)Element-wise maximum between this and another array/matrix.
mean
([axis, dtype, out])Compute the arithmetic mean along the specified axis.
min
([axis, out])Return the minimum of the array/matrix or maximum along an axis.
minimum
(other)Element-wise minimum between this and another array/matrix.
multiply
(other)Point-wise multiplication by array/matrix, vector, or scalar.
nanmax
([axis, out])Return the maximum of the array/matrix or maximum along an axis, ignoring any NaNs.
nanmin
([axis, out])Return the minimum of the array/matrix or minimum along an axis, ignoring any NaNs.
nonzero
()Nonzero indices of the array/matrix.
power
(n[, dtype])This function performs element-wise power.
prune
()Remove empty space after all non-zero elements.
rad2deg
()Element-wise rad2deg.
reshape
(self, shape[, order, copy])Gives a new shape to a sparse array/matrix without changing its data.
resize
(*shape)Resize the array/matrix in-place to dimensions given by
shape
rint
()Element-wise rint.
set_shape
(shape)Set the shape of the matrix in-place
setdiag
(values[, k])Set diagonal or off-diagonal elements of the array/matrix.
sign
()Element-wise sign.
sin
()Element-wise sin.
sinh
()Element-wise sinh.
Sort the indices of this array/matrix in place
Return a copy of this array/matrix with sorted indices
sqrt
()Element-wise sqrt.
sum
([axis, dtype, out])Sum the array/matrix elements over a given axis.
Eliminate duplicate entries by adding them together
tan
()Element-wise tan.
tanh
()Element-wise tanh.
toarray
([order, out])Return a dense ndarray representation of this sparse array/matrix.
tobsr
([blocksize, copy])Convert this array/matrix to Block Sparse Row format.
tocoo
([copy])Convert this array/matrix to COOrdinate format.
tocsc
([copy])Convert this array/matrix to Compressed Sparse Column format.
tocsr
([copy])Convert this array/matrix to Compressed Sparse Row format.
todense
([order, out])Return a dense representation of this sparse array/matrix.
todia
([copy])Convert this array/matrix to sparse DIAgonal format.
todok
([copy])Convert this array/matrix to Dictionary Of Keys format.
tolil
([copy])Convert this array/matrix to List of Lists format.
trace
([offset])Returns the sum along diagonals of the sparse array/matrix.
transpose
([axes, copy])Reverses the dimensions of the sparse array/matrix.
trunc
()Element-wise trunc.
Notes
Sparse matrices can be used in arithmetic operations: they support addition, subtraction, multiplication, division, and matrix power.
- Advantages of the CSR format
efficient arithmetic operations CSR + CSR, CSR * CSR, etc.
efficient row slicing
fast matrix vector products
- Disadvantages of the CSR format
slow column slicing operations (consider CSC)
changes to the sparsity structure are expensive (consider LIL or DOK)
- Canonical Format
Within each row, indices are sorted by column.
There are no duplicate entries.
Examples
>>> import numpy as np >>> from scipy.sparse import csr_matrix >>> csr_matrix((3, 4), dtype=np.int8).toarray() array([[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]], dtype=int8)
>>> row = np.array([0, 0, 1, 2, 2, 2]) >>> col = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
>>> indptr = np.array([0, 2, 3, 6]) >>> indices = np.array([0, 2, 2, 0, 1, 2]) >>> data = np.array([1, 2, 3, 4, 5, 6]) >>> csr_matrix((data, indices, indptr), shape=(3, 3)).toarray() array([[1, 0, 2], [0, 0, 3], [4, 5, 6]])
Duplicate entries are summed together:
>>> row = np.array([0, 1, 2, 0]) >>> col = np.array([0, 1, 1, 0]) >>> data = np.array([1, 2, 4, 8]) >>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray() array([[9, 0, 0], [0, 2, 0], [0, 4, 0]])
As an example of how to construct a CSR matrix incrementally, the following snippet builds a term-document matrix from texts:
>>> docs = [["hello", "world", "hello"], ["goodbye", "cruel", "world"]] >>> indptr = [0] >>> indices = [] >>> data = [] >>> vocabulary = {} >>> for d in docs: ... for term in d: ... index = vocabulary.setdefault(term, len(vocabulary)) ... indices.append(index) ... data.append(1) ... indptr.append(len(indices)) ... >>> csr_matrix((data, indices, indptr), dtype=int).toarray() array([[2, 1, 0, 0], [0, 1, 1, 1]])