SciPy

scipy.sparse.csgraph.maximum_bipartite_matching

scipy.sparse.csgraph.maximum_bipartite_matching(graph, perm_type='row')

Returns a matching of a bipartite graph whose cardinality is as least that of any given matching of the graph.

Parameters
graphsparse matrix

Input sparse in CSR format whose rows represent one partition of the graph and whose columns represent the other partition. An edge between two vertices is indicated by the corresponding entry in the matrix existing in its sparse representation.

perm_typestr, {‘row’, ‘column’}

Which partition to return the matching in terms of: If 'row', the function produces an array whose length is the number of columns in the input, and whose \(j\)’th element is the row matched to the \(j\)’th column. Conversely, if perm_type is 'column', this returns the columns matched to each row.

Returns
permndarray

A matching of the vertices in one of the two partitions. Unmatched vertices are represented by a -1 in the result.

Notes

This function implements the Hopcroft–Karp algorithm [1]. Its time complexity is \(O(\lvert E \rvert \sqrt{\lvert V \rvert})\), and its space complexity is linear in the number of rows. In practice, this asymmetry between rows and columns means that it can be more efficient to transpose the input if it contains more columns than rows.

By Konig’s theorem, the cardinality of the matching is also the number of vertices appearing in a minimum vertex cover of the graph.

Note that if the sparse representation contains explicit zeros, these are still counted as edges.

The implementation was changed in SciPy 1.4.0 to allow matching of general bipartite graphs, where previous versions would assume that a perfect matching existed. As such, code written against 1.4.0 will not necessarily work on older versions.

References

1

John E. Hopcroft and Richard M. Karp. “An n^{5 / 2} Algorithm for Maximum Matchings in Bipartite Graphs” In: SIAM Journal of Computing 2.4 (1973), pp. 225–231. <https://dx.doi.org/10.1137/0202019>.

Examples

>>> from scipy.sparse import csr_matrix
>>> from scipy.sparse.csgraph import maximum_bipartite_matching

As a simple example, consider a bipartite graph in which the partitions contain 2 and 3 elements respectively. Suppose that one partition contains vertices labelled 0 and 1, and that the other partition contains vertices labelled A, B, and C. Suppose that there are edges connecting 0 and C, 1 and A, and 1 and B. This graph would then be represented by the following sparse matrix:

>>> graph = csr_matrix([[0, 0, 1], [1, 1, 0]])

Here, the 1s could be anything, as long as they end up being stored as elements in the sparse matrix. We can now calculate maximum matchings as follows:

>>> print(maximum_bipartite_matching(graph, perm_type='column'))
[2 0]
>>> print(maximum_bipartite_matching(graph, perm_type='row'))
[ 1 -1  0]

The first output tells us that 1 and 2 are matched with C and A respectively, and the second output tells us that A, B, and C are matched with 1, nothing, and 0 respectively.

Note that explicit zeros are still converted to edges. This means that a different way to represent the above graph is by using the CSR structure directly as follows:

>>> data = [0, 0, 0]
>>> indices = [2, 0, 1]
>>> indptr = [0, 1, 3]
>>> graph = csr_matrix((data, indices, indptr))
>>> print(maximum_bipartite_matching(graph, perm_type='column'))
[2 0]
>>> print(maximum_bipartite_matching(graph, perm_type='row'))
[ 1 -1  0]

When one or both of the partitions are empty, the matching is empty as well:

>>> graph = csr_matrix((2, 0))
>>> print(maximum_bipartite_matching(graph, perm_type='column'))
[-1 -1]
>>> print(maximum_bipartite_matching(graph, perm_type='row'))
[]

When the input matrix is square, and the graph is known to admit a perfect matching, i.e. a matching with the property that every vertex in the graph belongs to some edge in the matching, then one can view the output as the permutation of rows (or columns) turning the input matrix into one with the property that all diagonal elements are non-empty:

>>> a = [[0, 1, 2, 0], [1, 0, 0, 1], [2, 0, 0, 3], [0, 1, 3, 0]]
>>> graph = csr_matrix(a)
>>> perm = maximum_bipartite_matching(graph, perm_type='row')
>>> print(graph[perm].toarray())
[[1 0 0 1]
 [0 1 2 0]
 [0 1 3 0]
 [2 0 0 3]]

Previous topic

scipy.sparse.csgraph.maximum_flow

Next topic

scipy.sparse.csgraph.structural_rank