scipy.stats.random_table
scipy.stats.random_table = <scipy.stats._multivariate.random_table_gen object>
Contingency tables from independent samples with fixed marginal sums.
This is the distribution of random tables with given row and column vector sums. This distribution represents the set of random tables under the null hypothesis that rows and columns are independent. It is used in hypothesis tests of independence.
Because of assumed independence, the expected frequency of each table element can be computed from the row and column sums, so that the distribution is completely determined by these two vectors.
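As a small illustration of the paragraph above, the expected table under independence can be computed by hand as the outer product of the two margin vectors divided by the grand total, and then compared with `random_table.mean`. The variable names are illustrative only.

```python
import numpy as np
from scipy.stats import random_table

row = [1, 5]      # row sums
col = [2, 3, 1]   # column sums

# Under independence, E[i, j] = row[i] * col[j] / N,
# where N is the grand total of the table.
n = np.sum(row)
expected = np.outer(row, col) / n

# The mean table reported by the distribution for the same margins.
mean_table = random_table.mean(row, col)

# True when the hand-computed and library tables agree.
agree = np.allclose(expected, mean_table)
```

The hand-computed table also reproduces the margins: its rows sum to `row` and its columns sum to `col`.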
- Parameters:
- row : array_like
Sum of table entries in each row.
- col : array_like
Sum of table entries in each column.
- seed : {None, int, np.random.RandomState, np.random.Generator}, optional
Used for drawing random variates. If seed is None, the RandomState singleton is used. If seed is an int, a new RandomState instance is used, seeded with seed. If seed is already a RandomState or Generator instance, then that object is used. Default is None.
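The following sketch shows one way the `seed` parameter might be used to get reproducible draws. It assumes that `seed` is passed as a keyword when calling `random_table(row, col, ...)`, and that a frozen distribution falls back to its own seeded state when `rvs` is called without `random_state`.

```python
import numpy as np
from scipy.stats import random_table

row = [1, 5]
col = [2, 3, 1]

# Two frozen distributions seeded with the same integer should
# produce the same first draw.
a = random_table(row, col, seed=123).rvs()
b = random_table(row, col, seed=123).rvs()

# The same idea with Generator instances built from a common seed.
c = random_table(row, col, seed=np.random.default_rng(7)).rvs()
d = random_table(row, col, seed=np.random.default_rng(7)).rvs()

same_int_seed = np.array_equal(a, b)
same_generator_seed = np.array_equal(c, d)
```

Passing a `Generator` is usually preferable in new code, since it keeps the random stream explicit rather than relying on the global `RandomState` singleton.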
Methods
logpmf(x)
Log-probability of table x occurring in the distribution.
pmf(x)
Probability of table x occurring in the distribution.
mean(row, col)
Mean table.
rvs(row, col, size=None, method=None, random_state=None)
Draw random tables with given row and column vector sums.
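A minimal sketch of `pmf` and `logpmf`, using a frozen distribution so that the margins are fixed up front. With row sums `[1, 5]` and column sums `[2, 3, 1]`, only three tables are possible, so they can be enumerated by hand and their probabilities should add up to one. The enumeration loop is specific to this small example and is not a general method.

```python
import numpy as np
from scipy.stats import random_table

row = [1, 5]
col = [2, 3, 1]
dist = random_table(row, col)

# The first row sums to 1, so its single count sits in one of the three
# columns; the second row is then fixed by the column sums.
tables = []
for j in range(len(col)):
    first = np.zeros(len(col), dtype=int)
    first[j] = 1
    tables.append(np.array([first, np.asarray(col) - first]))

probs = np.array([dist.pmf(t) for t in tables])
log_probs = np.array([dist.logpmf(t) for t in tables])

# Sum of probabilities over every table with these margins.
total = probs.sum()
```

Working with `logpmf` is generally safer for large tables, where the probabilities themselves can underflow.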
Notes
The row and column vectors must be one-dimensional and non-empty, and both must sum to the same value. They cannot contain negative or non-integer entries.
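The requirements on the margin vectors can be sketched as a standalone check. `check_margins` is a hypothetical helper written for illustration; it is not part of SciPy and does not reproduce the library's own validation or error messages.

```python
import numpy as np

def check_margins(row, col):
    """Hypothetical helper: True if row and col satisfy the stated rules."""
    r = np.asarray(row)
    c = np.asarray(col)
    for v in (r, c):
        # Must be one-dimensional and non-empty.
        if v.ndim != 1 or v.size == 0:
            return False
        # Entries must be non-negative and integer-valued.
        if np.any(v < 0) or np.any(v != np.floor(v)):
            return False
    # Both vectors must sum to the same total.
    return bool(r.sum() == c.sum())

valid = check_margins([1, 5], [2, 3, 1])
mismatched_totals = check_margins([1, 5], [2, 3, 2])
negative_entry = check_margins([-1, 7], [2, 3, 1])
fractional_entry = check_margins([1.5, 4.5], [2, 3, 1])
```

A check like this can be useful before sampling in a loop, so that invalid margins are caught once rather than on every call.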
Random elements from the distribution are generated with either Boyett’s [1] or Patefield’s [2] algorithm. Boyett’s algorithm has O(N) time and space complexity, where N is the total sum of entries in the table. Patefield’s algorithm has O(K log(N)) time complexity, where K is the number of cells in the table, and requires only a small, constant amount of work space. By default, the rvs method selects the faster algorithm based on the input, but you can choose one explicitly with the keyword method. Allowed values are “boyett” and “patefield”.
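As a rough illustration of the `method` keyword, the sketch below draws a few tables with each algorithm and checks that every sampled table reproduces the requested margins. The margins and sample size are arbitrary choices for the example.

```python
import numpy as np
from scipy.stats import random_table

row = [3, 4, 3]
col = [5, 5]
dist = random_table(row, col)

# Draw four tables with each of the two documented algorithms.
samples = {}
for method in ("boyett", "patefield"):
    samples[method] = dist.rvs(size=4, method=method, random_state=0)

# Every table, whichever algorithm produced it, must have the given
# row sums (last axis) and column sums (second-to-last axis).
margins_ok = all(
    np.all(t.sum(axis=-1) == row) and np.all(t.sum(axis=-2) == col)
    for t in samples.values()
)
```

The two algorithms sample from the same distribution, but they consume random numbers differently, so the same `random_state` is not expected to yield identical tables across methods.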
Added in version 1.10.0.
References
[1] Boyett, AS 144 Appl. Statist. 28 (1979) 329-332
[2] W.M. Patefield, AS 159 Appl. Statist. 30 (1981) 91-97
Examples
>>> from scipy.stats import random_table
>>> row = [1, 5]
>>> col = [2, 3, 1]
>>> random_table.mean(row, col)
array([[0.33333333, 0.5       , 0.16666667],
       [1.66666667, 2.5       , 0.83333333]])
Alternatively, the object may be called (as a function) to fix the row and column vector sums, returning a “frozen” distribution.
>>> dist = random_table(row, col)
>>> dist.rvs(random_state=123)
array([[1., 0., 0.],
       [1., 3., 1.]])
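As a further sketch, a frozen distribution can be sampled many times and the element-wise average of the draws compared with the exact mean table. The sample size of 2000 is an arbitrary choice, and the agreement is only approximate because it depends on the random draws.

```python
import numpy as np
from scipy.stats import random_table

row = [1, 5]
col = [2, 3, 1]
dist = random_table(row, col)

# Draw many tables at once; the leading axis indexes the draws.
draws = dist.rvs(size=2000, random_state=0)

# Average over draws and compare with the exact expectation.
empirical_mean = draws.mean(axis=0)
exact_mean = random_table.mean(row, col)
max_abs_error = np.max(np.abs(empirical_mean - exact_mean))
```

A small `max_abs_error` here is a quick sanity check that the sampler and the mean table describe the same distribution.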