Returns the cross entropy H(q, p) of the empirical distribution q of the data (with the given feature matrix fx) with respect to the model p. For discrete distributions this is defined as:
H(q, p) = - n^{-1} sum_{j=1}^n log p(x_j)
where x_j are the data elements assumed drawn from q whose features are given by the matrix fx = {f(x_j)}, j=1,...,n.
The ‘base’ argument specifies the base of the logarithm, which defaults to e.
This definition applies only to discrete distributions; it is not meaningful for continuous ones.
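As a rough illustration of the discrete formula, the following sketch computes H(q, p) directly from the model log-probabilities of a small sample. It is not this method's implementation: the helper name cross_entropy_from_log_probs and the toy distribution are assumptions made for the example, and it takes log p(x_j) as given rather than deriving it from the feature matrix fx.

```python
import numpy as np

def cross_entropy_from_log_probs(log_p_x, base=np.e):
    """Hypothetical helper: H(q, p) = -(1/n) * sum_j log p(x_j), in the given base."""
    log_p_x = np.asarray(log_p_x, dtype=float)
    # Average the natural-log probabilities, then change base:
    # log_b(y) = ln(y) / ln(b)
    return -np.mean(log_p_x) / np.log(base)

# Toy model p over three outcomes (assumed for illustration)
p = np.array([0.5, 0.25, 0.25])
# Observed sample of n = 4 outcomes, treated as draws from q
sample = np.array([0, 0, 1, 2])
# Natural-log model probability of each observation
log_p_x = np.log(p[sample])

h_nats = cross_entropy_from_log_probs(log_p_x)          # base e (default)
h_bits = cross_entropy_from_log_probs(log_p_x, base=2)  # base 2
```

With base=2 the result is expressed in bits, and with the default base e it is expressed in nats; the two differ only by the constant factor log(2).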