pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False) Compute a simple cross-tabulation of two (or more) factors. By default computes a frequency table of the factors unless an array of values and an aggregation function are passed
| Parameters: |
index : array-like, Series, or list of arrays/Series Values to group by in the rows columns : array-like, Series, or list of arrays/Series Values to group by in the columns values : array-like, optional Array of values to aggregate according to the factors. Requires aggfunc : function, optional If specified, requires rownames : sequence, default None If passed, must match number of row arrays passed colnames : sequence, default None If passed, must match number of column arrays passed margins : boolean, default False Add row/column margins (subtotals) dropna : boolean, default True Do not include columns whose entries are all NaN normalize : boolean, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False Normalize by dividing all values by the sum of values.
New in version 0.18.1. |
|---|---|
| Returns: |
crosstab : DataFrame |
Any Series passed will have their name attributes used unless row or column names for the cross-tabulation are specified
In the event that there aren’t overlapping indexes an empty DataFrame will be returned.
>>> a
array([foo, foo, foo, foo, bar, bar,
bar, bar, foo, foo, foo], dtype=object)
>>> b
array([one, one, one, two, one, one,
one, two, two, two, one], dtype=object)
>>> c
array([dull, dull, shiny, dull, dull, shiny,
shiny, dull, shiny, shiny, shiny], dtype=object)
>>> crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c']) b one two c dull shiny dull shiny a bar 1 2 1 0 foo 2 2 1 2
© 2011–2012 Lambda Foundry, Inc. and PyData Development Team
© 2008–2011 AQR Capital Management, LLC
© 2008–2014 the pandas development team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.crosstab.html