W3cubDocs

pandas.crosstab

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

Compute a simple cross-tabulation of two (or more) factors. By default computes a frequency table of the factors unless an array of values and an aggregation function are passed

Parameters:

Parameters:	index : array-like, Series, or list of arrays/Series Values to group by in the rows columns : array-like, Series, or list of arrays/Series Values to group by in the columns values : array-like, optional Array of values to aggregate according to the factors. Requires `aggfunc` be specified. aggfunc : function, optional If specified, requires `values` be specified as well rownames : sequence, default None If passed, must match number of row arrays passed colnames : sequence, default None If passed, must match number of column arrays passed margins : boolean, default False Add row/column margins (subtotals) dropna : boolean, default True Do not include columns whose entries are all NaN normalize : boolean, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False Normalize by dividing all values by the sum of values. If passed ‘all’ or `True`, will normalize over all values. If passed ‘index’ will normalize over each row. If passed ‘columns’ will normalize over each column. If margins is `True`, will also normalize margin values. New in version 0.18.1.
Returns:	crosstab : DataFrame

index : array-like, Series, or list of arrays/Series

Values to group by in the rows

columns : array-like, Series, or list of arrays/Series

Values to group by in the columns

values : array-like, optional

Array of values to aggregate according to the factors. Requires aggfunc be specified.

aggfunc : function, optional

If specified, requires values be specified as well

rownames : sequence, default None

If passed, must match number of row arrays passed

colnames : sequence, default None

If passed, must match number of column arrays passed

margins : boolean, default False

Add row/column margins (subtotals)

dropna : boolean, default True

Do not include columns whose entries are all NaN

normalize : boolean, {‘all’, ‘index’, ‘columns’}, or {0,1}, default False

Normalize by dividing all values by the sum of values.

If passed ‘all’ or True, will normalize over all values.
If passed ‘index’ will normalize over each row.
If passed ‘columns’ will normalize over each column.
If margins is True, will also normalize margin values.

New in version 0.18.1.

Returns:

crosstab : DataFrame

Notes

Any Series passed will have their name attributes used unless row or column names for the cross-tabulation are specified

In the event that there aren’t overlapping indexes an empty DataFrame will be returned.

Examples

>>> a
array([foo, foo, foo, foo, bar, bar,
       bar, bar, foo, foo, foo], dtype=object)
>>> b
array([one, one, one, two, one, one,
       one, two, two, two, one], dtype=object)
>>> c
array([dull, dull, shiny, dull, dull, shiny,
       shiny, dull, shiny, shiny, shiny], dtype=object)

>>> crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
b    one          two
c    dull  shiny  dull  shiny
a
bar  1     2      1     0
foo  2     2      1     2

© 2011–2012 Lambda Foundry, Inc. and PyData Development Team
© 2008–2011 AQR Capital Management, LLC
© 2008–2014 the pandas development team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.crosstab.html