MulSeries.groupby

MulSeries.groupby(by=None, agg_mode: Literal['same_only'] | Literal['list'] | Literal['tuple'] = 'same_only', keep_primary=False)

Group MulSeries by its index dataframe using a mapper or the index dataframe’s columns.

The function uses the DataFrame.groupby method of the index dataframe to create groups under the hood. The values of the MulSeries are grouped accordingly. It returns a MulGroupBy object that contains information about the groups.
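The grouping step can be pictured with a plain-Python sketch. This is not the library's implementation, which delegates to DataFrame.groupby; the helper `group_values` and the list-of-dicts stand-in for the index dataframe are assumptions made purely for illustration.

```python
from collections import defaultdict


def group_values(index_rows, values, by):
    """Group `values` by the `by` column of the row-aligned `index_rows`.

    Hypothetical helper: `index_rows` is a list of dicts standing in for
    the index dataframe, with one dict per value.
    """
    positions = defaultdict(list)
    for pos, row in enumerate(index_rows):
        # Rows sharing the same value in the `by` column form one group.
        positions[row[by]].append(pos)
    # Pick out the values of each group by their positions.
    return {key: [values[p] for p in pos_list]
            for key, pos_list in positions.items()}


index_rows = [
    {"x": "a", "y": "b", "z": "c"},
    {"x": "g", "y": "b", "z": "f"},
    {"x": "b", "y": "g", "z": "h"},
]
groups = group_values(index_rows, [1, 2, 3], by="y")
print(groups)  # {'b': [1, 2], 'g': [3]}
```

Summing each list gives 3 for both groups, which agrees with the `sum()` output in the Examples section.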

Parameters

by : None, mapping, function, label, pd.Grouper or list of such

Please refer to DataFrame.groupby for detailed information on this argument. Unlike the by argument of DataFrame.groupby, a value of None groups the MulSeries by its primary index.

agg_mode : 'same_only', 'list', 'tuple', default 'same_only'

Determines how index dataframe columns whose values differ within a group are aggregated when a numpy function is called on the MulGroupBy object or its call method is used.

  • 'same_only': only keep columns that have the same values within each group.

  • 'list': put columns that do not have the same values within a group into a list.

  • 'tuple': like 'list', but puts them into a tuple.
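The three modes can be sketched in plain Python as follows. The helper `aggregate_index` is hypothetical and is not part of the library. It also assumes, based on the `sum()` outputs in the Examples section, that 'same_only' drops a column as soon as it varies within any one group, while 'list' and 'tuple' leave uniform columns as scalars.

```python
def aggregate_index(groups, agg_mode="same_only"):
    """Collapse the index rows of every group into one row per group.

    Hypothetical helper: `groups` maps a group key to a list of dicts,
    each dict being one row of the index dataframe (grouping column
    already removed).
    """
    if agg_mode not in ("same_only", "list", "tuple"):
        raise ValueError(f"unknown agg_mode: {agg_mode!r}")

    def is_uniform(rows, col):
        # True if every row of the group has the same value in `col`.
        return all(row[col] == rows[0][col] for row in rows)

    columns = list(next(iter(groups.values()))[0])
    if agg_mode == "same_only":
        # Assumption: keep a column only if it is uniform in every group.
        columns = [c for c in columns
                   if all(is_uniform(rows, c) for rows in groups.values())]

    wrap = tuple if agg_mode == "tuple" else list
    result = {}
    for key, rows in groups.items():
        out = {}
        for col in columns:
            if is_uniform(rows, col):
                out[col] = rows[0][col]
            else:
                # Only reached in 'list' or 'tuple' mode.
                out[col] = wrap(row[col] for row in rows)
        result[key] = out
    return result


groups = {
    "b": [{"x": "a", "z": "c"}, {"x": "g", "z": "f"}],
    "g": [{"x": "b", "z": "h"}],
}
print(aggregate_index(groups, "same_only"))  # {'b': {}, 'g': {}}
print(aggregate_index(groups, "list"))
# {'b': {'x': ['a', 'g'], 'z': ['c', 'f']}, 'g': {'x': 'b', 'z': 'h'}}
```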

keep_primary : bool, default False

Whether to keep primary index in the index dataframe in each group. If True, the primary index will be reset as a column and kept in the grouped dataframes. If the name of the primary index or columns is None, "primary_index" will be used as its name.
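As a rough illustration of the naming rule, the sketch below turns a primary index into an ordinary column. The helper `with_primary_column` is hypothetical, and placing the new column first is an assumption rather than documented behavior.

```python
def with_primary_column(index_rows, primary_index, primary_name=None):
    """Return copies of `index_rows` with the primary index as a column.

    Hypothetical helper: falls back to "primary_index" when the primary
    index has no name, mirroring the rule described above.
    """
    name = "primary_index" if primary_name is None else primary_name
    return [{name: label, **row}
            for label, row in zip(primary_index, index_rows)]


rows = with_primary_column([{"x": "a"}, {"x": "g"}], primary_index=[0, 1])
print(rows)
# [{'primary_index': 0, 'x': 'a'}, {'primary_index': 1, 'x': 'g'}]
```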

Returns

MulGroupBy

A MulGroupBy object that contains information about the groups.

Examples

>>> import muldataframe as md
>>> import pandas as pd
>>> index = pd.DataFrame([['a','b','c'],
...                       ['g','b','f'],
...                       ['b','g','h']],
...                      columns=['x','y','z'])
>>> name = pd.Series(['a','b'],index=['e','f'],name='cc')
>>> ms = md.MulSeries([1,2,3],index=index,name=name)
>>> for key, group in ms.groupby('y'):
...     print(key,'\n',group)
...     break
b
(2,)        f  b
            e  a
               cc
----------  ------
   x  y  z     cc
0  a  b  c  0   1
1  g  b  f  1   2
>>> ms.groupby('y').sum()
(2,)             f   b
                 e   a
                    cc
---------------  ------
Empty DataFrame     cc
Columns: []      y
Index: [b, g]    b   3
                 g   3
>>> ms.groupby('y',agg_mode='list').sum()
(2,)               f   b
                   e   a
                      cc
-----------------  ------
        x       z     cc
y                  y
b  [a, g]  [c, f]  b   3
g       b       h  g   3