MulDataFrame.drop_duplicates

MulDataFrame.drop_duplicates(subset=None, mloc=None, keep='first', inplace=False)

Return a MulDataFrame with duplicate rows removed.

It is similar to DataFrame.drop_duplicates, except that it returns a MulDataFrame whose index dataframe is sliced to match the rows that remain.

Parameters

subset : primary column label or sequence of primary column labels, optional

Only consider the columns named by these primary column labels when identifying duplicates. By default, all columns are used.

mloc : array or dict

Only consider the columns selected by mloc hierarchical indexing when identifying duplicates. See mloc for possible values. This parameter is ignored if subset is not None.

keep : {‘first’, ‘last’, False}, default ‘first’

Method to handle dropping duplicates:

  • ‘first’ : Drop duplicates except for the first occurrence.

  • ‘last’ : Drop duplicates except for the last occurrence.

  • False : Drop all duplicates.

inplace : bool, default False

If True, perform the operation in place and return None.
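
Because the method is described as similar to DataFrame.drop_duplicates, the effect of subset, keep, and inplace can be previewed on a plain pandas DataFrame. The sketch below uses only pandas. The frame `df` and its values are made up for illustration, and nothing here exercises MulDataFrame itself.

```python
import pandas as pd

# Illustrative data (an assumption): rows 'b' and 'b2' share the value 8 in column 'c'.
df = pd.DataFrame({'c': [1, 8, 8], 'd': [2, 9, 10]}, index=['a', 'b', 'b2'])

# subset restricts the comparison to column 'c'; keep decides which row survives.
first = df.drop_duplicates(subset='c', keep='first')  # keeps rows 'a' and 'b'
last = df.drop_duplicates(subset='c', keep='last')    # keeps rows 'a' and 'b2'
none = df.drop_duplicates(subset='c', keep=False)     # keeps row 'a' only

# With every column considered (the default), no two rows are identical.
all_cols = df.drop_duplicates()

# inplace=True modifies df itself and returns None.
result = df.drop_duplicates(subset='c', inplace=True)
```

After the last call, `result` is None and `df` holds only rows 'a' and 'b'.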

Returns

MulDataFrame or None

If inplace=True, returns None. Otherwise, returns a MulDataFrame. The new MulDataFrame’s index dataframe is sliced so that it contains only the rows that were kept.
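
Keeping the index dataframe in step with the values can be pictured with two plain pandas objects: a values frame and an index dataframe holding one metadata row per values row. This is a minimal sketch of the idea, not the library's implementation. The names `values` and `index_df`, the data, and the positional boolean mask are all assumptions made for illustration.

```python
import pandas as pd

# Values frame plus a separate index dataframe with one metadata row per values row.
values = pd.DataFrame([[1, 2], [8, 9], [8, 10]],
                      index=['a', 'b', 'b'], columns=['c', 'd'])
index_df = pd.DataFrame([[1, 2], [3, 6], [5, 6]],
                        index=['a', 'b', 'b'], columns=['x', 'y'])

# Flag rows whose value in column 'c' repeats an earlier row (keep='first').
dup = values.duplicated(subset=['c'], keep='first').to_numpy()

# Apply the same positional mask to both frames so they stay aligned,
# even though the row label 'b' is not unique.
values_kept = values.iloc[~dup]
index_kept = index_df.iloc[~dup]
```

Filtering by position rather than by label matters here: selecting by the label 'b' would return both 'b' rows and could not tell the kept row from the dropped one.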

Examples

>>> import pandas as pd
>>> import muldataframe as md
>>> index = pd.DataFrame([[1,2],[3,6],[5,6]],
...              index=['a','b','b'],
...              columns=['x','y'])
>>> columns = pd.DataFrame([[5,7],[3,6]],
...                 index=['c','d'],
...                 columns=['f','g'])
>>> mf = md.MulDataFrame([[1,2],[8,9],[8,10]],index=index,columns=columns)
>>> mf.drop_duplicates(mloc={'g':7})
(2, 2)    g  7  6
          f  5  3
             c  d
--------  ---------
   x  y      c  d
a  1  2   a  1  2
b  3  6   b  8  9
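
In the call above, mloc={'g': 7} refers to the columns dataframe rather than to a primary column label. One way to read it, assuming a dict mloc matches rows of the columns dataframe by value, is sketched below with plain pandas. The matching loop is hypothetical and does not reproduce the library's lookup code.

```python
import pandas as pd

# The columns dataframe from the example: one metadata row per primary column.
columns = pd.DataFrame([[5, 7], [3, 6]],
                       index=['c', 'd'], columns=['f', 'g'])

# Assumed reading of mloc={'g': 7}: select primary columns whose 'g' value is 7.
mloc = {'g': 7}
mask = pd.Series(True, index=columns.index)
for key, wanted in mloc.items():
    mask &= columns[key] == wanted

# The matching primary labels then play the role of `subset`.
subset = list(columns.index[mask])
```

Under that reading, `subset` contains only the primary column 'c', so duplicates are judged on column 'c' alone.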