Data Structures#
MulDataFrame#
A MulDataFrame object consists of three pandas dataframes: an index dataframe, a columns dataframe and a values dataframe. They are accessed through the .index, .columns and .df attributes of the muldataframe. While .index and .columns refer to the index and the columns dataframes, .df provides a deepcopy of the values dataframe.
>>> import pandas as pd
>>> import muldataframe as md
>>> index = pd.DataFrame([[1,2],[3,6],[5,6]],
index=['a','b','b'],
columns=['x','y'])
>>> columns = pd.DataFrame([[5,7],[3,6]],
index=['c','d'],
columns=['f','g'])
>>> mf = MulDataFrame([[1,2],[8,9],[8,7]],
index=index,columns=columns)
>>> mf
(3, 2) g 7 6
f 5 3
c d
-------- ---------
x y c d
a 1 2 a 1 2
b 3 6 b 8 9
b 5 6 b 8 7
>>> mf.columns
f g
c 5 7
d 3 6
>>> mf.df
c d
a 1 2
b 8 9
b 8 7
.ds provides a paritial copy of the values dataframe. Its values are not copied but refer to the values of the values dataframe while its index and columns are deep-copied from the values dataframe. The .values attribute of a muldataframe refers to the values of the values dataframe.
>>> mf2 = mf.copy()
>>> mf2.values[0,1] = 5
>>> mf2.ds
c d
a 5 2
b 8 9
b 8 7
The index of the index dataframe and the index of the columns dataframe are guaranteed to be the same as the index and the columns of the values dataframe. They are called the primary index and the primary columns. The .primary_index attribute refers to the index of the index dataframe. The .primary_columns refers to the index of the columns dataframe.
>>> mf2 = mf.copy()
>>> mf2.index.index = ['d','e',5]
>>> mf2.df
c d
d 1 2
e 8 9
5 8 7
>>> mf2.primary_index
['d','e',5]
>>> mf2.primary_columns
['c','d']
.mindex and .mcolumns/mf.mcols are implemented as alias for .index and .columns to help distinguish between a multi-index and a regular index. .pindex and .pcolumns/md.pcols are implemented as shorthands for .primary_index and .primary_columns. The shape of a muldataframe is the same as the shape of its underlying values dataframe
>>> mf.pcols
Index(['c', 'd'], dtype='object')
>>> mf.mindex
x y
a 1 2
b 3 6
b 5 6
>>> mf.shape
(3,2)
MulSeries#
A MulSeries object consists of one pandas dataframe and two pandas series: an index dataframe, a name series and a values series. They are accessed through the .index, .name and .ss attributes of the mulseries. While .index and .name refer to the index dataframe and the name series, .ss provides a deepcopy of the values series.
>>> import pandas as pd
>>> import muldataframe as md
>>> index = pd.DataFrame([[1,2],[3,5],[3,6]],
index=['a','b','b'],
columns=['x','y'])
>>> name = pd.Series(['g','h'],index=['e','f'], name='cc')
>>> ms = md.MulSeries([1,2,3],index=index,name=name)
>>> ms
(3,) f h
e g
cc
------- ------
x y cc
a 1 2 a 1
b 3 5 b 2
b 3 6 b 3
>>> ms.ss
cc
a 1
b 2
b 3
Similar to MulDataFrame, .ds provides a paritial copy of the values series. Its values are not copied but refer to the values of the values series while its index and name are deep-copied from the values series. The .values attribute of a mulseries refers to the values of the values series.
Similar to MulDataFrame, the index of the index dataframe and the name of the name series are guaranteed to be the same as the index and the name of the values series. They are called the primary index and the primary name. The .primary_index attribute refers to the index of the index dataframe. The .primary_name refers to the name of the name series.
.mindex and .mname are implemented as alias for .index and .name to help distinguish between a multi-index and a regular index. .pindex and .pname are implemented as shorthands for mf.primary_index and primary_name. The shape of a mulseries is the same as the shape of its underlying values series
MulDataFrame