Reindex

The main task of the Pandas reindex is to conform a DataFrame to a new index with optional filling logic, placing NA/NaN in every location that has no value in the previous index. It returns a new object unless the new index is equivalent to the current one and copy is set to False.

Reindexing changes the row labels and column labels of a DataFrame. We can reindex a single row or multiple rows with the reindex() method. By default, labels in the new index that are not present in the DataFrame are assigned NaN.
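
As a minimal sketch of the behaviour described above (the DataFrame and its labels are hypothetical, chosen only for illustration), existing labels keep their rows in the requested order, while a label that was not in the original index gets NaN:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

# "c" and "a" already exist, so their rows are kept in the requested order.
# "z" is not in the original index, so its row is filled with NaN.
result = df.reindex(["c", "a", "z"])
print(result)

# reindex() returned a new object; the original frame is unchanged.
print(df)
```

Because of the NaN entry, the integer column is returned as floating point in the reindexed result.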

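Syntax:

DataFrame.reindex(labels=None, index=None, columns=None, axis=None, method=None, copy=True, level=None, fill_value=nan, limit=None, tolerance=None)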

Parameters:

labels: It is an optional parameter that gives the new labels or index to which the axis specified by ‘axis’ should conform.

index, columns: These are also optional parameters that give the new labels or index for the rows and the columns, respectively. An Index object is preferred because it avoids duplicating data.

axis: It is also an optional parameter that selects the axis to target. It can be either the axis name (‘index’, ‘columns’) or the axis number (0, 1).
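
The two calling conventions can be sketched as follows, assuming a small hypothetical DataFrame. Passing labels together with axis should give the same result as passing the index or columns keyword directly:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["a", "b"])

# Two equivalent spellings for reindexing the columns.
cols_keyword = df.reindex(columns=["y", "w"])
cols_axis = df.reindex(["y", "w"], axis="columns")

# Two equivalent spellings for reindexing the rows.
rows_keyword = df.reindex(index=["b", "c"])
rows_axis = df.reindex(["b", "c"], axis="index")

print(cols_keyword)
print(rows_keyword)
```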

method: It is also an optional parameter that selects how to fill the holes in the reindexed DataFrame. It can only be applied to a DataFrame or Series whose index is monotonically increasing or decreasing.

None: It is the default value and does not fill the gaps.

pad / ffill: It propagates the last valid observation forward to the next valid observation.

backfill / bfill: To fill the gap, it uses the next valid observation.

nearest: To fill the gap, it uses the nearest valid observation.
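
The four method values can be compared side by side. This is a minimal sketch on a hypothetical Series with an increasing integer index; the values and labels are made up for illustration:

```python
import pandas as pd

# Hypothetical series with a monotonically increasing index.
s = pd.Series([10.0, 20.0, 30.0], index=[0, 5, 10])
new_index = [0, 2, 4, 5, 8, 10]

# Default (method=None): labels 2, 4 and 8 are new, so they stay NaN.
no_fill = s.reindex(new_index)

# ffill: each new label takes the value of the previous existing label.
forward = s.reindex(new_index, method="ffill")

# bfill: each new label takes the value of the next existing label.
backward = s.reindex(new_index, method="bfill")

# nearest: each new label takes the value of the closest existing label.
nearest = s.reindex(new_index, method="nearest")

print(pd.DataFrame({"none": no_fill, "ffill": forward,
                    "bfill": backward, "nearest": nearest}))
```

Note that the filling is driven by the labels of the original index, which is why the index has to be monotonic.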

copy: It is a boolean whose default value is True. When it is True, a new object is returned even if the passed indexes are the same as the current ones.

level: It is used to broadcast across a level, matching index values on the passed MultiIndex level.

fill_value: It is the value used for entries that are missing after reindexing, that is, for labels in the new index that have no match in the original one. Its default value is np.NaN, but any compatible value can be passed.

limit: It defines the maximum number of consecutive elements to forward fill or backward fill.

tolerance: It is also an optional parameter that determines the maximum distance between original and new labels for inexact matches. At the matching locations, the values of the index must satisfy the condition abs(index[indexer] - target) <= tolerance.
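
The limit and tolerance parameters both restrict how far a fill method may reach. The following is a minimal sketch on hypothetical Series; both parameters are combined with a fill method here:

```python
import pandas as pd

# Hypothetical series with a gap between labels 0 and 4.
s = pd.Series([1.0, 2.0], index=[0, 4])

# Without a limit, forward fill covers the whole gap (labels 1, 2, 3).
unlimited = s.reindex(range(5), method="ffill")

# With limit=1, only one consecutive new label is filled; 2 and 3 stay NaN.
limited = s.reindex(range(5), method="ffill", limit=1)

# tolerance: a new label is matched only if the nearest original label
# satisfies abs(original - new) <= tolerance.
t = pd.Series([10.0, 20.0], index=[0, 10])
close_only = t.reindex([1, 4, 9], method="nearest", tolerance=2)

print(unlimited)
print(limited)
print(close_only)  # label 4 is 4 away from 0 and 6 away from 10, so it stays NaN
```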

Returns :

It returns a DataFrame with the changed index.

Example 1:

The example below shows how the reindex() function reindexes the rows of a DataFrame. Labels in the new index that have no corresponding records in the DataFrame are assigned NaN by default.

Note: We can use fill_value to fill these missing values.

Output:

           A    B    D    E
Parker   NaN  NaN  NaN  NaN
William  NaN  NaN  NaN  NaN
Smith    NaN  NaN  NaN  NaN
Terry    NaN  NaN  NaN  NaN
Phill    NaN  NaN  NaN  NaN

Now, we can use the dataframe.reindex() function to reindex the dataframe.

Output:

     P    Q    R    S
A  NaN  NaN  NaN  NaN
B  NaN  NaN  NaN  NaN
C  NaN  NaN  NaN  NaN
D  NaN  NaN  NaN  NaN
E  NaN  NaN  NaN  NaN

Notice that the rows for the new index labels are populated with NaN values. We can fill in the missing values using the fill_value parameter.

Output:

     P    Q    R    S
A  100  100  100  100
B  100  100  100  100
C  100  100  100  100
D  100  100  100  100
E  100  100  100  100
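
The steps of Example 1 can be sketched as follows. The cell values are hypothetical; only the column labels (P, Q, R, S), the row labels and the fill value of 100 are taken from the outputs shown above:

```python
import pandas as pd

# Hypothetical values; only the labels mirror the outputs above.
info = pd.DataFrame(
    {"P": [4, 7, 1, 8, 9],
     "Q": [6, 8, 10, 15, 11],
     "R": [17, 13, 12, 16, 14],
     "S": [15, 19, 7, 21, 9]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# None of the labels A..E exist in the original index,
# so every row of the result is NaN.
reindexed = info.reindex(["A", "B", "C", "D", "E"])
print(reindexed)

# The same call with fill_value=100 uses 100 instead of NaN.
filled = info.reindex(["A", "B", "C", "D", "E"], fill_value=100)
print(filled)
```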

Example 2:

This example shows how the reindex() function reindexes the column axis.

Output:

           A    B    D    E
Parker   NaN  NaN  NaN  NaN
William  NaN  NaN  NaN  NaN
Smith    NaN  NaN  NaN  NaN
Terry    NaN  NaN  NaN  NaN
Phill    NaN  NaN  NaN  NaN

Notice that NaN values are present in the new columns after reindexing. We can pass the fill_value argument to the function to replace these NaN values with a chosen value.

Output:

          A   B   D   E
Parker   37  37  37  37
William  37  37  37  37
Smith    37  37  37  37
Terry    37  37  37  37
Phill    37  37  37  37
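
Example 2 can be sketched in the same way. Again the cell values are hypothetical; the row labels, the new column labels (A, B, D, E) and the fill value of 37 come from the outputs shown above:

```python
import pandas as pd

# Hypothetical values; only the labels mirror the outputs above.
info = pd.DataFrame(
    {"P": [4, 7, 1, 8, 9],
     "Q": [6, 8, 10, 15, 11],
     "R": [17, 13, 12, 16, 14],
     "S": [15, 19, 7, 21, 9]},
    index=["Parker", "William", "Smith", "Terry", "Phill"],
)

# The row index is kept, but none of the columns A, B, D, E exist,
# so every new column is NaN.
new_columns = info.reindex(columns=["A", "B", "D", "E"])
print(new_columns)

# With fill_value=37 the new columns hold 37 instead of NaN.
filled_columns = info.reindex(columns=["A", "B", "D", "E"], fill_value=37)
print(filled_columns)
```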
