Pandas DataFrame.drop_duplicates()
The drop_duplicates() method handles a common data-cleaning task: removing duplicate rows from a DataFrame. By default, a row counts as a duplicate only if its values match those of an earlier row in every column.
Syntax
DataFrame.drop_duplicates(subset=None, keep='first', inplace=False)
Parameters
- subset: A column label or a list of column labels. Only these columns are considered when identifying duplicates. The default, None, uses all columns.
- keep: Controls which of the duplicate rows, if any, is kept. It accepts one of three values; the default is 'first':
  - 'first': Drops the duplicates except for the first occurrence.
  - 'last': Drops the duplicates except for the last occurrence.
  - False: Drops all the duplicates, keeping none of them.
- inplace: A boolean value. Default value is False. If True, the duplicate rows are removed from the DataFrame itself and the method returns None instead of a new DataFrame.
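As a minimal sketch of how subset and keep interact, the snippet below uses a small made-up DataFrame (the data and variable names are assumptions chosen for illustration, not part of the original example). No row is a full duplicate, so duplicates appear only when the comparison is restricted to the Name column.

```python
import pandas as pd

# Hypothetical data: "Parker" appears twice, but with different ages.
df = pd.DataFrame({
    "Name": ["Parker", "Smith", "Parker"],
    "Age": [21, 32, 40],
})

# All columns are compared by default, so no row counts as a duplicate.
all_cols = df.drop_duplicates()

# Compare only the Name column.
keep_first = df.drop_duplicates(subset="Name")                # rows 0 and 1 remain
keep_last = df.drop_duplicates(subset=["Name"], keep="last")  # rows 1 and 2 remain
keep_none = df.drop_duplicates(subset="Name", keep=False)     # only row 1 remains
```

The original index labels are preserved in each result, which makes it easy to see which occurrence survived.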
Return
It returns a new DataFrame with the duplicate rows removed. If inplace=True is passed, the original DataFrame is modified instead and the method returns None.
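The difference between the two return behaviours can be checked with a short sketch. The DataFrame here is again a made-up illustration, and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Parker", "Smith", "Parker"], "Age": [21, 32, 21]})

# Default call: a new DataFrame comes back and df itself is unchanged.
result = df.drop_duplicates()
rows_before = len(df)   # still 3

# inplace=True: df is modified directly and the call returns None.
returned = df.drop_duplicates(inplace=True)
rows_after = len(df)    # now 2
```

Because the in-place call returns None, assigning its result back to the same variable (df = df.drop_duplicates(inplace=True)) would discard the data.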
Example
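The example code is missing from this page. A minimal reconstruction, assuming the data shown in the two outputs that follow (the variable names info and deduped are hypothetical), might look like this: the first print shows the original DataFrame and the second shows it after the duplicate row has been dropped.

```python
import pandas as pd

# Data assumed from the outputs shown in this section.
info = pd.DataFrame({
    "Name": ["Parker", "Smith", "William", "Parker"],
    "Age": [21, 32, 29, 21],
})
print(info)

# Row 3 repeats row 0 in every column, so it is dropped;
# the first occurrence is kept by default.
deduped = info.drop_duplicates()
print(deduped)
```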
Output
      Name  Age
0   Parker   21
1    Smith   32
2  William   29
3   Parker   21
Output
      Name  Age
0   Parker   21
1    Smith   32
2  William   29