Pandas DataFrame.sort()
A DataFrame can be sorted in two ways. (DataFrame.sort() itself has been removed from current pandas releases; sorting is now done with sort_index() and sort_values().)
- By label
- By actual value
Before looking at these two kinds of sorting, we first need a dataset for demonstration:
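One way to build such a dataset is sketched here, under the assumption that it was generated from random normal values with a deliberately shuffled index (the variable name info is illustrative). Because the values are random, the numbers differ from run to run and will not match the output shown.

```python
import numpy as np
import pandas as pd

# Assumed construction: random values, a shuffled row index,
# and column names listed out of alphabetical order.
info = pd.DataFrame(
    np.random.randn(10, 2),
    index=[1, 3, 7, 2, 4, 5, 9, 8, 0, 6],
    columns=["col2", "col1"],
)
print(info)
```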
Output
       col2      col1
1 -0.456763 -0.931156
3  0.242766 -0.793590
7  1.133803  0.454363
2 -0.843520 -0.938268
4 -0.018571 -0.315972
5 -1.951544 -1.300100
9 -0.711499  0.031491
8  1.648080  0.695637
0  2.576250 -0.625171
6 -0.301717  0.879970
In the above DataFrame, the labels and the values are unsorted. So, let’s see how it can be sorted:
- By label
A DataFrame can be sorted by its labels with the sort_index() method, which accepts an axis argument and a sort order. By default, it sorts the row labels in ascending order.
Example
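A minimal sketch of the call, assuming a DataFrame of random values with a shuffled index (the names info and sorted_info are illustrative). The printed numbers will differ from the output shown, since the values are random; what matters is the order of the row labels.

```python
import numpy as np
import pandas as pd

# Assumed demonstration data: random values with a shuffled index.
info = pd.DataFrame(
    np.random.randn(10, 2),
    index=[1, 3, 7, 2, 4, 5, 9, 8, 0, 6],
    columns=["col2", "col1"],
)

# sort_index() returns a new DataFrame ordered by row label;
# the original DataFrame is left unchanged.
sorted_info = info.sort_index()
print(sorted_info)
```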
Output
       col4      col3
0  0.698346  1.897573
1  1.247655 -1.208908
2 -0.469820 -0.546918
3 -0.793445  0.362020
4 -1.184855 -1.596489
5  1.500156 -0.397635
6 -1.239635 -0.255545
7  1.110986 -0.681728
8 -1.797474  0.108840
9  0.063048  1.512421
- Order of Sorting
The order of sorting can be controlled by passing a Boolean value to the ascending parameter: True (the default) sorts in ascending order, and False sorts in descending order.
Example:
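A minimal sketch, assuming the same kind of random demonstration data (the names info and sorted_info are illustrative). Passing ascending=False reverses the label order; the printed values are random and will differ from the output shown.

```python
import numpy as np
import pandas as pd

# Assumed demonstration data: random values with a shuffled index.
info = pd.DataFrame(
    np.random.randn(10, 2),
    index=[1, 3, 7, 2, 4, 5, 9, 8, 0, 6],
    columns=["col2", "col1"],
)

# Sort the row labels from highest to lowest.
sorted_info = info.sort_index(ascending=False)
print(sorted_info)
```

Note that the result must be printed (or assigned back); printing info itself would still show the unsorted frame.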
Output
       col4      col5
9  0.656063  0.122720
8  0.208479 -1.234640
7  0.537063 -0.774384
6  0.347990 -0.410401
5  0.331764 -0.741020
4 -0.456203 -1.255311
3 -0.082334  0.304390
2 -1.937455  0.257315
1  0.664336 -1.846533
0 -0.983810 -0.711582
- Sort the Columns:
We can sort the column labels by passing axis=1 (or axis='columns') to sort_index(). The default, axis=0, sorts by row labels.
Example:
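A minimal sketch, assuming random demonstration data whose columns are deliberately listed out of order (the names info and sorted_info are illustrative). With axis=1 only the column labels are reordered; the rows keep their original order. The printed values are random and will differ from the output shown.

```python
import numpy as np
import pandas as pd

# Assumed demonstration data: columns listed as col2, col1.
info = pd.DataFrame(
    np.random.randn(10, 2),
    index=[1, 3, 7, 2, 4, 5, 9, 8, 0, 6],
    columns=["col2", "col1"],
)

# axis=1 sorts the column labels and leaves the row order alone.
sorted_info = info.sort_index(axis=1)
print(sorted_info)
```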
Output
       col4      col7
1 -0.509367 -1.609514
4 -0.516731  0.397375
8 -0.201157 -0.009864
2  1.440567  1.058436
0  0.955486 -0.009777
6 -1.211133  0.415147
7  0.095644  0.531727
5 -0.881241 -0.871342
3  0.206327 -1.154724
9  1.418127  0.146788
- By actual value
The second way to sort a DataFrame is by its values. Just as sort_index() sorts by labels, sort_values() sorts by values.
It takes a 'by' argument that names the column (or list of columns) whose values determine the row order.
Example:
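A minimal sketch of the call. The input values here are inferred from the output shown, so treat the construction as an assumption; the names info and sorted_info are illustrative.

```python
import pandas as pd

# Input reconstructed from the sorted output: row labels 0..3.
info = pd.DataFrame({"col1": [7, 1, 8, 3], "col2": [8, 12, 4, 9]})

# Order the rows by the values in col2 (ascending by default).
sorted_info = info.sort_values(by="col2")
print(sorted_info)
```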
Output
   col1  col2
2     8     4
0     7     8
3     3     9
1     1    12
In the above output, only col2 is in sorted order. Each row moves as a unit, so the col1 values and the row labels travel with their col2 value and therefore look unsorted.
Parameters
- by: The name, or list of names, of the column(s) to sort by. (The older DataFrame.sort() method called this parameter columns.)
- ascending: A Boolean value, or a list of Booleans with one entry per column in by. True sorts in ascending order and False in descending order. Its default value is True.
- axis: 0 or 'index'; 1 or 'columns'. The default value is 0. It selects the axis to be sorted: 0 reorders the rows, 1 reorders the columns.
- inplace: A Boolean value. The default value is False. If True, the DataFrame is sorted in place and the method returns None instead of a new DataFrame.
- kind: 'quicksort', 'mergesort', 'heapsort'. The default is 'quicksort'. It is an optional parameter that is applied only when you sort on a single column or label.
- na_position: 'first', 'last'. 'first' puts NaNs at the beginning, while 'last' puts NaNs at the end. The default is 'last'.
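The parameters above can be combined in a single sort_values() call. The following is a minimal sketch on hypothetical data (the frame df and its values are made up for illustration) showing ascending, na_position, kind, and inplace together.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in col1.
df = pd.DataFrame(
    {"col1": [2.0, np.nan, 1.0, 3.0], "col2": ["b", "d", "a", "c"]}
)

# Descending order with NaN placed first, using a stable algorithm.
nan_first = df.sort_values(
    by="col1", ascending=False, na_position="first", kind="mergesort"
)
print(nan_first)

# inplace=True reorders df itself; the call returns None.
result = df.sort_values(by="col1", inplace=True)
print(result)
print(df)
```

With inplace=True there is no new object to assign, so writing df = df.sort_values(..., inplace=True) would replace df with None.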