HiveQL – ORDER BY and SORT BY Clause
The HiveQL ORDER BY and SORT BY clauses sort a query's result set on one or more columns, in either ascending or descending order. The examples in this section run these clauses against the records of a sample table.
HiveQL – ORDER BY Clause
In HiveQL, the ORDER BY clause performs a total ordering of the query result set. To guarantee that ordering, all of the data is passed through a single reducer, which can make the query slow on large datasets. However, we can add a LIMIT clause to reduce the sorting time.
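As a rough sketch of combining ORDER BY with LIMIT, the query below keeps only the first rows of the sorted result. The table name `emp` and the columns `id`, `name`, and `salary` are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table emp(id, name, salary); the names are illustrative only.
-- ORDER BY produces a total order; LIMIT keeps only the first 10 rows of it.
SELECT id, name, salary
FROM emp
ORDER BY salary DESC
LIMIT 10;
```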
Example of ORDER BY Clause in Hive
Let’s walk through an example that sorts data with the ORDER BY clause.
- Select the database in which we want to create a table.
- Now, create a table by using the following command:
- Load the data into the table.
- Now, fetch the data in descending order by using the following command:
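The steps above could look roughly like the following. This is a minimal sketch, not the original commands: the database name `hiveql_demo`, the table `emp` with its columns, the comma-delimited file format, and the file path are all assumptions made for illustration.

```sql
-- 1. Select the database (hypothetical name).
USE hiveql_demo;

-- 2. Create a table for comma-separated records (hypothetical schema).
CREATE TABLE emp (
  id INT,
  name STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- 3. Load data from a local file (placeholder path).
LOAD DATA LOCAL INPATH '/path/to/emp_data.csv' INTO TABLE emp;

-- 4. Fetch all records, highest salary first.
SELECT * FROM emp ORDER BY salary DESC;
```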
The query returns the records sorted in descending order.
HiveQL – SORT BY Clause
The HiveQL SORT BY clause is an alternative to the ORDER BY clause. It orders the data within each reducer, so the ordering is local: each reducer’s output is sorted separately. When more than one reducer is used, the combined result may therefore be only partially ordered.
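One way to observe this local ordering is to request more than one reducer and write the result to a directory, where each reducer typically produces its own output file. The sketch below assumes the MapReduce execution engine, a hypothetical table `emp` with `id`, `name`, and `salary` columns, and a placeholder output directory.

```sql
-- Request two reducers so that the per-reducer ordering becomes visible.
SET mapreduce.job.reduces=2;

-- Each reducer sorts only the rows it receives, so each output file
-- under the (placeholder) directory is sorted on its own.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_sorted'
SELECT id, name, salary
FROM emp
SORT BY salary DESC;
```

Reading the output files one after another would then show salaries descending within each file, but not necessarily across files.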
Example of SORT BY Clause in Hive
In this example, we sort the data with the SORT BY clause.
- Let’s fetch the data in descending order by using the following command:
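A query of this kind might be written as follows. The table `emp` and the `salary` column are hypothetical names used only for illustration.

```sql
-- Sort rows by salary, highest first, within each reducer.
-- With a single reducer the output matches ORDER BY; with several
-- reducers, only each reducer's portion is guaranteed to be sorted.
SELECT * FROM emp SORT BY salary DESC;
```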
The query returns the records sorted in descending order within each reducer’s output.