Sorting on Multiple Variables
In the last topic, we learned about PROC SORT in SAS and saw that we can sort data values in ascending or descending order on a variable of our choice. We also learned that sorting can be done on several variables at once, but we have not yet tried it in practice.
Now, let's see how to sort data values using multiple variables, and what the limitation of this kind of sorting is.
SAS allows sorting on multiple variables, so we can list more than one variable in a single sort. But consider this: if we tell SAS to sort on several variables in ascending (the default) order, and all of them contain integer values, which variable should SAS sort on? Let's understand via an example:
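The original listing is not reproduced here, so the following is only a minimal sketch of what such a step might look like. The dataset name `demo`, the output name `demo_sorted`, and all data values are hypothetical; a repeated roll value is included purely to make the tie-breaking visible.

```sas
/* Hypothetical data, made up for illustration only */
data demo;
    input roll age;
    datalines;
3 15
1 14
2 16
1 13
;
run;

/* Two variables in the BY statement: roll is compared first, */
/* age is used only for observations with the same roll value */
proc sort data=demo out=demo_sorted;
    by roll age;
run;

proc print data=demo_sorted noobs;
run;
```

With these values, the two observations with roll 1 come first, ordered by age (13, then 14), followed by roll 2 and roll 3. The `out=` option writes the result to a new dataset so that the input stays unchanged; leaving it out would sort `demo` in place.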
In the above example, we give two variables, roll and age, for sorting, and both contain integer values. So which one does SAS sort on, roll or age? The answer is that SAS sorts on the first variable listed, i.e. roll. Only when two or more observations have the same value of the first variable does it use the second variable to order those observations.
Suppose there is a dataset named student that holds data on the students of a class: roll number, name, game, age, weight, and height. For sorting, we use two variables, weight and age, in that order, with the default (ascending) sort order.
Run the code in SAS Studio:
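The code for this step is missing from the text, so what follows is a sketch under stated assumptions rather than the original listing. The variable names come from the description above; the data values and the output dataset name `student_by_weight` are hypothetical, chosen so that three students share the same weight.

```sas
/* Hypothetical student data; three students weigh 50 */
data student;
    input roll name $ game $ age weight height;
    datalines;
1 Ravi Cricket 15 50 160
2 Anita Tennis 14 45 155
3 Karan Hockey 13 50 158
4 Meera Chess 16 48 162
5 Sohan Football 14 50 165
;
run;

/* Sort by weight first; age breaks ties among equal weights */
proc sort data=student out=student_by_weight;
    by weight age;
run;

proc print data=student_by_weight noobs;
run;
```

With this data, the printed order is Anita (45), Meera (48), and then the three students weighing 50 in order of age: Karan (13), Sohan (14), Ravi (15).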
In the output, we can see that the sorting is done on the basis of weight. Three students have the same weight, however, so for those three observations only, SAS falls back on age, the second variable listed. Once those ties are ordered, it continues sorting by weight.
Here is another example, in which the variable name, which contains alphabetical values, is listed first.
Run the code in SAS Studio:
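The listing for this example is also missing. As a rough sketch, the step could look like the following. The text does not say which variable was listed second, so using age here is an assumption, and the data values and the output name `student_by_name` are hypothetical.

```sas
/* Hypothetical student data; all names start with an uppercase letter */
data student;
    input roll name $ age;
    datalines;
1 Ravi 15
2 Anita 14
3 Karan 13
4 Meera 16
;
run;

/* The character variable name is listed first, so it drives the order; */
/* age (an assumed second variable) would only break ties on name       */
proc sort data=student out=student_by_name;
    by name age;
run;

proc print data=student_by_name noobs;
run;
```

With these values, the observations come out in alphabetical order of name: Anita, Karan, Meera, Ravi.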
We can see in the output that the data is again sorted according to the variable listed first, which here is name.
There is a limitation in SAS sorting. Suppose we sort on a variable that contains alphabetical values, where some values start with an uppercase letter and others with a lowercase letter. With the ASCII collating sequence that SAS uses by default on most systems, every uppercase letter sorts before every lowercase letter, so all the values starting with an uppercase letter come first and all the values starting with a lowercase letter come last. Let's see an example:
Run the code in SAS Studio:
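Since the original code is not shown, here is a small sketch that should reproduce the behaviour, assuming a session that uses the ASCII collating sequence. The dataset names `student_case` and `student_case_sorted` and the name values are hypothetical.

```sas
/* Hypothetical data: names with mixed uppercase and lowercase initials */
data student_case;
    input roll name $;
    datalines;
1 ravi
2 Anita
3 karan
4 Meera
;
run;

/* Default ascending sort on a character variable */
proc sort data=student_case out=student_case_sorted;
    by name;
run;

proc print data=student_case_sorted noobs;
run;
```

Under ASCII, the result is Anita, Meera, karan, ravi. A plain alphabetical order would instead place karan between Anita and Meera.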
We can see in the output that the values starting with an uppercase letter are sorted first and the values starting with a lowercase letter are sorted last.