Home » Concatenate Multiple Data sets in SAS

Concatenate Multiple Data sets in SAS

by Online Tutorials Library

Concatenate Multiple Data sets

In the previous topic, we have learned to write SAS data-sets in different file formats. Now, in this topic, we are going to learn how to concatenate multiple data sets into a single data-set in the SAS programming language.

Suppose, you have many observations in different data sets, and you need to collect all the observations in a single data set, then what will you do. To do so, SAS facilitates you to concatenate different data sets into a single data set.

The SET statement is used to concatenate different data sets into a single data set. Concatenated data set stores sum of all the observations of original data sets. All the observations in concatenated data set follow a proper storing sequence like this, on the first place, all the observations of the first data set, on the second place, all the observations of the second data set, and so on.

Ideally, all the combining data sets should have the same number of variables, but if they have the different number of variables, then all the variables will appear in the result, with the missing value for the small data set.

Syntax:

Where,

Set: It is a statement that is used to concatenate different data sets into a single data set.

Data-set 1 data-set 2: These are the names of the data sets that are being concatenated.

Now, let’s understand through an example, how we can concatenate data sets:

Let’s consider the data of employees of an organization that is available in two separate data sets, one for the Training department and another for the Non- Training department. To get the full details of all employees, we need to concatenate both the data sets. To do so, we are using the SET statement.

Now, execute the above code in SAS studio:

Concatenate Multiple Data sets

Output:

Concatenate Multiple Data sets

As we can see in the output, both data sets have concatenated in a single table.

Different Cases

There are some cases when data sets have variations in the variables. In such type of different cases, the total number of observations in the concatenated data set is always equal to the sum of the observations in each data set. Let’s see the different cases when data sets variables have variations.

1. When Different Number of Variables

If all data sets do not have an equal number of variables, then data sets still get concatenated, but values of extra variables disappear in small data sets. Let’s understand through an example.

For Example:

There are two data sets to concatenate. One is training_department, which has five variables, i.e. empid, name, salary, address and course, and another is non_training_department, which has three variables, i.e. empid, name, and salary. In the output, values of address and course will disappear for data set non_training_department.

Now, execute the above code in SAS studio:

Concatenate Multiple Data sets

Output:

Concatenate Multiple Data sets

As you can see in the output, values of address, and course are missing for data set non_training_department.

2. When Different Variable Names

When all the data sets contain the same number of variables, but different names, in that case, we can concatenate datasets by applying Rename Statement. If we don’t use Rename statement then still SAS will concatenate data sets, but it will produce a missing result for the different name variables. We can apply the Rename Statement with the data set which we create for concatenation. Let’s understand through an example.

For Example:

In the below example, we have two data sets one is Training_Dept, and another is Non_Training_Dept. Both data sets have a variable that refers to the same value, i.e. name, but it declared by different names in both data sets. In data set Training_Dept the variable is declared by name whereas in the data set Non_Training_Dept by ename. To concatenate them, we are applying the RENAME statement on the concatenated data set All_Dept.

Now, execute the above code in SAS studio:

Concatenate Multiple Data sets

Output:

Concatenate Multiple Data sets

As we can see in the output, the employee name is concatenated by variable Employee.

3. When the length of the variable is different

If the length of variables in the data sets is different, then we can concatenate them by applying the Length statement. We should consider a higher length when applying the Length statement in the concatenated data set rather than smaller because SAS will generate higher length container that can easily accept small length values.

For Example:

In the below example, the variable name has length 5 in the Training_Dept data set and 7 in the Non_Training_Dept data set. When concatenating, we are applying a higher length that is 8.

Execute the above code in SAS studio:

Concatenate Multiple Data sets

Output:

Concatenate Multiple Data sets

As we can see in the output, the variable name is concatenated by higher character length, i.e. 8


You may also like