Read Raw Data in SAS

by Online Tutorials Library

In the last topic, we learned how to merge datasets in SAS. Now we will look at the reading abilities of SAS and at how to read raw data from various kinds of files. We will also learn how to enter some special types of data values.

As we discussed earlier, a raw data file is an external file whose contents have not yet been read into a SAS data set. In SAS, we can read raw data from many types of files, such as text, Excel, CSV, and hierarchical files.

Reading Abilities of SAS

1. Reading Space Separated Values

Data values that are separated by spaces are read with what is often called list input. One or more spaces separate each value. If a value is missing, it must be marked with a placeholder, such as a period (.).

Note that in list input a period (.) indicates a missing value for both numeric and character variables.

For Example:
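A minimal sketch of list input, assuming a hypothetical data set named scores with the variables name, age, and score. The values are invented for illustration.

```sas
/* List input: values are separated by one or more spaces. */
/* A period (.) is the placeholder for a missing value.    */
data scores;
  input name $ age score;   /* $ marks name as a character variable */
  datalines;
Asha 23 88
Ravi . 75
. 31 92
;
run;

/* Print the data set to inspect what was read */
proc print data=scores;
run;
```

In this sketch, age should come out missing in the second observation and name should come out blank in the third.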

As you can see in the output, the values on each line are separated by spaces and read as a list. We put a period (.) in place of each missing value, and SAS reads it as missing.

2. How to Enter Instream Data

We can give data to SAS in two ways. When there is too much data to type, we read it from an external file. When there is only a little data, it is more convenient to type it directly into the SAS program. This is known as instream data, and it is a quick way to enter data in SAS.

To enter this type of data, you need four basic elements:

  • A DATA statement
  • An INPUT statement
  • A CARDS or DATALINES statement
  • A semicolon on a line by itself after the data

There must be at least one space between data values, and more than one space is allowed. Each variable must have a placeholder, even if its value is missing. A period (.) indicates a missing value for both character and numeric variables entered in this way. There is no need to align the data exactly in columns.

For Example:
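A minimal sketch of instream data, assuming a hypothetical data set named employees with invented values. It uses CARDS, which SAS accepts as an alias for DATALINES, and spaces the values unevenly on purpose.

```sas
/* Instream data: the data lines are typed inside the program. */
data employees;
  input id name $ salary;
  cards;
101 Asha    52000
102   Ravi  .
103 .       47000
;
run;

proc print data=employees;
run;
```

The semicolon on its own line ends the data, and the periods hold the place of the missing salary and the missing name.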

As you can see in the output, the data have been entered in tabular form.

3. Entering Data for Several Cases on the Same Line

In SAS, we can end the INPUT statement with @@ (the double trailing at sign) to enter raw data for several cases on the same line.

For Example:
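A minimal sketch of the @@ technique, assuming a hypothetical data set named pairs in which every case consists of an id and a score.

```sas
/* The trailing @@ holds the current line across iterations of the */
/* DATA step, so several cases can be read from one line.          */
data pairs;
  input id score @@;
  datalines;
1 88 2 75 3 92
4 60 5 71
;
run;

proc print data=pairs;
run;
```

Without @@, SAS would move to a new line for every observation and read only the first id and score from each line.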

When you run this code in SAS Studio, the output shows that the data on each line have been entered as several separate cases.

Reading Data from External Files

When there is too much data to type, read it from an external file. The data can come from many different sources, such as an export from a database program or from a spreadsheet program like Excel.

To do this, first make sure that you know the characteristics of the raw data file. You can inspect raw data with a text editor or a word processing program.

For small files, you can use Windows Notepad, and for large files, you can use Microsoft Word or WordPerfect. If you open the raw data file in a word processing program, make sure the file is saved as text only.

You will need a codebook to read a raw data file. The codebook describes the data contained in the file. Some commonly used raw data file types are:

  1. Space-separated values: The file contains data in list form.
  2. Comma-separated values: The file typically comes from Excel and has the extension .csv.
  3. Tab-separated values: A kind of text file (.txt) that can come from a number of different applications, including Excel.
  4. Fixed-column data: Each variable occupies the same column positions on every line of the file.

For reading raw data from a file, the data step must include the following 3 essential statements:

  1. Data
  2. Infile
  3. Input

We can add other statements in the data step to create new variables, recode variables and carry out data transformations.

Syntax to Read File

Reading raw data with a SAS DATA step is simple. The DATA statement names the data set to create, and the INFILE statement identifies the raw data file to read.

For Example:
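A minimal sketch of the three statements, using the data set name test and the fileref in that the text refers to. The variable names, the values, and the use of a temporary file are assumptions made so that the sketch can run without an existing file.

```sas
/* Assumption: "in" points to a temporary file that we create here. */
/* In practice it would point to an existing raw data file.         */
filename in temp;

data _null_;
  file in;            /* write two sample lines to the file */
  put 'Asha 23 88';
  put 'Ravi 31 75';
run;

data test;
  infile in;                /* the raw data file to read      */
  input name $ age score;   /* variables, in file order       */
run;

proc print data=test;
run;
```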

In the example, test is the data set created by the DATA statement, and the INFILE statement points to the raw data file through the fileref in. The INPUT statement lists the variables in the same order in which they appear in the raw data file.

How to Control the Reading Process?

In SAS, we can control how much data is read, for example with a condition or a loop in the DATA step. With list input, you cannot skip a variable at the beginning of the variable list, but you can stop reading variables before reaching the end of the list.

For Example:
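One possible reading of this, sketched under assumptions: the data set name limited, the variable names index and value, and the data lines are all hypothetical. Each line holds three values, but only the first two are read, and the step stops once index exceeds 30.

```sas
data limited;
  input index value;          /* the third value on each line is not read */
  if index > 30 then stop;    /* end the DATA step once index passes 30   */
  datalines;
10 1.5 a
20 2.5 b
30 3.5 c
40 4.5 d
50 5.5 e
;
run;

proc print data=limited;
run;
```

Because the STOP statement runs before the implicit output at the end of the step, the line with index 40 is read but not written to the data set.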

As you can see in the example, reading stops once the index exceeds 30.

1. Reading from Text File (.txt files)

A file that contains data in text format is called a text file. Such files are usually saved with the .txt extension. The data in these files are often delimited by spaces, but SAS can handle various other delimiters as well. Let's see through an example how SAS reads a text file with the INFILE statement.

For Example:
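A minimal sketch, assuming a hypothetical pipe-delimited text file. The fileref txtfile, the variable names, and the values are invented, and the file is created in a temporary location so the sketch runs on its own.

```sas
/* Assumption: a temporary file stands in for a real .txt file. */
filename txtfile temp;

data _null_;
  file txtfile;
  put 'Asha|23|88';
  put 'Ravi|31|75';
run;

/* DLM= names the delimiter. It can be left out when values */
/* are separated by spaces.                                 */
data people;
  infile txtfile dlm='|';
  input name $ age score;
run;

proc print data=people;
run;
```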


2. Reading from Comma-Separated Values (.csv files)

Raw data values that are separated by commas are called CSV (Comma-Separated Values). Files that use another delimiter, such as the pipe (|), are read in the same way. To read this type of file, use the delimiter option, or its short form dlm, in the INFILE statement.

For Example:
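A minimal sketch, assuming a hypothetical CSV file with a header line and one missing value. The fileref csvfile, the variable names, and the values are invented, and the file is written to a temporary location first so the sketch is self-contained.

```sas
/* Assumption: a temporary file stands in for a real .csv file. */
filename csvfile temp;

data _null_;
  file csvfile;
  put 'name,age,score';   /* header line */
  put 'Asha,23,88';
  put 'Ravi,,75';         /* age is missing */
run;

data scores_csv;
  infile csvfile dlm=',' dsd firstobs=2;
  input name $ age score;
run;

proc print data=scores_csv;
run;
```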

Where,

  • delimiter="," or dlm=",": tells SAS that commas separate the values in the raw data file.
  • firstobs: gives the line number at which SAS begins reading the raw data file. With firstobs=2, SAS skips a header line and starts on the line where the actual values begin.
  • dsd: tells SAS to treat two consecutive commas as a missing value.


3. Reading from Excel File (.xls)

A file that contains data in Excel format is called an Excel file. Such files are saved with the .xls extension. Consider the following example:
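A hedged sketch using PROC IMPORT, under several assumptions: SAS/ACCESS Interface to PC Files is licensed, the workbook is in the newer .xlsx format (for legacy .xls files, DBMS=XLS is the corresponding identifier), and the file name class.xlsx is hypothetical. The sketch first exports the sample data set sashelp.class so that there is a workbook to read.

```sas
/* Hypothetical file location inside the WORK library's directory */
%let xlfile = %sysfunc(pathname(work))/class.xlsx;

/* Create a workbook to read (stands in for a real Excel file) */
proc export data=sashelp.class
    outfile="&xlfile"
    dbms=xlsx replace;
run;

/* Read the workbook into a SAS data set */
proc import datafile="&xlfile"
    out=work.class_in
    dbms=xlsx replace;
  getnames=yes;   /* use the first row as variable names */
run;

proc print data=work.class_in;
run;
```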

The above code reads data from the Excel file and returns the data values in tabular form.

4. Reading from Hierarchical Files

In hierarchical files, the data are arranged in a hierarchical format. The number of records that belong to one observation can vary from observation to observation. Below is an example of a hierarchical file.

In the file below, the details of each student are listed under each branch. The branch name is treated as a variable (column), and each student record becomes an observation (row). To read the file, we use the code below, in which an IF statement identifies the type of each record and the DATA step's loop collects the observations.

For Example:
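A minimal sketch, assuming a hypothetical layout in which the first character of each record is B for a branch record or S for a student record. The data set name, the variable names, and the values are all invented.

```sas
data students;
  length branch $ 20 name $ 12;
  retain branch;              /* carry the branch down to its students */
  infile datalines truncover;
  input type $1. @;           /* read the record type, hold the line   */
  if type = 'B' then
    input @3 branch $20.;     /* branch record: remember the branch    */
  else if type = 'S' then do;
    input @3 name $ marks;    /* student record: read the details      */
    output;                   /* one observation per student           */
  end;
  drop type;
  datalines;
B Computer
S Asha 81
S Ravi 74
B Mechanical
S Meena 90
;
run;

proc print data=students;
run;
```

The RETAIN statement is the key design choice here: without it, branch would be reset to missing at the start of every iteration and the student observations would lose their branch.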

The above code reads data from the hierarchical file and returns the data values in tabular form.