SAS Concise Tutorial

SAS - Read Raw Data

SAS can read data from a variety of sources and in many file formats. The file formats commonly used in a SAS environment are discussed below.

  1. ASCII(Text) Data Set

  2. Delimited Data

  3. Excel Data

  4. Hierarchical Data

Reading ASCII(Text) Data Set

These files contain data in text format. The values are usually delimited by spaces, but SAS can also handle other types of delimiters. Consider an ASCII file containing employee data. We read this file using the INFILE statement available in SAS.

Example

In the example below, we read the data file named emp_data.txt from the local environment.

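data TEMP;
   /* Point INFILE at the raw text file. */
   infile '/folders/myfolders/sasuser.v94/TutorialsPoint/emp_data.txt';
   /* The colon modifier reads DOJ up to the next delimiter, then applies date9. */
   input empID empName $ Salary Dept $ DOJ :date9.;
   format DOJ date9.;
run;

PROC PRINT DATA = TEMP;
RUN;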

When the above code is executed, we get the following output.

[Output image: read raw data1]
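To try the same INPUT statement without a file on disk, the records can be supplied in-stream. The following is a minimal sketch, assuming a space-delimited layout with one employee per line; the data set name TEMP_INLINE and all data values are hypothetical and are not taken from emp_data.txt.

```sas
data TEMP_INLINE;
   /* Read records from the in-stream lines below instead of an external file. */
   infile datalines;
   /* $ marks character variables; :date9. reads DOJ as a date up to the next blank. */
   input empID empName $ Salary Dept $ DOJ :date9.;
   format DOJ date9.;
   /* Hypothetical sample records, for illustration only. */
   datalines;
1 Rick 623.3 IT 01JAN2012
2 Dan 515.2 OPS 23SEP2013
3 Mike 611 IT 15NOV2014
;
run;

proc print data = TEMP_INLINE;
run;
```

Because the FORMAT statement assigns date9. to DOJ, PROC PRINT displays the dates in their original form rather than as the underlying numeric date values.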

Reading Delimited Data

In these data files, the column values are separated by a delimiting character such as a comma or a pipe (|). In this case we use the dlm option in the infile statement.

Example

In the example below, we read the data file named emp.csv from the local environment.

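data TEMP;
   /* dlm="," tells SAS that the fields are separated by commas. */
   infile '/folders/myfolders/sasuser.v94/TutorialsPoint/emp.csv' dlm=",";
   /* The colon modifier stops reading DOJ at the next comma, then applies date9. */
   input empID empName $ Salary Dept $ DOJ :date9.;
   format DOJ date9.;
run;

PROC PRINT DATA = TEMP;
RUN;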

When the above code is executed, we get the following output.

[Output image: read raw data1]
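The dlm option accepts other delimiters as well. The sketch below reads pipe-delimited records supplied in-stream and adds the dsd option, so that two adjacent delimiters are read as a missing value instead of being collapsed into one. The data set name TEMP_PIPE and the data values are hypothetical.

```sas
data TEMP_PIPE;
   /* dlm='|' sets the separator; dsd treats consecutive delimiters as a missing value. */
   infile datalines dlm = '|' dsd;
   input empID empName $ Salary Dept $ DOJ :date9.;
   format DOJ date9.;
   /* Hypothetical sample records; the second record has no Salary value. */
   datalines;
1|Rick|623.3|IT|01JAN2012
2|Dan||OPS|23SEP2013
;
run;

proc print data = TEMP_PIPE;
run;
```

Without dsd, the empty field in the second record would be skipped and the remaining values would shift into the wrong variables.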

Reading Excel Data

SAS can read an Excel file directly using the import facility. As seen in the chapter on SAS data sets, it can handle a wide variety of file types, including MS Excel. Assume the file emp.xls is available locally in the SAS environment.

Example

FILENAME REFFILE
   "/folders/myfolders/TutorialsPoint/emp.xls"
   TERMSTR = CR;

/* Import the workbook into WORK.IMPORT, taking variable names from the first row. */
PROC IMPORT DATAFILE = REFFILE
   DBMS = XLS
   OUT = WORK.IMPORT;
   GETNAMES = YES;
RUN;

PROC PRINT DATA = WORK.IMPORT;
RUN;

The above code reads the data from the Excel file and gives the same output as the previous two file types.
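For workbooks saved in the newer .xlsx format, a similar call can be used with DBMS = XLSX. This is a sketch under assumptions: the file emp.xlsx, the sheet name Sheet1, and the output data set WORK.EMP_XLSX are hypothetical, and the XLSX engine is assumed to be available in the installation (it is part of SAS/ACCESS Interface to PC Files).

```sas
/* Assumed file and sheet names; adjust them to match the actual workbook. */
proc import datafile = "/folders/myfolders/TutorialsPoint/emp.xlsx"
   dbms = xlsx
   out = work.emp_xlsx
   replace;               /* overwrite WORK.EMP_XLSX if it already exists */
   sheet = "Sheet1";      /* worksheet to read */
   getnames = yes;        /* use the first row as variable names */
run;

proc print data = work.emp_xlsx;
run;
```

The REPLACE option is useful when the step is run more than once, since PROC IMPORT otherwise refuses to overwrite an existing data set.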

Reading Hierarchical Files

In these files the data is arranged in a hierarchical format. Each group begins with a header record, followed by one or more detail records. The number of detail records can vary from one group to another. Below is an illustration of a hierarchical file.

In the file below, the details of each employee are listed under their department. Each record starting with DEPT is a header record naming the department, and the records starting with DTLS that follow it are the detail records for that department.

DEPT:IT
DTLS:1:Rick:623
DTLS:3:Mike:611
DTLS:6:Tusar:578
DEPT:OPS
DTLS:7:Pranab:632
DTLS:2:Dan:452
DEPT:HR
DTLS:4:Ryan:487
DTLS:2:Siyona:452

Example

To read the hierarchical file we use the code below, in which we identify the header record with an IF clause and process each detail record in a DO block.

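data employees(drop = Type);
   /* Declare lengths up front; Type holds the record tag (DEPT or DTLS). */
   length Type $ 4 Department $ 3
      empID $ 3 empName $ 10 Empsal 8;
   /* Carry the department from each header record down to its detail records. */
   retain Department;
   infile '/folders/myfolders/TutorialsPoint/empdtls.txt' dlm = ':';
   /* Read the tag and hold the line (trailing @) for the next INPUT statement. */
   input Type $ @;
   if Type = 'DEPT' then
      input Department $;
   else do;
      input empID $ empName $ Empsal;
      /* Write one observation per detail record. */
      output;
   end;
run;

PROC PRINT DATA = employees;
RUN;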

When the above code is executed, we get the following output.

[Output image: read hierarchical data2]