SAS Concise Tutorial

SAS - Program Structure

SAS programming involves first creating or reading data sets and then analysing that data. To do this, we need to understand the order in which the parts of a program are written.

SAS Program Structure

The diagram below shows the steps of a SAS program in the order in which they are written.

[Figure: SAS program flow (DATA step, PROC step, OUTPUT step)]

A typical SAS program uses all of these steps to read the input data, analyse it and report the results of the analysis. Each step should end with a RUN statement, which marks the end of the step and tells SAS to execute it.

DATA Step

This step loads the required data set into SAS memory and identifies the variables (also called columns) of the data set. It also captures the records (also called observations or subjects). The syntax of the DATA step is shown below.

Syntax

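DATA data_set_name;            /* Name the data set. */
INPUT var1 var2 var3;          /* Define the variables in this data set. */
new_var = expression;          /* Create new variables. */
LABEL var1 = 'label text';     /* Assign labels to variables. */
/* Enter the data on the lines after DATALINES and end it with a semicolon. */
DATALINES;
value1 value2 value3
;
RUN;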

Example

The example below shows a simple case of naming the data set, defining the variables, creating a new variable and entering the data. In the INPUT statement, a $ after a variable name marks it as a character (string) variable; names without a $ are numeric.

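DATA TEMP;
/* A $ marks a character variable. :$10. lets DEPARTMENT hold up to 10 characters; */
/* with a plain $, list input keeps only 8 and "Operations" would be truncated. */
INPUT ID $ NAME $ SALARY DEPARTMENT :$10.;
comm = SALARY*0.25;            /* New variable: 25% of SALARY. */
LABEL ID = 'Employee ID' comm = 'Commission';
DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 Operations
3 Michelle 611 IT
4 Ryan 729 HR
5 Gary 843.25 Finance
6 Nina 578 IT
7 Simon 632.8 Operations
8 Guru 722.5 Finance
;
RUN;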
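
To check what the DATA step actually created, the data set's metadata can be inspected. The following is a minimal sketch, assuming the TEMP data set from the example above already exists in the current session; it uses the built-in CONTENTS procedure, which this tutorial does not otherwise cover.

```sas
/* List the variables of TEMP with their types, lengths and labels. */
/* VARNUM orders the listing by position in the data set instead of alphabetically. */
PROC CONTENTS DATA = TEMP VARNUM;
RUN;
```

The listing is a quick way to confirm that the variables followed by $ were read as character variables and that the labels were attached as intended.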

PROC Step

This step invokes a built-in SAS procedure to analyse the data.

Syntax

PROC procedure_name options;   /* The name of the procedure and its options. */
RUN;

Example

The example below uses the MEANS procedure to print summary statistics, including the mean, for the numeric variables in the data set.

PROC MEANS DATA = TEMP;
RUN;
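
The procedure can also be narrowed to particular statistics, variables and groups. The following is a minimal sketch, assuming the TEMP data set created in the DATA step example; the choice of statistics and the grouping by DEPARTMENT are illustrative and not part of the original example.

```sas
/* Report only the mean, minimum and maximum, rounded to 2 decimal places. */
PROC MEANS DATA = TEMP MEAN MIN MAX MAXDEC = 2;
    CLASS DEPARTMENT;      /* One set of statistics per department. */
    VAR SALARY comm;       /* Restrict the analysis to these numeric variables. */
RUN;
```

Naming the data set with DATA = makes the step independent of whichever data set happened to be created last.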

The OUTPUT Step

The data in a data set can be displayed, optionally filtered by a condition, using output statements such as PROC PRINT.

Syntax

PROC PRINT DATA = data_set;
statements;                    /* Optional statements, such as WHERE. */
RUN;

Example

The example below uses a WHERE clause to print only the records of the data set that meet a condition.

PROC PRINT DATA = TEMP;
WHERE SALARY > 700;
RUN;
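
The printed output can be shaped further with procedure options and additional statements. The following is a minimal sketch, assuming the TEMP data set from the DATA step example; the compound condition and the selected columns are illustrative choices.

```sas
/* LABEL prints variable labels as column headings; NOOBS hides the Obs column. */
PROC PRINT DATA = TEMP LABEL NOOBS;
    VAR NAME SALARY comm;                              /* Columns to print, in this order. */
    WHERE SALARY > 700 AND DEPARTMENT = 'Finance';     /* Both conditions must hold. */
RUN;
```

With the eight records entered above, only Gary and Guru satisfy both conditions.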

The complete SAS Program

Below is the complete code that combines the steps above.

[Figure: complete code]
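
As a text version, the three examples from the sections above can be assembled into one program. This is a sketch put together from those examples rather than a transcription of the figure; the :$10. informat on DEPARTMENT and the DATA = TEMP option on PROC MEANS are additions made here so that "Operations" is not truncated and the analysed data set is named explicitly.

```sas
/* DATA step: create the TEMP data set from in-stream data. */
DATA TEMP;
    /* :$10. is an addition: it lets DEPARTMENT hold up to 10 characters. */
    INPUT ID $ NAME $ SALARY DEPARTMENT :$10.;
    comm = SALARY * 0.25;
    LABEL ID = 'Employee ID' comm = 'Commission';
    DATALINES;
1 Rick 623.3 IT
2 Dan 515.2 Operations
3 Michelle 611 IT
4 Ryan 729 HR
5 Gary 843.25 Finance
6 Nina 578 IT
7 Simon 632.8 Operations
8 Guru 722.5 Finance
;
RUN;

/* PROC step: summary statistics for the numeric variables. */
PROC MEANS DATA = TEMP;
RUN;

/* OUTPUT step: print only the records with a salary above 700. */
PROC PRINT DATA = TEMP;
    WHERE SALARY > 700;
RUN;
```

With this data, the final step prints three records: Ryan, Gary and Guru.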

Program Output

[Figure: program output]