SAS Concise Tutorial
SAS - Standard Deviation
Standard deviation (SD) is a measure of how much the values in a data set vary. Mathematically, it measures how far each value lies from the mean of the data set. A standard deviation close to 0 indicates that the data points tend to be very close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.
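For reference, the usual sample standard deviation can be written as below. The symbols are notation introduced here, not taken from the tutorial: n is the number of non-missing values, x_i the i-th value, and x̄ the mean. The n − 1 divisor shown is the one PROC MEANS uses by default (VARDEF=DF).

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

When every value equals the mean, each term in the sum is 0 and so s = 0, which matches the interpretation above.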
In SAS, the SD can be computed using PROC MEANS as well as PROC SURVEYMEANS.
Using PROC MEANS
To compute the SD using PROC MEANS, we specify the STD option in the PROC statement. When no VAR statement is given, it reports the SD for each numeric variable in the data set.
Syntax
The basic syntax for calculating standard deviation in SAS is:
PROC means DATA = dataset STD;
The parameter used is described below:
- dataset: the name of the data set.
Example
In the example below, we create the data set CARS1 from the CARS data set in the SASHELP library. We then specify the STD option in the PROC MEANS step.
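/* Build CARS1 from SASHELP.CARS, keeping only Audi and BMW rows */
PROC SQL;
create table CARS1 as
SELECT make, type, invoice, horsepower, length, weight
FROM
SASHELP.CARS
WHERE make in ('Audi','BMW')
;
QUIT;
/* Report the standard deviation of every numeric variable in CARS1 */
proc means data = CARS1 STD;
run;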
When we execute the above code, it produces a table listing the standard deviation of each numeric variable in CARS1 (invoice, horsepower, length, and weight).
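To limit the report to particular variables, or to get one SD per group, PROC MEANS also accepts VAR and CLASS statements. The following is a minimal sketch, assuming CARS1 was created as in the example above. The choice of variables and the MAXDEC= setting are illustrative only.

```sas
/* Sketch: SD of two chosen variables, reported separately per make. */
/* Assumes CARS1 exists (see the PROC SQL step above).               */
proc means data = CARS1 n mean std maxdec = 2;
class make;               /* one set of statistics per value of make */
var horsepower weight;    /* only these variables are analyzed       */
run;
```

Listing N and MEAN next to STD makes it easier to judge how large the spread is relative to the typical value.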
Using PROC SURVEYMEANS
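This procedure also reports the SD, and adds some advanced features such as statistics for each level of a categorical variable and variance estimates. Note that PROC SURVEYMEANS is designed for survey sample data, and its STD keyword reports the standard deviation of the estimated total rather than the ordinary sample standard deviation.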
Syntax
The syntax for using PROC SURVEYMEANS is:
PROC SURVEYMEANS options statistic-keywords ;
BY variables ;
CLASS variables ;
VAR variables ;
The parameters used are described below:
- BY: names the variables used to create groups of observations.
- CLASS: names the variables to be treated as categorical.
- VAR: names the variables for which the SD will be calculated.
Example
The example below shows the use of the CLASS statement, which produces statistics for each value of the class variable.
proc surveymeans data = CARS1 STD;
class type;
var type horsepower;
ods output statistics = rectangle;
run;
proc print data = rectangle;
run;
When we execute the above code, it prints the contents of the rectangle data set, which holds the Statistics table produced by PROC SURVEYMEANS.
Using BY option
With the BY statement, the statistics are computed separately for each value of the BY variable. The input data set must be sorted by that variable (or indexed on it).
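The example code for this section is not present in the text, so what follows is only a sketch of what a BY-group run could look like. It assumes CARS1 was created as in the earlier example and uses make as the grouping variable. The output data set name by_stats is a hypothetical name chosen here.

```sas
/* Sketch: assumes CARS1 was created by the PROC SQL step shown earlier. */
/* BY-group processing expects the data to be sorted by the BY variable. */
proc sort data = CARS1;
by make;
run;

proc surveymeans data = CARS1 STD;
var horsepower;
by make;                           /* one set of results per make      */
ods output statistics = by_stats;  /* by_stats is an illustrative name */
run;

proc print data = by_stats;
run;
```

The printed data set should then contain one block of rows per make, identified by the BY variable.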