SAS Concise Tutorial

SAS - Correlation Analysis

Correlation analysis deals with relationships among variables. The correlation coefficient is a measure of the linear association between two variables. Its value always lies between -1 and +1. SAS provides the procedure PROC CORR to compute correlation coefficients between pairs of variables in a dataset.
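
By default PROC CORR reports the Pearson product-moment correlation. As a reminder, for n paired observations (x_i, y_i) with sample means x-bar and y-bar, the coefficient is usually written as follows (the symbols are introduced here for illustration and do not appear in the original text):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value close to +1 or -1 indicates a strong linear relationship, while a value close to 0 indicates little or no linear relationship.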

Syntax

The basic syntax for applying PROC CORR in SAS is:

PROC CORR DATA = dataset options;
VAR variable;

The parameters are described below:

  1. dataset is the name of the dataset to analyze.

  2. options are additional procedure options, such as requesting a scatter plot matrix.

  3. variable is the name of one or more variables from the dataset to use in the correlation.
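
As a minimal sketch of how these pieces fit together, the step below fills in the three placeholders. It assumes the SASHELP.CARS sample table is available; the PEARSON, SPEARMAN and NOSIMPLE options are chosen here only for illustration and are not taken from the text above.

```sas
/* Sketch: dataset = SASHELP.CARS, options = PEARSON SPEARMAN NOSIMPLE */
proc corr data = sashelp.cars pearson spearman nosimple;
   var horsepower weight;   /* variables to correlate */
run;
```

PEARSON and SPEARMAN request both types of coefficient, and NOSIMPLE suppresses the table of simple descriptive statistics.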

Example

The correlation coefficient between a pair of variables in a dataset can be obtained by listing their names in the VAR statement. In the example below we create the dataset CARS1 and compute the correlation coefficient between horsepower and weight.

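/* Build a working table with the numeric columns for Audi and BMW */
PROC SQL;
create table CARS1 as
SELECT invoice, horsepower, length, weight
   FROM
   SASHELP.CARS
   WHERE make in ('Audi','BMW')
;
QUIT;  /* PROC SQL is ended with QUIT, not RUN */

/* Correlation between horsepower and weight */
/* (no BY make here: CARS1 does not contain the make column) */
proc corr data = cars1 ;
VAR horsepower weight ;
run;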

When the above code is executed, we get the following result:

[Figure: PROC CORR output (corr ana 2)]
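
The table CARS1 above does not keep the make column, so it cannot be used for a BY-group analysis. If a separate correlation per make is wanted, one possible approach is sketched below. The table name cars_by_make is hypothetical, and the explicit sort is added because BY-group processing expects the data to be ordered by the BY variable.

```sas
/* Sketch: keep make so it can serve as a BY variable */
proc sql;
   create table cars_by_make as
   select make, horsepower, weight
   from sashelp.cars
   where make in ('Audi','BMW');
quit;

/* BY-group processing expects the data sorted by the BY variable */
proc sort data = cars_by_make;
   by make;
run;

/* One set of correlation tables is produced per make */
proc corr data = cars_by_make;
   var horsepower weight;
   by make;
run;
```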

Correlation Between All Variables

Correlation coefficients between all the numeric variables in a dataset can be obtained by running the procedure with only the dataset name, without a VAR statement.

Example

In the example below we use the dataset CARS1 and get the correlation coefficient for each pair of variables.

proc corr data = cars1 ;
run;

When the above code is executed, we get the following result:

[Figure: PROC CORR output (corr ana 1)]
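
When the coefficients are needed for further processing rather than just for display, they can be written to a dataset. The sketch below uses the OUTP= option for this; the output table name corr_out is hypothetical, and the KEEP= list reads the same four columns directly from SASHELP.CARS so that the step runs on its own.

```sas
/* Sketch: store the Pearson correlations in a dataset instead of printing them */
proc corr data = sashelp.cars(keep = invoice horsepower length weight)
          outp = corr_out noprint;
run;

/* The output table also holds means, standard deviations and counts; */
/* the _TYPE_ column identifies the rows that contain correlations    */
proc print data = corr_out;
   where _type_ = 'CORR';
run;
```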

Correlation Matrix

We can obtain a scatter plot matrix of the variables by adding the PLOTS = MATRIX option to the PROC statement.

Example

In the example below we get the scatter plot matrix for horsepower and weight.

proc corr data = cars1 plots = matrix ;
VAR horsepower weight ;
run;

When the above code is executed, we get the following result:

[Figure: scatter plot matrix from PROC CORR (corr ana 3)]
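
The plot is produced through ODS Graphics, so if no graph appears it may need to be switched on first. The sketch below does that and also uses the HISTOGRAM suboption to draw each variable's distribution on the diagonal of the matrix. The choice of three variables and the use of SASHELP.CARS directly are assumptions made only so that the example is self-contained.

```sas
/* Sketch: enable ODS Graphics, then request a scatter plot matrix */
ods graphics on;

proc corr data = sashelp.cars plots = matrix(histogram);
   where make in ('Audi','BMW');     /* same subset as CARS1 */
   var horsepower weight length;     /* one panel per pair of variables */
run;

ods graphics off;
```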