SAS - Frequency Distributions
A frequency distribution is a table showing the frequency of the data points in a data set. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way, the table summarizes the distribution of values in the sample.
SAS provides a procedure called PROC FREQ to calculate the frequency distribution of data points in a data set.
Syntax
The basic syntax for calculating a frequency distribution in SAS is:
PROC FREQ DATA = Dataset ;
TABLES Variable_1 ;
BY Variable_2 ;
RUN ;
The parameters are described below:
- Dataset is the name of the data set.
- Variable_1 is the variable (or variables) whose frequency distribution is to be calculated.
- Variable_2 is the variable by which the frequency distribution results are grouped. The data set must be sorted by this variable.
Single Variable Frequency Distribution
We can determine the frequency distribution of a single variable by using PROC FREQ. In this case the result will show the frequency of each value of the variable. The result also shows the percentage distribution, cumulative frequency and cumulative percentage.
Example
In the example below, we find the frequency distribution of the variable horsepower in a data set named CARS1, which is created from the SASHELP.CARS data set. Because of the BY statement, the result is divided into two groups, one for each make of car.
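/* Build CARS1 from SASHELP.CARS, keeping only Audi and BMW */
PROC SQL;
create table CARS1 as
SELECT make, model, type, invoice, horsepower, length, weight
FROM
SASHELP.CARS
WHERE make in ('Audi','BMW')
/* ORDER BY added so that the BY statement below sees sorted data */
ORDER BY make
;
QUIT;
/* One frequency table of horsepower per make */
proc FREQ data = CARS1 ;
tables horsepower;
by make;
run;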
When the above code is executed, the output contains one frequency table of horsepower for each make, Audi and BMW.
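For comparison, here is a minimal sketch of the plain single-variable case without a BY statement. It assumes the SASHELP.CARS sample data set is available and tabulates the type variable. The NOCUM option in the second step is a standard TABLES option that suppresses the cumulative frequency and cumulative percentage columns.

```sas
/* One-way frequency table of type over the whole data set:
   frequency, percent, cumulative frequency, cumulative percent */
proc freq data = SASHELP.CARS;
   tables type;
run;

/* Same table without the two cumulative columns */
proc freq data = SASHELP.CARS;
   tables type / nocum;
run;
```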
Multiple Variable Frequency Distribution
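We can request frequency distributions for several variables in one step. Listing several variables in the TABLES statement produces a separate frequency table for each one. Joining variables with an asterisk (for example, make*type) instead produces a cross-tabulation of all their combinations.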
Example
In the example below, we calculate the frequency distribution of the make of the cars and the frequency distribution of the type of the cars in CARS1.
proc FREQ data = CARS1 ;
tables make type;
run;
When the above code is executed, the output contains two one-way frequency tables, one for make and one for type.
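To count every combination of make and type in a single two-way table, the two variables can be joined with an asterisk. The following is a minimal sketch, assuming the CARS1 data set created earlier. The NOROW, NOCOL and NOPERCENT options are optional and only reduce each cell to its count.

```sas
/* Two-way table: rows are makes, columns are types */
proc freq data = CARS1;
   tables make*type;
run;

/* Same table showing only the cell counts */
proc freq data = CARS1;
   tables make*type / norow nocol nopercent;
run;
```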
Frequency Distribution with Weight
With the WEIGHT statement, we can calculate a frequency distribution in which each observation is weighted by the value of a variable. Each observation then contributes the value of the weight variable to the frequency instead of contributing a count of 1.
Example
In the example below, we calculate the frequency distributions of the variables make and type, weighted by horsepower. Each reported frequency is therefore the sum of horsepower over the observations in that category rather than the number of cars.
proc FREQ data = CARS1 ;
tables make type;
weight horsepower;
run;
When the above code is executed, the output again contains one table for make and one for type, this time with weighted frequencies.
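A common use of the WEIGHT statement is to tabulate data that has already been aggregated, where each row carries its own count. The following is a minimal sketch. The data set TYPE_COUNTS, the variable n and the values in it are hypothetical and made up purely for illustration.

```sas
/* Hypothetical pre-aggregated data: one row per type with its count n */
data TYPE_COUNTS;
   input type $ n;
   datalines;
Sedan 30
Wagon 5
Sports 8
;
run;

/* Each row contributes n to the frequency of its type instead of 1 */
proc freq data = TYPE_COUNTS;
   tables type;
   weight n;
run;
```

Without the WEIGHT statement, every type would be reported with a frequency of 1, because each type appears on exactly one row.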