R Concise Tutorial

R - Multiple Regression

Multiple regression extends linear regression to relationships among more than two variables. Simple linear regression has one predictor variable and one response variable, whereas multiple regression has two or more predictor variables and one response variable.

The general mathematical equation for multiple regression is −

y = a + b1x1 + b2x2 + ... + bnxn

Following is the description of the parameters used −

  1. y is the response variable.

  2. a is the intercept and b1, b2, ..., bn are the coefficients of the predictor variables.

  3. x1, x2, ..., xn are the predictor variables.
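
As a small illustration of the equation above, the sketch below evaluates y for made-up values. The numbers and the variable names a, b and x are purely illustrative and are not taken from any fitted model.

```r
# Hypothetical intercept and coefficients (illustrative values only).
a <- 2
b <- c(0.5, -1, 3)

# Hypothetical predictor values x1, x2, x3.
x <- c(4, 1, 2)

# y = a + b1*x1 + b2*x2 + b3*x3
y <- a + sum(b * x)
print(y)
```

Storing the coefficients and predictors as vectors keeps the calculation the same for any number of predictors.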

We create the regression model in R with the lm() function. The model estimates the values of the coefficients from the input data. We can then use these coefficients to predict the value of the response variable for a given set of predictor values.

lm() Function

This function creates the relationship model between the predictor variables and the response variable.

Syntax

The basic syntax for the lm() function in multiple regression is −

lm(y ~ x1+x2+x3...,data)

Following is the description of the parameters used −

  1. formula is a symbolic description of the relation between the response variable and the predictor variables.

  2. data is the data frame that contains the variables named in the formula.
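
To make the formula and data arguments concrete, here is a minimal sketch that calls lm() on a small made-up data frame. The data frame toy and its columns y, x1 and x2 are hypothetical. Because y is constructed exactly as 1 + 2*x1 + 3*x2, the fitted coefficients should recover those values up to floating point error.

```r
# Hypothetical data: two predictors that are not collinear.
toy <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# Build the response from a known linear relationship.
toy$y <- 1 + 2 * toy$x1 + 3 * toy$x2

# formula: y depends on x1 and x2; data: the data frame holding them.
fit <- lm(y ~ x1 + x2, data = toy)
print(coef(fit))
```

In R's formula notation, y ~ . is shorthand for "y against all other columns of data", so lm(y ~ ., data = toy) would fit the same model here.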

Example

Input Data

Consider the data set "mtcars" available in the R environment. It compares different car models in terms of miles per gallon ("mpg"), engine displacement ("disp"), horsepower ("hp"), weight of the car ("wt") and some more parameters.

The goal of the model is to establish the relationship between "mpg" as the response variable and "disp", "hp" and "wt" as predictor variables. For this purpose we create a subset of these variables from the mtcars data set.

input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))

When we execute the above code, it produces the following result −

                   mpg   disp   hp    wt
Mazda RX4          21.0  160    110   2.620
Mazda RX4 Wag      21.0  160    110   2.875
Datsun 710         22.8  108     93   2.320
Hornet 4 Drive     21.4  258    110   3.215
Hornet Sportabout  18.7  360    175   3.440
Valiant            18.1  225    105   3.460
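
Before fitting, it can help to check the shape of the subset and how each predictor relates to mpg. The sketch below uses only the base R functions dim() and cor(); how to interpret the correlations is left to the reader.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Number of rows (car models) and columns (variables).
print(dim(input))

# Pairwise correlations between the four variables, rounded for readability.
print(round(cor(input), 2))
```

Strongly correlated predictors are worth noting, since they can make individual coefficients harder to interpret.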

Create Relationship Model & Get the Coefficients

input <- mtcars[,c("mpg","disp","hp","wt")]

# Create the relationship model.
model <- lm(mpg~disp+hp+wt, data = input)

# Show the model.
print(model)

# Get the Intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","\n")

a <- coef(model)[1]
print(a)

Xdisp <- coef(model)[2]
Xhp <- coef(model)[3]
Xwt <- coef(model)[4]

print(Xdisp)
print(Xhp)
print(Xwt)

When we execute the above code, it produces the following result −

Call:
lm(formula = mpg ~ disp + hp + wt, data = input)

Coefficients:
(Intercept)         disp           hp           wt
  37.105505    -0.000937    -0.031157    -3.800891

# # # # The Coefficient Values # # #
(Intercept)
   37.10551
         disp
-0.0009370091
         hp
-0.03115655
       wt
-3.800891
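
The coefficients alone do not say how well the model fits the data. As a hedged sketch, summary() on the fitted model reports standard errors, p-values and R-squared. The names fit_summary and r2 are introduced here for illustration only.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# summary() returns an object holding the coefficient table and fit statistics.
fit_summary <- summary(model)

# Estimate, standard error, t value and p-value for each term.
print(fit_summary$coefficients)

# Proportion of the variance in mpg explained by the model.
r2 <- fit_summary$r.squared
print(r2)
```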

Create Equation for Regression Model

Based on the above intercept and coefficient values, we create the mathematical equation.

Y = a + Xdisp*x1 + Xhp*x2 + Xwt*x3
or
Y = 37.105505 + (-0.000937)*x1 + (-0.031157)*x2 + (-3.800891)*x3

Here x1, x2 and x3 stand for disp, hp and wt respectively.

Apply Equation for Predicting New Values

We can use the regression equation created above to predict the mileage when a new set of values for displacement, horsepower and weight is provided.

For a car with disp = 221, hp = 102 and wt = 2.91 the predicted mileage is −

Y = 37.105505 + (-0.000937)*221 + (-0.031157)*102 + (-3.800891)*2.91 = 22.6598
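
The same prediction can be obtained without writing the equation out by hand. The following is a minimal sketch using predict() with a one-row data frame; new_car, predicted and manual are names introduced here for illustration. Because predict() uses the unrounded coefficients, its result can differ slightly from a hand calculation with rounded values.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# The column names must match the predictor names used in the formula.
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)

# Predicted mpg for the new car.
predicted <- predict(model, newdata = new_car)
print(predicted)

# Cross-check: intercept plus coefficients times the predictor values.
manual <- sum(coef(model) * c(1, 221, 102, 2.91))
print(manual)
```

The two printed values should agree, which confirms that predict() applies the regression equation described above.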