R Concise Tutorial
R - Multiple Regression
Multiple regression extends linear regression to relationships among more than two variables. In simple linear regression we have one predictor and one response variable, whereas in multiple regression we have more than one predictor variable and one response variable.
The general mathematical equation for multiple regression is:
y = a + b1*x1 + b2*x2 + ... + bn*xn
Following is the description of the parameters used:
- y is the response variable.
- a is the intercept and b1, b2, ..., bn are the coefficients of the predictors.
- x1, x2, ..., xn are the predictor variables.
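To make the equation concrete, here is a minimal sketch that evaluates it for three predictors. The intercept, coefficients and predictor values are made up purely for illustration and do not come from any fitted model.

```r
# Hypothetical values, chosen only to illustrate the equation.
a <- 2.0                  # intercept
b <- c(0.5, -1.0, 3.0)    # b1, b2, b3
x <- c(4.0, 1.5, 2.0)     # x1, x2, x3

# y = a + b1*x1 + b2*x2 + b3*x3
y <- a + sum(b * x)
print(y)
```

The element-wise product b * x pairs each coefficient with its predictor, so the same line works for any number of predictors.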
We create the regression model using the lm() function in R. The model estimates the values of the coefficients from the input data. We can then use these coefficients to predict the value of the response variable for a given set of predictor values.
lm() Function
This function creates the relationship model between the predictor variables and the response variable.
Syntax
The basic syntax for the lm() function in multiple regression is:
lm(y ~ x1+x2+x3...,data)
Following is the description of the parameters used:
- formula is a symbolic description of the relation between the response variable and the predictor variables.
- data is the data frame containing the variables named in the formula.
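As a rough illustration of how the formula and the data argument fit together, the sketch below fits a model on a small made-up data frame. The column names y, x1 and x2 and all values are assumptions for this example only; the response is built from known coefficients so that the fit can be checked by eye.

```r
# Small made-up data frame; names and values are illustrative only.
df <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# Build the response from known coefficients: intercept 1, then 2 and -3.
df$y <- 1 + 2 * df$x1 - 3 * df$x2

# The names in the formula are looked up as columns of the data frame.
fit <- lm(y ~ x1 + x2, data = df)
print(coef(fit))
```

Because the response here is an exact linear function of the predictors, the estimated coefficients should match the ones used to build it, up to floating-point error.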
Example
Input Data
Consider the data set "mtcars" available in the R environment. It compares different car models in terms of miles per gallon ("mpg"), engine displacement ("disp"), horsepower ("hp"), weight of the car ("wt") and several other parameters.
The goal of the model is to establish the relationship between "mpg" as the response variable and "disp", "hp" and "wt" as predictor variables. For this purpose we create a subset of these variables from the mtcars data set.
input <- mtcars[,c("mpg","disp","hp","wt")]
print(head(input))
When we execute the above code, it produces the following result:
                   mpg disp  hp    wt
Mazda RX4         21.0  160 110 2.620
Mazda RX4 Wag     21.0  160 110 2.875
Datsun 710        22.8  108  93 2.320
Hornet 4 Drive    21.4  258 110 3.215
Hornet Sportabout 18.7  360 175 3.440
Valiant           18.1  225 105 3.460
Create the Relationship Model and Get the Coefficients
input <- mtcars[,c("mpg","disp","hp","wt")]
# Create the relationship model.
model <- lm(mpg~disp+hp+wt, data = input)
# Show the model.
print(model)
# Get the Intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","\n")
a <- coef(model)[1]
print(a)
Xdisp <- coef(model)[2]
Xhp <- coef(model)[3]
Xwt <- coef(model)[4]
print(Xdisp)
print(Xhp)
print(Xwt)
When we execute the above code, it produces the following result:
Call:
lm(formula = mpg ~ disp + hp + wt, data = input)

Coefficients:
(Intercept)         disp           hp           wt
  37.105505    -0.000937    -0.031157    -3.800891

# # # # The Coefficient Values # # #
(Intercept)
   37.10551
         disp
-0.0009370091
         hp
-0.03115655
       wt
-3.800891
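Positional indexing such as coef(model)[2] depends on the order of the terms in the formula. As a small sketch of an alternative, the same values can be looked up by name, which keeps working if the predictors are reordered. The variable names coefs, b_disp, b_hp and b_wt are chosen here for illustration.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# coef() returns a named numeric vector; the names come from the formula terms.
coefs <- coef(model)
print(names(coefs))

# Look up each value by name instead of by position.
a      <- coefs[["(Intercept)"]]
b_disp <- coefs[["disp"]]
b_hp   <- coefs[["hp"]]
b_wt   <- coefs[["wt"]]
print(c(a, b_disp, b_hp, b_wt))
```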
Create Equation for Regression Model
Based on the above intercept and coefficient values, we create the mathematical equation, where x1, x2 and x3 stand for disp, hp and wt respectively.
Y = a + Xdisp*x1 + Xhp*x2 + Xwt*x3
or
Y = 37.105505 + (-0.000937)*x1 + (-0.031157)*x2 + (-3.800891)*x3
Apply Equation for Predicting New Values
We can use the regression equation created above to predict the mileage when a new set of values for displacement, horsepower and weight is provided.
For a car with disp = 221, hp = 102 and wt = 2.91, the predicted mileage is:
Y = 37.105505 + (-0.000937)*221 + (-0.031157)*102 + (-3.800891)*2.91 = 22.6598
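The hand calculation can be cross-checked in R. The sketch below uses predict() on the fitted model and also evaluates the equation directly from the stored coefficients. Because both use the full-precision coefficients, the result may differ slightly from a calculation done with rounded values. The names new_car, predicted and manual are chosen for this example.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# New observation; column names must match the predictors in the formula.
new_car <- data.frame(disp = 221, hp = 102, wt = 2.91)

# Prediction from the fitted model.
predicted <- predict(model, newdata = new_car)
print(predicted)

# Same equation evaluated by hand: intercept * 1 plus coefficient * value.
manual <- sum(coef(model) * c(1, 221, 102, 2.91))
print(manual)
```

Using predict() avoids copying coefficients by hand, which is where rounding and transcription slips tend to creep in.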