Logistic Regression In Python 简明教程
Logistic Regression in Python - Testing
我们需要在将上述创建的分类器投入生产使用之前对其进行测试。如果测试显示模型未达到所需精度,我们必须回到上述流程中,选择另一组特征(数据字段),重新构建模型并对其进行测试。这将是一个迭代步骤,直到分类器满足所需的精度要求。因此,让我们测试我们的分类器。
We need to test the above created classifier before we put it into production use. If the testing reveals that the model does not meet the desired accuracy, we will have to go back in the above process, select another set of features (data fields), build the model again, and test it. This will be an iterative step until the classifier meets your requirement of desired accuracy. So let us test our classifier.
Predicting Test Data
为了测试分类器,我们使用前面阶段中生成测试数据。我们对创建对象调用 predict 方法,并将测试数据 X 数组传递给下列命令:
To test the classifier, we use the test data generated in the earlier stage. We call the predict method on the created object and pass the X array of the test data as shown in the following command −
In [24]: predicted_y = classifier.predict(X_test)
这将为整个训练数据集生成一个一维数组,提供 X 数组中每一行预测。你可使用以下命令检查此数组:
This generates a single dimensional array for the entire training data set giving the prediction for each row in the X array. You can examine this array by using the following command −
In [25]: predicted_y
以下是执行以上两个命令的输出:
The following is the output upon the execution the above two commands −
Out[25]: array([0, 0, 0, ..., 0, 0, 0])
输出表明第一个和最后三个客户并不是 Term Deposit 的潜在候选人。你可以检查整个数组以找出潜在客户。为此,请使用以下 Python 代码段:
The output indicates that the first and last three customers are not the potential candidates for the Term Deposit. You can examine the entire array to sort out the potential customers. To do so, use the following Python code snippet −
In [26]: for x in range(len(predicted_y)):
if (predicted_y[x] == 1):
print(x, end="\t")
运行以上代码的输出显示如下:
The output of running the above code is shown below −
输出显示所有行的索引,这些行是订阅定期存款的潜在候选人。你现在可以将此输出交给银行的营销团队,由该团队获取所选行中每个客户的联系方式并继续其工作。
The output shows the indexes of all rows who are probable candidates for subscribing to TD. You can now give this output to the bank’s marketing team who would pick up the contact details for each customer in the selected row and proceed with their job.
在我们把这个模型投入生产之前,我们需要验证预测精度。
Before we put this model into production, we need to verify the accuracy of prediction.
Verifying Accuracy
要测试模型的精度,请在分类器上使用以下所示的得分方法:
To test the accuracy of the model, use the score method on the classifier as shown below −
In [27]: print('Accuracy: {:.2f}'.format(classifier.score(X_test, Y_test)))
运行此命令的屏幕输出如下所示:
The screen output of running this command is shown below −
Accuracy: 0.90
它显示我们模型的精度为 90%,在大多数应用程序中这被认为非常好。因此,不需要进一步调整。现在,我们的客户已准备好运行下一个活动,获得潜在客户列表,并追逐他们开设定期存款,成功率可能很高。
It shows that the accuracy of our model is 90% which is considered very good in most of the applications. Thus, no further tuning is required. Now, our customer is ready to run the next campaign, get the list of potential customers and chase them for opening the TD with a probable high rate of success.