Machine Learning: A Concise Tutorial

Machine Learning - Precision and Recall

Precision and recall are two important metrics for evaluating classification models in machine learning. They are particularly useful on imbalanced datasets, where one class has far fewer instances than the other and accuracy alone can be misleading.

Precision measures how many of the positive predictions made by a classifier are correct. It is defined as the ratio of true positives (TP) to the total number of positive predictions, that is, true positives plus false positives (TP + FP).

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall, on the other hand, measures how many of the actual positive instances the classifier correctly identifies. It is defined as the ratio of true positives (TP) to the total number of actual positive instances, that is, true positives plus false negatives (TP + FN).

$$\text{Recall} = \frac{TP}{TP + FN}$$
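As a quick check on the two ratios above, here is a minimal sketch in plain Python. The function names and the counts are hypothetical, chosen only for illustration, and returning 0.0 when a denominator is zero is a convention assumed here rather than part of the definitions.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    predicted_positive = tp + fp
    # The ratio is undefined when nothing was predicted positive;
    # returning 0.0 in that case is an assumption made for this sketch.
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    actual_positive = tp + fn
    # Same convention as above when there are no actual positives.
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts, invented for illustration only.
tp, fp, fn = 8, 2, 4

p = precision(tp, fp)  # 8 / (8 + 2)
r = recall(tp, fn)     # 8 / (8 + 4)

print(f"Precision: {p:.3f}")  # Precision: 0.800
print(f"Recall: {r:.3f}")     # Recall: 0.667
```

Note that the true negative count does not appear in either formula, which is why both metrics stay informative when negatives vastly outnumber positives.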

To understand precision and recall, consider the problem of detecting spam emails. A classifier labels each email as spam (positive prediction) or not spam (negative prediction), and the actual label of the email is likewise either spam or not spam. If the email is actually spam and the classifier labels it as spam, it is a true positive. If the email is not spam but the classifier labels it as spam, it is a false positive. If the email is actually spam but the classifier labels it as not spam, it is a false negative. Finally, if the email is not spam and the classifier labels it as not spam, it is a true negative.
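The four outcomes described above can be tallied with a few lines of Python. This is only a sketch: the outcome() helper and the two label lists are hypothetical and made up for illustration.

```python
from collections import Counter


def outcome(actual: str, predicted: str, positive: str = "spam") -> str:
    """Classify one (actual, predicted) pair as TP, FP, FN or TN."""
    if predicted == positive:
        # The classifier flagged the email: right (TP) or wrong (FP).
        return "TP" if actual == positive else "FP"
    # The classifier let the email through: missed spam (FN) or correct (TN).
    return "FN" if actual == positive else "TN"


# Hypothetical labels for eight emails, invented for illustration only.
actual = ["spam", "spam", "not spam", "not spam",
          "spam", "not spam", "not spam", "spam"]
predicted = ["spam", "not spam", "not spam", "spam",
             "spam", "not spam", "not spam", "spam"]

counts = Counter(outcome(a, p) for a, p in zip(actual, predicted))

for name in ("TP", "FP", "FN", "TN"):
    print(name, counts[name])
# TP 3
# FP 1
# FN 1
# TN 3
```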

In this scenario, precision is the proportion of emails flagged as spam that really are spam. High precision means that when the classifier labels an email as spam it is usually right, so few legitimate emails are flagged. Recall, on the other hand, is the proportion of all actual spam emails that the classifier catches. High recall means that most spam is caught, even if some legitimate emails are flagged along the way.
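The tension between the two metrics can be illustrated with a small sketch. It assumes a classifier that outputs a spam score per email and flags the email when the score reaches a threshold; the scores, labels, thresholds, and the precision_recall_at() helper are all hypothetical and not taken from any real model.

```python
def precision_recall_at(threshold, scores, is_spam):
    """Return (precision, recall) when emails scoring >= threshold are flagged."""
    tp = fp = fn = 0
    for score, spam in zip(scores, is_spam):
        flagged = score >= threshold
        if flagged and spam:
            tp += 1          # spam that was caught
        elif flagged and not spam:
            fp += 1          # legitimate email flagged by mistake
        elif spam:
            fn += 1          # spam that slipped through
    # Returning 0.0 for an empty denominator is a convention assumed here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical spam scores and true labels, invented for illustration only.
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
is_spam = [True, True, False, True, False, True, False, False]

strict = precision_recall_at(0.85, scores, is_spam)   # flags only 2 emails
lenient = precision_recall_at(0.35, scores, is_spam)  # flags 6 emails

print("strict :", strict)   # strict : (1.0, 0.5)
print("lenient:", lenient)  # precision 4/6, recall 1.0
```

With the strict threshold every flagged email is spam, but half of the spam is missed. With the lenient threshold all spam is caught, at the cost of two legitimate emails being flagged.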

Implementation in Python

In scikit-learn, precision and recall can be calculated using the precision_score() and recall_score() functions, respectively. Each function takes the true labels and the predicted labels for a set of instances as input and returns the corresponding score.
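Before moving to a real dataset, it may help to call the two functions on a handful of labels that can be checked by hand. The label lists below are hypothetical; pos_label is a parameter of both functions that selects which class counts as positive and defaults to 1.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels, invented for illustration only (1 = positive class).
y_true = [1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]
# By hand for class 1: TP = 4, FP = 1, FN = 1.

precision = precision_score(y_true, y_pred)  # 4 / (4 + 1)
recall = recall_score(y_true, y_pred)        # 4 / (4 + 1)
print("Precision:", precision)
print("Recall:", recall)

# Treat class 0 as the positive class instead.
# By hand for class 0: TP = 2, FP = 1, FN = 1.
precision_0 = precision_score(y_true, y_pred, pos_label=0)  # 2 / (2 + 1)
recall_0 = recall_score(y_true, y_pred, pos_label=0)        # 2 / (2 + 1)
print("Precision (class 0):", precision_0)
print("Recall (class 0):", recall_0)
```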

For example, the following code snippet uses the breast cancer dataset from scikit-learn to train a logistic regression classifier and evaluate its precision and recall scores:

Example

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

# Load the breast cancer dataset
data = load_breast_cancer()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Train a logistic regression classifier
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = clf.predict(X_test)

# Calculate precision and recall scores
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print("Precision:", precision)
print("Recall:", recall)

In the above example, we first load the breast cancer dataset and split it into training and testing sets. We then train a logistic regression classifier on the training set and make predictions on the testing set using the predict() method. Finally, we calculate the precision and recall scores using the precision_score() and recall_score() functions. Both functions treat label 1 as the positive class by default (pos_label=1); in this dataset label 1 is the benign class, so the scores describe how well benign tumors are identified.

When you execute this code, it produces output like the following (exact values can vary with the scikit-learn version):

Precision: 0.9459459459459459
Recall: 0.9859154929577465
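One way to see where these numbers come from is to rebuild them from the confusion matrix. The sketch below repeats the same setup as the example and is a cross-check, not a different method; the variable names are chosen here for illustration. It also reports the scores with the malignant class (label 0) as positive, which may be the more relevant view for a screening task.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Same data, split and model settings as the example above.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# For binary labels 0/1, ravel() yields the counts in the order TN, FP, FN, TP,
# with label 1 (benign in this dataset) treated as the positive class.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

manual_precision = tp / (tp + fp)
manual_recall = tp / (tp + fn)
print("Precision from counts:", manual_precision)
print("Recall from counts:", manual_recall)

# Scores with the malignant class (label 0) as the positive class.
malignant_precision = precision_score(y_test, y_pred, pos_label=0)
malignant_recall = recall_score(y_test, y_pred, pos_label=0)
print("Precision (malignant):", malignant_precision)
print("Recall (malignant):", malignant_recall)
```

The values computed from the counts should match what precision_score() and recall_score() return with their default settings.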