Microsoft Cognitive Toolkit Tutorial
CNTK - Logistic Regression Model
This chapter deals with constructing a logistic regression model in CNTK.
Basics of Logistic Regression model
Logistic Regression is one of the simplest machine-learning techniques, used especially for binary classification, that is, for building a prediction model in situations where the variable to be predicted can take one of just two categorical values. One of the simplest examples of logistic regression is predicting whether a person is male or female based on the person's age, voice, hair, and so on.
Example
Let's understand the concept of logistic regression mathematically with the help of another example −
Suppose we want to predict the creditworthiness of a loan application based on the applicant's debt, income and credit rating; 0 means reject, and 1 means approve. We represent debt with X1, income with X2 and credit rating with X3.
In logistic regression, we determine a weight value, represented by w, for every feature, and a single bias value, represented by b.
Now suppose,
X1 = 3.0
X2 = -2.0
X3 = 1.0
And suppose we determine the weights and bias as follows −
W1 = 0.65, W2 = 1.75, W3 = 2.05 and b = 0.33
Now, to predict the class, we need to apply the following formula −
Z = (X1*W1) + (X2*W2) + (X3*W3) + b
i.e. Z = (3.0)*(0.65) + (-2.0)*(1.75) + (1.0)*(2.05) + 0.33
= 0.83
Next, we need to compute P = 1.0/(1.0 + exp(-Z)). Here, exp() is the exponential function, i.e. Euler's number e raised to the power of its argument.
P = 1.0/(1.0 + exp(-0.83))
= 0.6963
The P value can be interpreted as the probability that the class is 1. If P < 0.5, the prediction is class = 0; otherwise (P >= 0.5), the prediction is class = 1.
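The arithmetic above can be checked with a few lines of plain Python; the following is a minimal sketch of this worked example, using the values of X, W and b chosen above −
import math

X = [3.0, -2.0, 1.0]                 # X1, X2, X3
W = [0.65, 1.75, 2.05]               # W1, W2, W3
b = 0.33

Z = sum(x * w for x, w in zip(X, W)) + b   # Z = 0.83
P = 1.0 / (1.0 + math.exp(-Z))             # P = 0.6963...
pred_class = 0 if P < 0.5 else 1           # class = 1
print(Z, P, pred_class)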
To determine the values of the weights and the bias, we must obtain a set of training data with known input predictor values and known, correct class label values. After that, we can use an algorithm, generally gradient descent, to find the values of the weights and the bias.
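As an illustration only (CNTK will do this for us below), here is a minimal NumPy sketch of gradient descent for logistic regression; the function name train_logreg and its parameters are our own, not part of CNTK −
import numpy as np

def train_logreg(X, y, lr=0.01, epochs=1000):
    # X: (N, d) feature matrix, y: (N,) labels in {0, 1}
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)          # gradient of mean cross-entropy w.r.t. w
        grad_b = np.mean(p - y)                  # gradient w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b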
LR model implementation example
For this LR model, we are going to use the following data set −
1.0, 2.0, 0
3.0, 4.0, 0
5.0, 2.0, 0
6.0, 3.0, 0
8.0, 1.0, 0
9.0, 2.0, 0
1.0, 4.0, 1
2.0, 5.0, 1
4.0, 6.0, 1
6.0, 5.0, 1
7.0, 3.0, 1
8.0, 5.0, 1
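The training script below reads this data from a file named dataLRmodel.txt. If you do not already have the file, one way to create it in the working directory is the following sketch −
# write the twelve rows above to dataLRmodel.txt
rows = """1.0, 2.0, 0
3.0, 4.0, 0
5.0, 2.0, 0
6.0, 3.0, 0
8.0, 1.0, 0
9.0, 2.0, 0
1.0, 4.0, 1
2.0, 5.0, 1
4.0, 6.0, 1
6.0, 5.0, 1
7.0, 3.0, 1
8.0, 5.0, 1
"""
with open("dataLRmodel.txt", "w") as f:
    f.write(rows)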
To start this LR model implementation in CNTK, we first need to import the following packages −
import numpy as np
import cntk as C
The program is structured around a main() function, as follows −
def main():
    print("Using CNTK version = " + str(C.__version__) + "\n")
Now, we need to load the training data into memory as follows −
data_file = ".\\dataLRmodel.txt"
print("Loading data from " + data_file + "\n")
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)
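As an optional sanity check (not part of the original program), the shapes of the two matrices can be printed −
print(features_mat.shape)   # expected: (12, 2)
print(labels_mat.shape)     # expected: (12, 1)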
Now, we will create a training program that creates a logistic regression model compatible with the training data −
features_dim = 2
labels_dim = 1
X = C.ops.input_variable(features_dim, np.float32)
y = C.input_variable(labels_dim, np.float32)
W = C.parameter(shape=(features_dim, 1)) # trainable cntk.Parameter
b = C.parameter(shape=(labels_dim))
z = C.times(X, W) + b
p = 1.0 / (1.0 + C.exp(-z))
model = p
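For reference, CNTK also ships a built-in sigmoid op, so the hand-written expression above could equivalently be written as −
p = C.sigmoid(z)   # same computation as 1.0 / (1.0 + C.exp(-z))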
Now, we need to create the learner and the trainer as follows −
ce_error = C.binary_cross_entropy(model, y) # CE a bit more principled for LR
fixed_lr = 0.010
learner = C.sgd(model.parameters, fixed_lr)
trainer = C.Trainer(model, (ce_error), [learner])
max_iterations = 4000
LR Model training
Once we have created the LR model, it is time to start the training process −
np.random.seed(4)
N = len(features_mat)
for i in range(0, max_iterations):
    row = np.random.choice(N, 1)   # pick a random row from training items
    trainer.train_minibatch({ X: features_mat[row], y: labels_mat[row] })
    if i % 1000 == 0 and i > 0:
        mcee = trainer.previous_minibatch_loss_average
        print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)
Now, with the help of the following code, we can print the model weights and bias −
np.set_printoptions(precision=4, suppress=True)
print("Model weights: ")
print(W.value)
print("Model bias:")
print(b.value)
print("")
if __name__ == "__main__":
    main()
Training a Logistic Regression model - Complete example
import numpy as np
import cntk as C

def main():
    print("Using CNTK version = " + str(C.__version__) + "\n")

    data_file = ".\\dataLRmodel.txt"   # provide the name and the location of data file
    print("Loading data from " + data_file + "\n")
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)

    features_dim = 2
    labels_dim = 1
    X = C.ops.input_variable(features_dim, np.float32)
    y = C.input_variable(labels_dim, np.float32)
    W = C.parameter(shape=(features_dim, 1))   # trainable cntk.Parameter
    b = C.parameter(shape=(labels_dim))
    z = C.times(X, W) + b
    p = 1.0 / (1.0 + C.exp(-z))
    model = p

    ce_error = C.binary_cross_entropy(model, y)   # CE a bit more principled for LR
    fixed_lr = 0.010
    learner = C.sgd(model.parameters, fixed_lr)
    trainer = C.Trainer(model, (ce_error), [learner])
    max_iterations = 4000

    np.random.seed(4)
    N = len(features_mat)
    for i in range(0, max_iterations):
        row = np.random.choice(N, 1)   # pick a random row from training items
        trainer.train_minibatch({ X: features_mat[row], y: labels_mat[row] })
        if i % 1000 == 0 and i > 0:
            mcee = trainer.previous_minibatch_loss_average
            print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)

    np.set_printoptions(precision=4, suppress=True)
    print("Model weights: ")
    print(W.value)
    print("Model bias:")
    print(b.value)

if __name__ == "__main__":
    main()
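Once training finishes, the trained function can be persisted with CNTK's standard save/load API; a short sketch (the file name lr.model is our own choice) −
model.save("lr.model")                   # serialize the trained function to disk
loaded = C.load_model("lr.model")        # reload it later for inference
print(loaded.eval(features_mat[0:1]))    # probability for the first training row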
Prediction using trained LR Model
Once the LR model has been trained, we can use it for prediction as follows −
First of all, our evaluation program imports the numpy package and loads the training data into a feature matrix and a class label matrix in the same way as the training program implemented above −
import numpy as np

def main():
    data_file = ".\\dataLRmodel.txt"   # provide the name and the location of data file
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
        skiprows=0, usecols=(0,1))
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
        skiprows=0, usecols=[2], ndmin=2)
Next, it is time to set the values of the weights and the bias that were determined by our training program −
print("Setting weights and bias values \n")
weights = np.array([0.0925, 1.1722], dtype=np.float32)
bias = np.array([-4.5400], dtype=np.float32)
N = len(features_mat)
features_dim = 2
Next, our evaluation program will compute the logistic regression probability by walking through each training item as follows −
print("item pred_prob pred_label act_label result")
for i in range(0, N): # each item
x = features_mat[i]
z = 0.0
for j in range(0, features_dim):
z += x[j] * weights[j]
z += bias[0]
pred_prob = 1.0 / (1.0 + np.exp(-z))
pred_label = 0 if pred_prob < 0.5 else 1
act_label = labels_mat[i]
pred_str = ‘correct’ if np.absolute(pred_label - act_label) < 1.0e-5 \
else ‘WRONG’
print("%2d %0.4f %0.0f %0.0f %s" % \ (i, pred_prob, pred_label, act_label, pred_str))
Now let us demonstrate how to do a prediction −
x = np.array([9.5, 4.5], dtype=np.float32)
print("\nPredicting class for age, education = ")
print(x)
z = 0.0
for j in range(0, features_dim):
z += x[j] * weights[j]
z += bias[0]
p = 1.0 / (1.0 + np.exp(-z))
print("Predicted p = " + str(p))
if p < 0.5: print("Predicted class = 0")
else: print("Predicted class = 1")
Complete prediction evaluation program
import numpy as np

def main():
    data_file = ".\\dataLRmodel.txt"   # provide the name and the location of data file
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
        skiprows=0, usecols=(0,1))
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
        skiprows=0, usecols=[2], ndmin=2)

    print("Setting weights and bias values \n")
    weights = np.array([0.0925, 1.1722], dtype=np.float32)
    bias = np.array([-4.5400], dtype=np.float32)
    N = len(features_mat)
    features_dim = 2

    print("item pred_prob pred_label act_label result")
    for i in range(0, N):   # each item
        x = features_mat[i]
        z = 0.0
        for j in range(0, features_dim):
            z += x[j] * weights[j]
        z += bias[0]
        pred_prob = 1.0 / (1.0 + np.exp(-z))
        pred_label = 0 if pred_prob < 0.5 else 1
        act_label = labels_mat[i]
        pred_str = 'correct' if np.absolute(pred_label - act_label) < 1.0e-5 \
            else 'WRONG'
        print("%2d %0.4f %0.0f %0.0f %s" % (i, pred_prob, pred_label, act_label, pred_str))

    x = np.array([9.5, 4.5], dtype=np.float32)
    print("\nPredicting class for age, education = ")
    print(x)
    z = 0.0
    for j in range(0, features_dim):
        z += x[j] * weights[j]
    z += bias[0]
    p = 1.0 / (1.0 + np.exp(-z))
    print("Predicted p = " + str(p))
    if p < 0.5: print("Predicted class = 0")
    else: print("Predicted class = 1")

if __name__ == "__main__":
    main()
Output
Setting weights and bias values
Item pred_prob pred_label act_label result
0 0.3640 0 0 correct
1 0.7254 1 0 WRONG
2 0.2019 0 0 correct
3 0.3562 0 0 correct
4 0.0493 0 0 correct
5 0.1005 0 0 correct
6 0.7892 1 1 correct
7 0.8564 1 1 correct
8 0.9654 1 1 correct
9 0.7587 1 1 correct
10 0.3040 0 1 WRONG
11 0.7129 1 1 correct
Predicting class for age, education =
[9.5 4.5]
Predicted p = 0.526487952
Predicted class = 1