Caffe2 Concise Tutorial
Caffe2 - Creating Your Own Network
In this lesson, you will learn to define a single-layer neural network (NN) in Caffe2 and run it on a randomly generated dataset. We will write code to depict the network architecture graphically and to print the input, output, weight, and bias values. To follow this lesson, you should be familiar with neural network architectures, their terminology, and the underlying mathematics.
Network Architecture
Suppose we want to build a single-layer NN as shown in the figure below:
[Figure: network architecture]
Mathematically, this network is represented by the following equation, where * denotes matrix multiplication and W^T is the transpose of W:
Y = X * W^T + b
Here X, W, and b are tensors and Y is the output. We will fill all three tensors with data (random values for X and W, a constant for b), run the network, and examine the output Y. To define the network and tensors, Caffe2 provides several operator functions.
Caffe2 Operators
In Caffe2, an Operator is the basic unit of computation. A Caffe2 Operator is represented as follows.
[Figure: caffe operators]
Caffe2 provides an extensive list of operators. For the network that we are designing, we will use the operator called FC, which computes the result of passing an input tensor X through a fully connected layer with a two-dimensional weight matrix W and a one-dimensional bias vector b. In other words, it computes the following equation:
Y = X * W^T + b
Here X has dimensions (M x k), W has dimensions (n x k), and b has dimensions (1 x n). The output Y has dimensions (M x n), where M is the batch size.
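The shape rule above can be checked with a small stand-in written in plain Python. This is only a sketch of the arithmetic, not Caffe2 code: fc is a hypothetical helper, and the values of X, W, and b are made up so that the result is easy to verify by hand (M = 2, k = 3, n = 5).

```python
# Plain-Python stand-in for the FC computation Y = X * W^T + b.
# fc() is a hypothetical helper for illustration, not a Caffe2 function.

def fc(X, W, b):
    """Return Y with Y[i][j] = dot(X[i], W[j]) + b[j]."""
    k = len(X[0])
    assert all(len(row) == k for row in X), "X must be M x k"
    assert all(len(row) == k for row in W), "W must be n x k"
    assert len(b) == len(W), "b must have n entries"
    return [
        [sum(x * w for x, w in zip(x_row, w_row)) + b_j
         for w_row, b_j in zip(W, b)]
        for x_row in X
    ]

# M = 2 samples, k = 3 input features, n = 5 output units.
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0],
     [0.0, 0.0, 0.0]]
b = [1.0] * 5

Y = fc(X, W, b)
print("Y is {} x {}".format(len(Y), len(Y[0])))  # M x n
for row in Y:
    print(row)
```

Each row of W produces one output column, which is why W is stored as (n x k) and appears transposed in the equation.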
For the tensors X and W, we will use the GaussianFill operator to create some random data. To generate the bias values b, we will use the ConstantFill operator.
We will now proceed to define our network.
Creating Network
First, import the required packages:
from caffe2.python import core, workspace
Next, define the network by calling core.Net as follows:
net = core.Net("SingleLayerFC")
The name of the network is specified as SingleLayerFC. At this point, the network object called net has been created. It does not contain any layers so far.
Creating Tensors
We will now create the three tensors required by our network. First, we create the X tensor by calling the GaussianFill operator as follows:
X = net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0)
The X tensor has dimensions 2 x 3, with a mean data value of 0.0 and a standard deviation of 1.0.
Likewise, we create the W tensor as follows:
W = net.GaussianFill([], ["W"], mean=0.0, std=1.0, shape=[5, 3], run_once=0)
The W tensor is of size 5 x 3.
Finally, we create the bias vector b of size 5:
b = net.ConstantFill([], ["b"], shape=[5,], value=1.0, run_once=0)
Now comes the most important part of the code: defining the network itself.
Defining Network
We define the network in the following Python statement:
Y = X.FC([W, b], ["Y"])
We call the FC operator on the input data X. The weights are specified in W and the bias in b. The output is Y. Alternatively, you may add the operator with the following, more verbose statement; use one form or the other, not both, since each call adds an FC operator to the net:
Y = net.FC([X, W, b], ["Y"])
At this point, the network has only been defined. Until we run the network at least once, it will not contain any data. Before running the network, we will examine its architecture.
Printing Network Architecture
Caffe2 stores the network definition as a protocol buffer (protobuf) message, which can be examined by calling the Proto method on the created net object:
print(net.Proto())
This produces the following output:
name: "SingleLayerFC"
op {
  output: "X"
  name: ""
  type: "GaussianFill"
  arg {
    name: "mean"
    f: 0.0
  }
  arg {
    name: "std"
    f: 1.0
  }
  arg {
    name: "shape"
    ints: 2
    ints: 3
  }
  arg {
    name: "run_once"
    i: 0
  }
}
op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg {
    name: "mean"
    f: 0.0
  }
  arg {
    name: "std"
    f: 1.0
  }
  arg {
    name: "shape"
    ints: 5
    ints: 3
  }
  arg {
    name: "run_once"
    i: 0
  }
}
op {
  output: "b"
  name: ""
  type: "ConstantFill"
  arg {
    name: "shape"
    ints: 5
  }
  arg {
    name: "value"
    f: 1.0
  }
  arg {
    name: "run_once"
    i: 0
  }
}
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}
As you can see in the above listing, it first defines the operators that produce X, W, and b. Let us examine the definition of W as an example. The type of the operator is GaussianFill. The mean is defined as float 0.0, the standard deviation is defined as float 1.0, and the shape is 5 x 3.
op {
  output: "W"
  name: ""
  type: "GaussianFill"
  arg {
    name: "mean"
    f: 0.0
  }
  arg {
    name: "std"
    f: 1.0
  }
  arg {
    name: "shape"
    ints: 5
    ints: 3
  }
  ...
}
Examine the definitions of X and b in the same way. Finally, let us look at the definition of our single-layer network, which is reproduced here:
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}
Here, the operator type is FC (Fully Connected), with X, W, and b as inputs and Y as the output. This network definition is verbose, and for large networks it becomes tedious to examine. Fortunately, Caffe2 provides a graphical representation of the created networks.
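Short of drawing the graph, a verbose listing like the one printed by net.Proto() can also be condensed to one line per operator. The following is a minimal sketch that scans the printed text with a regular expression; summarize_ops is a hypothetical helper, the listing is an abbreviated copy of the output above (arg blocks left out), and the scan only understands the input, output, and type lines, not the full protobuf text format.

```python
import re

# Abbreviated copy of the listing printed by net.Proto() (arg blocks omitted).
LISTING = '''
name: "SingleLayerFC"
op {
  output: "X"
  name: ""
  type: "GaussianFill"
}
op {
  output: "W"
  name: ""
  type: "GaussianFill"
}
op {
  output: "b"
  name: ""
  type: "ConstantFill"
}
op {
  input: "X"
  input: "W"
  input: "b"
  output: "Y"
  name: ""
  type: "FC"
}
'''

def summarize_ops(text):
    """Collect the type, inputs, and outputs of every op block.

    Hypothetical helper for illustration: a line-by-line scan, not a
    real protobuf parser.
    """
    ops = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "op {":
            # A new operator block starts here.
            ops.append({"type": None, "inputs": [], "outputs": []})
            continue
        match = re.fullmatch(r'(input|output|type): "(.*)"', line)
        if match and ops:
            key, value = match.groups()
            if key == "type":
                ops[-1]["type"] = value
            else:
                ops[-1][key + "s"].append(value)
    return ops

summary = summarize_ops(LISTING)
for op in summary:
    print("{}({}) -> {}".format(
        op["type"], ", ".join(op["inputs"]), ", ".join(op["outputs"])))
```

With Caffe2 installed, the same fields should be readable directly from the message (each entry of net.Proto().op has type, input, and output), which avoids parsing text altogether.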
Network Graphical Representation
To get the graphical representation of the network, run the following code snippet, which, apart from the imports, is only two lines of Python code. It is meant to be run in a Jupyter (IPython) notebook and relies on the pydot package and Graphviz being installed.
from caffe2.python import net_drawer
from IPython import display
graph = net_drawer.GetPydotGraph(net, rankdir="LR")
display.Image(graph.create_png(), width=800)
When you run the code, you will see the following output:
[Figure: graphical representation]
For large networks, the graphical representation becomes extremely useful for visualizing the network and debugging errors in its definition.
Finally, it is time to run the network.
Running Network
You run the network by calling the RunNetOnce function of the workspace module:
workspace.RunNetOnce(net)
Running the network once generates all our random data, feeds it into the network, and produces the output. The tensors created by running the network are called blobs in Caffe2. The workspace holds the blobs you create and store in memory. This is quite similar to the workspace in MATLAB.
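The define-then-run behaviour described here can be mimicked with a toy model in plain Python. This is only a sketch of the idea and not how Caffe2 is implemented: ToyNet, run_net_once, gaussian_fill, and the blobs dictionary are all hypothetical names standing in for the net, workspace.RunNetOnce, the GaussianFill operator, and the workspace.

```python
import random

class ToyNet:
    """Records operators when they are defined; computes nothing until run."""

    def __init__(self, name):
        self.name = name
        self.ops = []  # list of (function, input blob names, output blob name)

    def add_op(self, func, inputs, output):
        self.ops.append((func, inputs, output))
        return output  # hand back the name of the blob this op will produce


def run_net_once(net, blobs):
    """Execute the recorded operators in order, storing each result by name."""
    for func, inputs, output in net.ops:
        args = [blobs[name] for name in inputs]
        blobs[output] = func(*args)


def gaussian_fill(rows, cols):
    """Return a zero-argument op that creates a rows x cols random matrix."""
    def fill():
        return [[random.gauss(0.0, 1.0) for _ in range(cols)]
                for _ in range(rows)]
    return fill


def fc(X, W, b):
    """Y = X * W^T + b on nested lists."""
    return [[sum(x * w for x, w in zip(x_row, w_row)) + b_j
             for w_row, b_j in zip(W, b)]
            for x_row in X]


blobs = {}  # toy workspace: blob name -> value
net = ToyNet("SingleLayerFC")
net.add_op(gaussian_fill(2, 3), [], "X")
net.add_op(gaussian_fill(5, 3), [], "W")
net.add_op(lambda: [1.0] * 5, [], "b")
net.add_op(fc, ["X", "W", "b"], "Y")

blobs_before = sorted(blobs)  # defining the net created no data
run_net_once(net, blobs)
blobs_after = sorted(blobs)   # every operator output is now a blob

print("before run:", blobs_before)
print("after run: ", blobs_after)
```

The point of the sketch is the separation of the two phases: adding operators only records what should happen, and the blobs appear in the workspace once the recorded operators are executed.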
After running the network, you can examine the blobs that the workspace contains using the following print command:
print("Blobs in the workspace: {}".format(workspace.Blobs()))
You will see the following output:
Blobs in the workspace: ['W', 'X', 'Y', 'b']
Note that the workspace contains the three input blobs X, W, and b. It also contains the output blob called Y. Let us now examine the contents of these blobs.
for name in workspace.Blobs():
    print("{}:\n{}".format(name, workspace.FetchBlob(name)))
You will see the following output:
W:
[[ 1.0426593 0.15479846 0.25635982]
[-2.2461145 1.4581774 0.16827184]
[-0.12009818 0.30771437 0.00791338]
[ 1.2274994 -0.903331 -0.68799865]
[ 0.30834186 -0.53060573 0.88776857]]
X:
[[ 1.6588869e+00 1.5279824e+00 1.1889904e+00]
[ 6.7048723e-01 -9.7490678e-04 2.5114202e-01]]
Y:
[[ 3.2709925 -0.297907 1.2803618 0.837985 1.7562964]
[ 1.7633215 -0.4651525 0.9211631 1.6511179 1.4302125]]
b:
[1. 1. 1. 1. 1.]
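As a sanity check, the printed Y can be recomputed from the printed X, W, and b using Y = X * W^T + b. The sketch below copies the values from the output above and uses plain Python rather than Caffe2; the tolerance of 1e-4 is an assumption that allows for the rounding of the printed single-precision values.

```python
# Values copied from the blob contents printed above.
W = [[1.0426593, 0.15479846, 0.25635982],
     [-2.2461145, 1.4581774, 0.16827184],
     [-0.12009818, 0.30771437, 0.00791338],
     [1.2274994, -0.903331, -0.68799865],
     [0.30834186, -0.53060573, 0.88776857]]
X = [[1.6588869e+00, 1.5279824e+00, 1.1889904e+00],
     [6.7048723e-01, -9.7490678e-04, 2.5114202e-01]]
Y = [[3.2709925, -0.297907, 1.2803618, 0.837985, 1.7562964],
     [1.7633215, -0.4651525, 0.9211631, 1.6511179, 1.4302125]]
b = [1.0, 1.0, 1.0, 1.0, 1.0]

# Recompute Y[i][j] = dot(X[i], W[j]) + b[j].
recomputed = [[sum(x * w for x, w in zip(x_row, w_row)) + b_j
               for w_row, b_j in zip(W, b)]
              for x_row in X]

# Largest absolute difference between the recomputed and printed values.
max_error = max(abs(r - y)
                for r_row, y_row in zip(recomputed, Y)
                for r, y in zip(r_row, y_row))

print("largest difference:", max_error)
print("matches printed Y:", max_error < 1e-4)
```

For example, the first entry is 1.6588869 * 1.0426593 + 1.5279824 * 0.15479846 + 1.1889904 * 0.25635982 + 1, which is approximately 3.27099, in agreement with the printed value.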
Note that the data will differ on your machine, and in fact on every run of the network, because X and W are created at random. You have now successfully defined a network and run it on your computer.