Apache MXNet - Python API Symbol
In this chapter, we will learn about an interface of MXNet called Symbol.
mxnet.symbol
Apache MXNet's Symbol API is an interface for symbolic programming. The Symbol API features the use of the following −
- Computational graphs
- Reduced memory usage
- Pre-use function optimization
The example given below shows how one can create a simple expression by using MXNet's Symbol API −
import mxnet as mx
# Two placeholders, namely x and y, will be created with mx.sym.Variable
x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
# The symbol z is constructed using the plus '+' operator
z = x + y
z
Output
You will see the following output −
<Symbol _plus0>
Example
(x, y, z)
Output
The output is given below −
(<Symbol x>, <Symbol y>, <Symbol _plus0>)
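At this point z only describes the computation. As a minimal sketch (assuming a CPU context), it can be evaluated with concrete inputs through eval() −

z.eval(ctx=mx.cpu(), x=mx.nd.array([1., 2.]), y=mx.nd.array([3., 4.]))[0].asnumpy()
# expected: array([4., 6.], dtype=float32)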
Now let us discuss in detail the classes, functions, and parameters of MXNet's Symbol API.
Classes
The following table lists the classes of MXNet's Symbol API −

| Class | Definition |
| --- | --- |
| Symbol(handle) | This class, namely Symbol, is the symbolic graph of Apache MXNet. |
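As a quick, hedged sketch of working with the Symbol class itself (the exact auto-generated output name may vary between sessions) −

z = mx.sym.Variable('x') + mx.sym.Variable('y')
z.list_arguments()   # ['x', 'y'], the input placeholders of the graph
z.list_outputs()     # something like ['_plus0_output']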
Functions and their parameters
Following are some of the important functions and their parameters covered by the mxnet.symbol API −
| Function and its Parameters | Definition |
| --- | --- |
| Activation([data, act_type, out, name]) | It applies an activation function element-wise to the input. It supports the relu, sigmoid, tanh, softrelu, and softsign activation functions. |
| BatchNorm([data, gamma, beta, moving_mean, …]) | It is used for batch normalization. This function normalizes a data batch by mean and variance. It applies a scale *gamma* and an offset *beta*. |
| BilinearSampler([data, grid, cudnn_off, …]) | This function applies bilinear sampling to the input feature map. Actually it is the key of "Spatial Transformer Networks". If you are familiar with the remap function in OpenCV, the usage of this function is quite similar to that. The only difference is that it has a backward pass. |
| BlockGrad([data, out, name]) | As the name specifies, this function stops gradient computation. It basically stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. |
| cast([data, dtype, out, name]) | This function will cast all elements of the input to a new type. |
| zeros(shape[, dtype]) | This function, as the name specifies, returns a new symbol of given shape and type, filled with zeros. |
| ones(shape[, dtype]) | This function, as the name specifies, returns a new symbol of given shape and type, filled with ones. |
| full(shape, val[, dtype]) | This function, as the name specifies, returns a new array of given shape and type, filled with the given value val. |
| arange(start[, stop, step, repeat, …]) | It will return evenly spaced values within a given interval. The values are generated within the half-open interval [start, stop), which means that the interval includes start but excludes stop. |
| linspace(start, stop, num[, endpoint, name, …]) | It will return evenly spaced numbers within a specified interval. Similar to the function arange(), the values are generated within the half-open interval [start, stop), which means that the interval includes start but excludes stop. |
| histogram(a[, bins, range]) | As the name implies, this function will compute the histogram of the input data. |
| power(base, exp) | As the name implies, this function will return the element-wise result of the base element raised to powers from the exp element. Both inputs, i.e. base and exp, can be either Symbol or scalar. Note that broadcasting is not allowed here. You can use broadcast_pow if you want to use the broadcast feature. |
| SoftmaxActivation([data, mode, name, attr, out]) | This function applies softmax activation to the input. It is intended for internal layers. It is actually deprecated; we can use softmax() instead. |
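As a quick, hedged illustration of the creation routines listed above (the expected values follow from the half-open-interval definition of arange()) −

a = mx.sym.zeros((2, 3))           # symbol of shape (2, 3) filled with zeros
b = mx.sym.ones((2, 3))            # symbol of shape (2, 3) filled with ones
c = mx.sym.full((2, 3), 7)         # symbol of shape (2, 3) filled with the value 7
d = mx.sym.arange(0, 10, step=2)   # evenly spaced values within [0, 10)
d.eval(ctx=mx.cpu())[0].asnumpy()
# expected: array([0., 2., 4., 6., 8.], dtype=float32)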
Implementation Examples
In the example below, we will use the function power(), which returns the element-wise result of the base element raised to powers from the exp element −
import mxnet as mx
mx.sym.power(3, 5)
Output
You will see the following output −
243
Example
x = mx.sym.Variable('x')
y = mx.sym.Variable('y')
z = mx.sym.power(x, 3)
z.eval(x=mx.nd.array([1,2]))[0].asnumpy()
Output
This produces the following output −
array([1., 8.], dtype=float32)
Example
z = mx.sym.power(4, y)
z.eval(y=mx.nd.array([2,3]))[0].asnumpy()
Output
When you execute the above code, you should see the following output −
array([16., 64.], dtype=float32)
Example
z = mx.sym.power(x, y)
z.eval(x=mx.nd.array([4,5]), y=mx.nd.array([2,3]))[0].asnumpy()
Output
The output is mentioned below −
array([ 16., 125.], dtype=float32)
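The table above notes that power() does not broadcast. As a hedged sketch, broadcast_pow() can be used instead when the operands have broadcastable shapes −

z = mx.sym.broadcast_pow(x, y)
z.eval(x=mx.nd.array([[2.], [3.]]), y=mx.nd.array([[1., 2.]]))[0].asnumpy()
# expected: array([[2., 4.], [3., 9.]], dtype=float32)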
In the example given below, we will use the function SoftmaxActivation() (or softmax()), which is applied to the input and is intended for internal layers.
input_data = mx.nd.array([[2., 0.9, -0.5, 4., 8.], [4., -.7, 9., 2., 0.9]])
soft_max_act = mx.nd.softmax(input_data)
print (soft_max_act.asnumpy())
Output
You will see the following output −
[[2.4258138e-03 8.0748333e-04 1.9912292e-04 1.7924475e-02 9.7864312e-01]
[6.6843745e-03 6.0796250e-05 9.9204916e-01 9.0463174e-04 3.0112563e-04]]
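The snippet above uses the NDArray version for a concrete result. A minimal symbolic sketch with softmax(), the replacement named in the table, would be −

x = mx.sym.Variable('x')
y = mx.sym.softmax(x)
y.eval(x=input_data)[0].asnumpy()   # same values as the NDArray version above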
symbol.contrib
The Contrib Symbol API is defined in the symbol.contrib package. It typically provides many useful experimental APIs for new features. This API works as a place for the community to try out new features, and the feature contributors get feedback as well.
Functions and their parameters
Following are some of the important functions and their parameters covered by mxnet.symbol.contrib API −
| Function and its Parameters | Definition |
| --- | --- |
| rand_zipfian(true_classes, num_sampled, …) | This function draws random samples from an approximately Zipfian distribution. The base distribution of this function is the Zipfian distribution. This function randomly samples num_sampled candidates, and the elements of sampled_candidates are drawn from the base distribution given above. |
| foreach(body, data, init_states) | As the name implies, this function runs a loop with user-defined computation over NDArrays on dimension 0. This function simulates a for loop, and body has the computation for an iteration of the for loop. |
| while_loop(cond, func, loop_vars[, …]) | As the name implies, this function runs a while loop with user-defined computation and a loop condition. This function simulates a while loop that repeatedly performs customized computation as long as the condition is satisfied. |
| cond(pred, then_func, else_func) | As the name implies, this function runs an if-then-else using user-defined condition and computation. This function simulates an if-like branch which chooses to do one of the two customized computations according to the specified condition. |
| getnnz([data, axis, out, name]) | This function gives us the number of stored values for a sparse tensor, including explicit zeros. It only supports a CSR matrix on CPU. |
| requantize([data, min_range, max_range, …]) | This function requantizes the given data, quantized in int32 together with the corresponding thresholds, into int8 using min and max thresholds either calculated at runtime or from calibration. |
| index_copy([old_tensor, index_vector, …]) | This function copies the elements of a new_tensor into the old_tensor by selecting the indices in the order given in index. The output of this operator will be a new tensor that contains the rest of the elements of the old tensor and the copied elements of the new tensor. |
| interleaved_matmul_encdec_qk([queries, …]) | This operator computes the matrix multiplication between the projections of queries and keys in multi-head attention, used as encoder-decoder. The condition is that the input should be a tensor of projections of queries that follows the layout (seq_length, batch_size, num_heads * head_dim). |
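As a brief, hedged sketch of cond() from the table (the variables a and b and both branch functions are illustrative) −

a = mx.sym.var('a')
b = mx.sym.var('b')
pred = a * b < 5                        # user-defined condition
then_func = lambda: (a + 5) * (b + 5)   # computation if pred is true
else_func = lambda: (a - 5) * (b - 5)   # computation otherwise
outputs = mx.sym.contrib.cond(pred, then_func, else_func)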
Implementation Examples
In the example below, we will use the function rand_zipfian to draw random samples from an approximately Zipfian distribution −
import mxnet as mx
true_cls = mx.sym.Variable('true_cls')
samples, exp_count_true, exp_count_sample = mx.sym.contrib.rand_zipfian(true_cls, 5, 6)
samples.eval(true_cls=mx.nd.array([3]))[0].asnumpy()
Output
You will see the following output −
array([4, 0, 2, 1, 5], dtype=int64)
Example
exp_count_true.eval(true_cls=mx.nd.array([3]))[0].asnumpy()
Output
The output is mentioned below −
array([0.57336551])
Example
exp_count_sample.eval(true_cls=mx.nd.array([3]))[0].asnumpy()
Output
You will see the following output −
array([1.78103594, 0.46847373, 1.04183923, 0.57336551, 1.04183923])
In the example below, we will use the function while_loop to run a while loop with user-defined computation and a loop condition −
cond = lambda i, s: i <= 7
func = lambda i, s: ([i + s], [i + 1, s + i])
loop_vars = (mx.sym.var('i'), mx.sym.var('s'))
outputs, states = mx.sym.contrib.while_loop(cond, func, loop_vars, max_iterations=10)
print(outputs)
Output
The output is given below −
[<Symbol _while_loop0>]
Example
print(states)
Output
This produces the following output −
[<Symbol _while_loop0>, <Symbol _while_loop0>]
In the example below, we will use the function index_copy, which copies the elements of new_tensor into old_tensor −
import mxnet as mx
a = mx.nd.zeros((6,3))
b = mx.nd.array([[1,2,3],[4,5,6],[7,8,9]])
index = mx.nd.array([0,4,2])
mx.nd.contrib.index_copy(a, index, b)
Output
When you execute the above code, you should see the following output −
[[1. 2. 3.]
[0. 0. 0.]
[7. 8. 9.]
[0. 0. 0.]
[4. 5. 6.]
[0. 0. 0.]]
<NDArray 6x3 @cpu(0)>
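On similar lines, a minimal sketch of foreach() from the table (the body adds the running state to each slice along dimension 0 and doubles the state; the names are illustrative) −

step = lambda data, states: (data + states[0], [states[0] * 2])
data = mx.sym.var('data')
states = [mx.sym.var('state')]
outs, states = mx.sym.contrib.foreach(step, data, states)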
symbol.image
The Image Symbol API is defined in the symbol.image package. As the name implies, it is typically used for images and their features.
Functions and their parameters
Following are some of the important functions and their parameters covered by mxnet.symbol.image API −
| Function and its Parameters | Definition |
| --- | --- |
| adjust_lighting([data, alpha, out, name]) | As the name implies, this function adjusts the lighting level of the input. It follows the AlexNet style. |
| crop([data, x, y, width, height, out, name]) | With the help of this function we can crop an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. |
| normalize([data, mean, std, out, name]) | It will normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation (SD). |
| random_crop([data, xrange, yrange, width, …]) | Similar to crop(), it randomly crops an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. It will upsample the result if src is smaller than the size. |
| random_lighting([data, alpha_std, out, name]) | As the name implies, this function adds PCA noise randomly. It also follows the AlexNet style. |
| random_resized_crop([data, xrange, yrange, …]) | It also randomly crops an image NDArray of shape (H x W x C) or (N x H x W x C) to the given size. It will upsample the result if src is smaller than the size. It randomizes the area and aspect ratio as well. |
| resize([data, size, keep_ratio, interp, …]) | As the name implies, this function will resize an image NDArray of shape (H x W x C) or (N x H x W x C) to the size given by the user. |
| to_tensor([data, out, name]) | It converts an image NDArray of shape (H x W x C) or (N x H x W x C) with values in the range [0, 255] to a tensor NDArray of shape (C x H x W) or (N x C x H x W) with values in the range [0, 1]. |
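As a quick, hedged sketch of crop() from the table (img is a placeholder for an image NDArray of shape (H x W x C)) −

img = mx.sym.Variable('img')
cropped = mx.sym.image.crop(img, x=0, y=0, width=100, height=100)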
Implementation Examples
In the example below, we will use the function to_tensor to convert an image NDArray of shape (H x W x C) or (N x H x W x C), with values in the range [0, 255], to a tensor NDArray of shape (C x H x W) or (N x C x H x W), with values in the range [0, 1].
import mxnet as mx
import numpy as np
img = mx.sym.random.uniform(0, 255, (4, 2, 3)).astype(dtype=np.uint8)
mx.sym.image.to_tensor(img)
Output
The output is stated below −
<Symbol to_tensor4>
Example
img = mx.sym.random.uniform(0, 255, (2, 4, 2, 3)).astype(dtype=np.uint8)
mx.sym.image.to_tensor(img)
Output
The output is mentioned below −
<Symbol to_tensor5>
In the example below, we will use the function normalize() to normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation (SD).
img = mx.sym.random.uniform(0, 1, (3, 4, 2))
mx.sym.image.normalize(img, mean=(0, 1, 2), std=(3, 2, 1))
Output
Given below is the output of the code −
<Symbol normalize0>
Example
img = mx.sym.random.uniform(0, 1, (2, 3, 4, 2))
mx.sym.image.normalize(img, mean=(0, 1, 2), std=(3, 2, 1))
Output
The output is shown below −
<Symbol normalize1>
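Similarly, a hedged sketch of resize() from the table, applied to a randomly generated image symbol −

img = mx.sym.random.uniform(0, 255, (300, 300, 3))
mx.sym.image.resize(img, size=(100, 100))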
symbol.random
The Random Symbol API is defined in the symbol.random package. As the name implies, it is MXNet's random distribution generator Symbol API.
Functions and their parameters
Following are some of the important functions and their parameters covered by mxnet.symbol.random API −
| Function and its Parameters | Definition |
| --- | --- |
| uniform([low, high, shape, dtype, ctx, out]) | It generates random samples from a uniform distribution. |
| normal([loc, scale, shape, dtype, ctx, out]) | It generates random samples from a normal (Gaussian) distribution. |
| randn(*shape, **kwargs) | It generates random samples from a normal (Gaussian) distribution. |
| poisson([lam, shape, dtype, ctx, out]) | It generates random samples from a Poisson distribution. |
| exponential([scale, shape, dtype, ctx, out]) | It generates samples from an exponential distribution. |
| gamma([alpha, beta, shape, dtype, ctx, out]) | It generates random samples from a gamma distribution. |
| multinomial(data[, shape, get_prob, out, dtype]) | It generates concurrent sampling from multiple multinomial distributions. |
| negative_binomial([k, p, shape, dtype, ctx, out]) | It generates random samples from a negative binomial distribution. |
| generalized_negative_binomial([mu, alpha, …]) | It generates random samples from a generalized negative binomial distribution. |
| shuffle(data, **kwargs) | It shuffles the elements randomly. |
| randint(low, high[, shape, dtype, ctx, out]) | It generates random samples from a discrete uniform distribution. |
| exponential_like([data, lam, out, name]) | It generates random samples from an exponential distribution according to the input array shape. |
| gamma_like([data, alpha, beta, out, name]) | It generates random samples from a gamma distribution according to the input array shape. |
| generalized_negative_binomial_like([data, …]) | It generates random samples from a generalized negative binomial distribution according to the input array shape. |
| negative_binomial_like([data, k, p, out, name]) | It generates random samples from a negative binomial distribution according to the input array shape. |
| normal_like([data, loc, scale, out, name]) | It generates random samples from a normal (Gaussian) distribution according to the input array shape. |
| poisson_like([data, lam, out, name]) | It generates random samples from a Poisson distribution according to the input array shape. |
| uniform_like([data, low, high, out, name]) | It generates random samples from a uniform distribution according to the input array shape. |
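As a quick, hedged sketch of uniform() from the table (assuming a CPU context) −

s = mx.sym.random.uniform(-1, 1, shape=(2, 2))
s.eval(ctx=mx.cpu())[0].asnumpy()   # a 2x2 array of values drawn from [-1, 1)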
Implementation Examples
In the example below, we are going to shuffle the elements randomly using the shuffle() function. It will shuffle the array along the first axis −
data = mx.nd.array([[0, 1, 2], [3, 4, 5], [6, 7, 8],[9,10,11]])
x = mx.sym.Variable('x')
y = mx.sym.random.shuffle(x)
y.eval(x=data)
Output
You will see the following output −
[
[[ 9. 10. 11.]
[ 0. 1. 2.]
[ 6. 7. 8.]
[ 3. 4. 5.]]
<NDArray 4x3 @cpu(0)>]
Example
y.eval(x=data)
Output
When you execute the above code, you should see the following output −
[
[[ 6. 7. 8.]
[ 0. 1. 2.]
[ 3. 4. 5.]
[ 9. 10. 11.]]
<NDArray 4x3 @cpu(0)>]
In the example below, we are going to draw random samples from a generalized negative binomial distribution, using the function generalized_negative_binomial() −
mx.sym.random.generalized_negative_binomial(10, 0.1)
Output
The output is given below −
<Symbol _random_generalized_negative_binomial0>
symbol.sparse
The Sparse Symbol API is defined in the mxnet.symbol.sparse package. As the name implies, it provides sparse neural network graphs and auto-differentiation on CPU.
Functions and their parameters
Following are some of the important functions (including Symbol creation routines, Symbol manipulation routines, mathematical functions, trigonometric functions, hyperbolic functions, reduce functions, rounding, powers, and neural network functions) and their parameters covered by the mxnet.symbol.sparse API −
| Function and its Parameters | Definition |
| --- | --- |
| ElementWiseSum(*args, **kwargs) | This function will add all input arguments element-wise. For example, add_n(a1, a2, …, an) = a1 + a2 + ⋯ + an. Here, we can see that add_n is potentially more efficient than calling add n times. |
| Embedding([data, weight, input_dim, …]) | It will map the integer indices to vector representations, i.e. embeddings. It actually maps words to real-valued vectors in a high-dimensional space, which are called word embeddings. |
| LinearRegressionOutput([data, label, …]) | It computes and optimizes for squared loss during backward propagation, giving just output data during forward propagation. |
| LogisticRegressionOutput([data, label, …]) | Applies a logistic function, which is also called the sigmoid function, to the input. The function is computed as 1 / (1 + exp(−x)). |
| MAERegressionOutput([data, label, …]) | This operator computes the mean absolute error of the input. MAE is actually a risk metric corresponding to the expected value of the absolute error. |
| abs([data, name, attr, out]) | As the name implies, this function will return the element-wise absolute value of the input. |
| adagrad_update([weight, grad, history, lr, …]) | It is an update function for the AdaGrad optimizer. |
| adam_update([weight, grad, mean, var, lr, …]) | It is an update function for the Adam optimizer. |
| add_n(*args, **kwargs) | As the name implies, it will add all input arguments element-wise. |
| arccos([data, name, attr, out]) | This function will return the element-wise inverse cosine of the input array. |
| dot([lhs, rhs, transpose_a, transpose_b, …]) | As the name implies, it will give the dot product of two arrays. It depends on the input array dimension: 1-D: inner product of vectors; 2-D: matrix multiplication; N-D: a sum product over the last axis of the first input and the first axis of the second input. |
| elemwise_add([lhs, rhs, name, attr, out]) | As the name implies, it will add arguments element-wise. |
| elemwise_div([lhs, rhs, name, attr, out]) | As the name implies, it will divide arguments element-wise. |
| elemwise_mul([lhs, rhs, name, attr, out]) | As the name implies, it will multiply arguments element-wise. |
| elemwise_sub([lhs, rhs, name, attr, out]) | As the name implies, it will subtract arguments element-wise. |
| exp([data, name, attr, out]) | This function will return the element-wise exponential value of the given input. |
| sgd_update([weight, grad, lr, wd, …]) | It acts as an update function for the Stochastic Gradient Descent optimizer. |
| sigmoid([data, name, attr, out]) | As the name implies, it will compute the sigmoid of x element-wise. |
| sign([data, name, attr, out]) | It will return the element-wise sign of the given input. |
| sin([data, name, attr, out]) | As the name implies, this function will compute the element-wise sine of the given input array. |
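As a brief, hedged sketch of sigmoid() from the table (a dense input is used here for simplicity) −

x = mx.sym.Variable('x')
y = mx.sym.sparse.sigmoid(x)
y.eval(x=mx.nd.array([0.]))[0].asnumpy()   # expected: array([0.5], dtype=float32)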
Implementation Example
In the example below, we will use the Embedding() function, which maps integer indices to vector representations, i.e. word embeddings −
input_dim = 4
output_dim = 5
Example
import mxnet as mx

# Every row in the weight matrix y represents a word, so y = (w0, w1, w2, w3)
y = mx.nd.array([[ 0., 1., 2., 3., 4.],
   [ 5., 6., 7., 8., 9.],
   [10., 11., 12., 13., 14.],
   [15., 16., 17., 18., 19.]])
# The input array x represents n-grams (2-grams), so x = [(w1, w3), (w0, w2)]
x = mx.nd.array([[1., 3.],
   [0., 2.]])
# Map the input x to its vector representation in y
# (the NDArray version is used here so the result can be inspected directly)
mx.nd.Embedding(x, y, input_dim, output_dim)
# Expected result:
# [[[ 5., 6., 7., 8., 9.],
#   [15., 16., 17., 18., 19.]],
#  [[ 0., 1., 2., 3., 4.],
#   [10., 11., 12., 13., 14.]]]
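Finally, a minimal sketch of add_n() (ElementWiseSum) from the table, again with dense inputs for simplicity −

a = mx.sym.Variable('a')
b = mx.sym.Variable('b')
c = mx.sym.Variable('c')
s = mx.sym.sparse.add_n(a, b, c)
s.eval(a=mx.nd.array([1.]), b=mx.nd.array([2.]), c=mx.nd.array([3.]))[0].asnumpy()
# expected: array([6.], dtype=float32)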