Apache MXNet Quick Guide

Apache MXNet - NDArray

In this chapter, we will discuss MXNet's multi-dimensional array format, called NDArray.

Handling data with NDArray

First, we are going to see how we can handle data with NDArray. Following are the prerequisites for the same −

Prerequisites

To understand how we can handle data with this multi-dimensional array format, we need to fulfil the following prerequisites:

  1. MXNet installed in a Python environment (a quick check is sketched below)

  2. Python 2.7.x or Python 3.x
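
To confirm the first prerequisite, you can print the installed version from Python. This is a minimal sanity check, not part of the tutorial code itself:

import mxnet as mx
print(mx.__version__)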

Implementation Example

Let us understand the basic functionality with the help of an example given below −

First, we need to import MXNet and ndarray from MXNet as follows −

import mxnet as mx
from mxnet import nd

Once we have imported the necessary libraries, we will go through the following basic functionalities:

A simple 1-D array from a Python list

Example

x = nd.array([1,2,3,4,5,6,7,8,9,10])
print(x)

Output

The output is as mentioned below −

[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
<NDArray 10 @cpu(0)>
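
Note that NDArray stores values as 32-bit floats by default, which is why the integers above print as 1., 2., and so on. If you need another type, nd.array() accepts a dtype argument, as in this brief sketch:

x_int = nd.array([1, 2, 3], dtype='int32')
print(x_int.dtype)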

A 2-D array from a Python list

Example

y = nd.array([[1,2,3,4,5,6,7,8,9,10], [1,2,3,4,5,6,7,8,9,10], [1,2,3,4,5,6,7,8,9,10]])
print(y)

Output

The output is as stated below −

[[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]]
<NDArray 3x10 @cpu(0)>

Creating an NDArray without any initialisation

Here, we will create a matrix with 3 rows and 4 columns by using the .empty function. We will also use the .full function, which takes an additional argument specifying the value to fill the array with. Note that .empty does not initialise the entries, so the first matrix printed below contains whatever values happened to be in memory.

Example

x = nd.empty((3, 4))
print(x)
x = nd.full((3,4), 8)
print(x)

Output

The output is given below −

[[0.000e+00 0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 0.000e+00 2.887e-42 0.000e+00]
 [0.000e+00 0.000e+00 0.000e+00 0.000e+00]]
<NDArray 3x4 @cpu(0)>

[[8. 8. 8. 8.]
 [8. 8. 8. 8.]
 [8. 8. 8. 8.]]
<NDArray 3x4 @cpu(0)>

Matrix of all zeros with the .zeros function

Example

x = nd.zeros((3, 8))
print(x)

Output

The output is as follows −

[[0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]]
<NDArray 3x8 @cpu(0)>

Matrix of all ones with the .ones function

Example

x = nd.ones((3, 8))
print(x)

Output

The output is mentioned below −

[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]]
<NDArray 3x8 @cpu(0)>

Creating an array whose values are sampled randomly

Example

y = nd.random_normal(0, 1, shape=(3, 4))
print(y)

Output

The output is given below −

[[ 1.2673576 -2.0345826 -0.32537818 -1.4583491 ]
 [-0.11176403 1.3606371 -0.7889914 -0.17639421]
 [-0.2532185 -0.42614475 -0.12548696 1.4022992 ]]
<NDArray 3x4 @cpu(0)>
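
Since the values are sampled randomly, your output will differ from the one shown above. If you need reproducible draws, you can fix the random seed first, as in this minimal sketch:

mx.random.seed(42)
y = nd.random_normal(0, 1, shape=(3, 4))
print(y)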

Finding the dimensions of an NDArray

Example

y.shape

Output

The output is as follows −

(3, 4)

Finding the size of an NDArray

Example

y.size

Output

12

Finding the datatype of an NDArray

Example

y.dtype

Output

numpy.float32
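
An NDArray also records the device it lives on, exposed through its .context attribute, so all of these properties can be inspected together:

print(y.shape, y.size, y.dtype, y.context)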

NDArray Operations

In this section, we will introduce you to MXNet's array operations. NDArray supports a large number of standard mathematical operations as well as in-place operations.

Standard Mathematical Operations

Following are standard mathematical operations supported by NDArray −

Element-wise addition

First, we need to import MXNet and ndarray from MXNet as follows:

import mxnet as mx
from mxnet import nd
x = nd.ones((3, 5))
y = nd.random_normal(0, 1, shape=(3, 5))
print('x=', x)
print('y=', y)
x = x + y
print('x = x + y, x=', x)

Output

The output is given herewith −

x=
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
<NDArray 3x5 @cpu(0)>
y=
[[-1.0554522 -1.3118273 -0.14674698 0.641493 -0.73820823]
[ 2.031364 0.5932667 0.10228804 1.179526 -0.5444829 ]
[-0.34249446 1.1086396 1.2756858 -1.8332436 -0.5289873 ]]
<NDArray 3x5 @cpu(0)>
x = x + y, x=
[[-0.05545223 -0.3118273 0.853253 1.6414931 0.26179177]
[ 3.031364 1.5932667 1.102288 2.1795259 0.4555171 ]
[ 0.6575055 2.1086397 2.2756858 -0.8332436 0.4710127 ]]
<NDArray 3x5 @cpu(0)>

Element-wise multiplication

Example

x = nd.array([1, 2, 3, 4])
y = nd.array([2, 2, 2, 1])
x * y

Output

You will see the following output −

[2. 4. 6. 4.]
<NDArray 4 @cpu(0)>

Exponentiation

Example

nd.exp(x)

Output

When you run the code, you will see the following output:

[ 2.7182817 7.389056 20.085537 54.59815 ]
<NDArray 4 @cpu(0)>

Matrix transpose to compute matrix-matrix product

Example

nd.dot(x, y.T)

Output

Given below is the output of the code −

[16.]
<NDArray 1 @cpu(0)>
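
Because x and y above are 1-D arrays, nd.dot(x, y.T) reduces to the inner product 1*2 + 2*2 + 3*2 + 4*1 = 16. With 2-D inputs the same pattern yields a true matrix-matrix product, as in this short sketch:

A = nd.arange(6).reshape((2, 3))
B = nd.arange(6).reshape((2, 3))
print(nd.dot(A, B.T))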

In-place Operations

Every time we ran an operation in the above examples, we allocated new memory to host its result.

For example, if we write A = A+B, we will dereference the matrix that A used to point to and instead point A at the newly allocated memory. Let us understand this with the example given below, using Python's id() function −

print('y=', y)
print('id(y):', id(y))
y = y + x
print('after y=y+x, y=', y)
print('id(y):', id(y))

Output

Upon execution, you will receive the following output −

y=
[2. 2. 2. 1.]
<NDArray 4 @cpu(0)>
id(y): 2438905634376
after y=y+x, y=
[3. 4. 5. 5.]
<NDArray 4 @cpu(0)>
id(y): 2438905685664

In fact, we can also assign the result to a previously allocated array as follows −

print('x=', x)
z = nd.zeros_like(x)
print('z is zeros_like x, z=', z)
print('id(z):', id(z))
print('y=', y)
z[:] = x + y
print('z[:] = x + y, z=', z)
print('id(z) is the same as before:', id(z))

Output

The output is shown below −

x=
[1. 2. 3. 4.]
<NDArray 4 @cpu(0)>
z is zeros_like x, z=
[0. 0. 0. 0.]
<NDArray 4 @cpu(0)>
id(z): 2438905790760
y=
[3. 4. 5. 5.]
<NDArray 4 @cpu(0)>
z[:] = x + y, z=
[4. 6. 8. 9.]
<NDArray 4 @cpu(0)>
id(z) is the same as before: 2438905790760

From the above output, we can see that x+y still allocates a temporary buffer to store the result before copying it to z. We can, however, perform operations fully in place to make better use of memory and avoid the temporary buffer. To do this, we specify the out keyword argument, which every operator supports, as follows −

print('x=', x, 'is in id(x):', id(x))
print('y=', y, 'is in id(y):', id(y))
print('z=', z, 'is in id(z):', id(z))
nd.elemwise_add(x, y, out=z)
print('after nd.elemwise_add(x, y, out=z), x=', x, 'is in id(x):', id(x))
print('after nd.elemwise_add(x, y, out=z), y=', y, 'is in id(y):', id(y))
print('after nd.elemwise_add(x, y, out=z), z=', z, 'is in id(z):', id(z))

Output

On executing the above program, you will get the following result −

x=
[1. 2. 3. 4.]
<NDArray 4 @cpu(0)> is in id(x): 2438905791152
y=
[3. 4. 5. 5.]
<NDArray 4 @cpu(0)> is in id(y): 2438905685664
z=
[4. 6. 8. 9.]
<NDArray 4 @cpu(0)> is in id(z): 2438905790760
after nd.elemwise_add(x, y, out=z), x=
[1. 2. 3. 4.]
<NDArray 4 @cpu(0)> is in id(x): 2438905791152
after nd.elemwise_add(x, y, out=z), y=
[3. 4. 5. 5.]
<NDArray 4 @cpu(0)> is in id(y): 2438905685664
after nd.elemwise_add(x, y, out=z), z=
[4. 6. 8. 9.]
<NDArray 4 @cpu(0)> is in id(z): 2438905790760
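
If we do not plan to reuse the old value of x, the update can also be written more compactly: in MXNet the augmented operator x += y is executed in place, so the id of x stays the same. A small sketch using the same id() technique as above:

print('id(x):', id(x))
x += y
print('after x += y, id(x) is unchanged:', id(x))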

NDArray Contexts

In Apache MXNet, each array has a context: one context could be the CPU, while other contexts might be one of several GPUs. Things can get even worse when we deploy the work across multiple servers. That is why we need to assign arrays to contexts intelligently, as this minimises the time spent transferring data between devices.

For example, try initialising an array as follows −

import mxnet as mx
from mxnet import nd
z = nd.ones(shape=(3,3), ctx=mx.cpu(0))
print(z)

Output

When you execute the above code, you should see the following output −

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
<NDArray 3x3 @cpu(0)>

We can copy the given NDArray from one context to another context by using the copyto() method as follows −

x_gpu = x.copyto(mx.gpu(0))
print(x_gpu)
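
Keep in mind that this copy only works with a CUDA-enabled build of MXNet and at least one GPU. A defensive variant picks the target context at runtime; the sketch below assumes mx.context.num_gpus(), which is available in recent MXNet releases:

target = mx.gpu(0) if mx.context.num_gpus() > 0 else mx.cpu(0)
x_ctx = x.copyto(target)
print(x_ctx.context)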

NumPy array vs. NDArray

We are all familiar with NumPy arrays, but Apache MXNet offers its own array implementation, named NDArray. Actually, it was initially designed to be similar to NumPy, but there is a key difference −

The key difference is in the way calculations are executed in NumPy and NDArray. Every NDArray manipulation in MXNet is done in an asynchronous and non-blocking way, which means that when we write code like c = a * b, the function is pushed to the Execution Engine, which starts the calculation.

Here, a and b are both NDArrays. The benefit of this approach is that the function returns immediately, and the user thread can continue execution even though the calculation may not have completed yet.

Working of Execution Engine

If we talk about the working of the execution engine, it builds a computation graph. The computation graph may reorder or combine some calculations, but it always honors dependency order.

For example, if other manipulations of 'X' are performed later in the code, the Execution Engine will start executing them once the result of 'X' is available. The Execution Engine handles some important work for the user, such as writing callbacks to start the execution of subsequent code.

In Apache MXNet, with the help of NDArray, we only need to access the resulting variable to get the result of a computation; the flow of the code will be blocked until the computation results are assigned to that variable. In this way, NDArray increases code performance while still supporting the imperative programming mode.
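
A small experiment makes this behaviour visible: the call that launches a heavy computation returns almost immediately, while wait_to_read() (like .asnumpy()) blocks until the result actually exists. This is only a sketch; the exact timings depend on your machine:

import time
a = nd.random_normal(0, 1, shape=(2000, 2000))
b = nd.random_normal(0, 1, shape=(2000, 2000))
start = time.time()
c = nd.dot(a, b)            # returns as soon as the operation is queued
print('nd.dot returned after {:.4f} s'.format(time.time() - start))
start = time.time()
c.wait_to_read()            # blocks until the product is computed
print('result ready after {:.4f} s'.format(time.time() - start))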

Converting NDArray to NumPy Array

Let us learn how we can convert an NDArray to a NumPy array in MXNet.
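
The basic round trip is simple: the .asnumpy() method copies an NDArray into a NumPy array (blocking until the value is ready), and nd.array() converts a NumPy array back:

x = nd.ones((2, 3))
a = x.asnumpy()     # NDArray -> numpy.ndarray (blocks until ready)
print(type(a))
b = nd.array(a)     # numpy.ndarray -> NDArray
print(type(b))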

Combining higher-level operators with the help of a few lower-level operators

Sometimes, we can assemble a higher-level operator by combining existing operators. One of the best examples of this is the np.full_like() operator, which does not exist in the NDArray API. It can easily be replaced with a combination of existing operators, as follows:

from mxnet import nd
import numpy as np
np_x = np.full_like(a=np.arange(7, dtype=int), fill_value=15)
nd_x = nd.ones(shape=(7,)) * 15
np.array_equal(np_x, nd_x.asnumpy())

Output

We will get output similar to the following −

True

Finding similar operators with different names and/or signatures

Among all the operators, some have slightly different names but are similar in terms of functionality; an example of this is the nd.ravel_index() and np.ravel() functions. In the same way, some operators may have similar names but different signatures; an example is np.split() and nd.split(), which look alike but take different arguments.

Let’s understand it with the following programming example:

def pad_array123(data, max_length):
   # nd.pad only supports 4-D and 5-D input, so expand the 1-D array first
   data_expanded = data.reshape(1, 1, 1, data.shape[0])
   data_padded = nd.pad(data_expanded,
                        mode='constant',
                        pad_width=[0, 0, 0, 0, 0, 0, 0, max_length - data.shape[0]],
                        constant_value=0)
   # collapse back to a 1-D array of the padded length
   data_reshaped_back = data_padded.reshape(max_length)
   return data_reshaped_back

pad_array123(nd.array([1, 2, 3]), max_length=10)

Output

The output is stated below −

[1. 2. 3. 0. 0. 0. 0. 0. 0. 0.]
<NDArray 10 @cpu(0)>
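
As another illustration of differing signatures, np.split() takes the number of sections (or a list of split indices) as its second positional argument, while nd.split() expects a num_outputs keyword. A short comparison sketch:

import numpy as np
a = np.arange(6)
print(np.split(a, 3))                # three NumPy arrays of length 2
b = nd.arange(6)
print(nd.split(b, num_outputs=3))    # three NDArrays of length 2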

Minimising impact of blocking calls

In some cases, we have to use either the .asnumpy() or the .asscalar() method, but this forces MXNet to block execution until the result can be retrieved. We can minimise the impact of a blocking call by invoking .asnumpy() or .asscalar() at the moment when we think the calculation of that value is already done.

Example

from __future__ import print_function
import mxnet as mx
from mxnet import gluon, nd, autograd
from mxnet.ndarray import NDArray
from mxnet.gluon import HybridBlock
import numpy as np

class LossBuffer(object):
   """
   Simple buffer for storing loss value
   """

   def __init__(self):
      self._loss = None

   def new_loss(self, loss):
      # store the new loss and hand back the previous one
      ret = self._loss
      self._loss = loss
      return ret

   @property
   def loss(self):
      return self._loss

net = gluon.nn.Dense(10)
ce = gluon.loss.SoftmaxCELoss()
net.initialize()
data = nd.random.uniform(shape=(1024, 100))
label = nd.array(np.random.randint(0, 10, (1024,)), dtype='int32')
train_dataset = gluon.data.ArrayDataset(data, label)
train_data = gluon.data.DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=2)
trainer = gluon.Trainer(net.collect_params(), optimizer='sgd')
loss_buffer = LossBuffer()
for data, label in train_data:
   with autograd.record():
      out = net(data)
      # This call saves new loss and returns previous loss
      prev_loss = loss_buffer.new_loss(ce(out, label))
   loss_buffer.loss.backward()
   trainer.step(data.shape[0])
   if prev_loss is not None:
      print("Loss: {}".format(np.mean(prev_loss.asnumpy())))

Output

The output is cited below:

Loss: 2.3373236656188965
Loss: 2.3656985759735107
Loss: 2.3613128662109375
Loss: 2.3197104930877686
Loss: 2.3054862022399902
Loss: 2.329197406768799
Loss: 2.318927526473999