Deep Learning with Keras - Quick Guide

Deep Learning with Keras - Introduction

Deep Learning has become a buzzword in the field of Artificial Intelligence (AI) in recent years. For many years, we have used Machine Learning (ML) to impart intelligence to machines. In recent years, deep learning has become more popular because of its superior prediction performance compared to traditional ML techniques.

Deep Learning essentially means training an Artificial Neural Network (ANN) with a huge amount of data. In deep learning, the network learns by itself and thus requires humongous data for learning. Traditional machine learning, in contrast, is essentially a set of algorithms that parse data and learn from it; they then use this learning to make intelligent decisions.

Now, coming to Keras, it is a high-level neural networks API that runs on top of TensorFlow - an end-to-end open source machine learning platform. Using Keras, you can easily define complex ANN architectures to experiment with on your big data. Keras also supports GPUs, which become essential for processing huge amounts of data and developing machine learning models.

In this tutorial, you will learn to use Keras for building deep neural networks. We shall work through a practical example for teaching purposes. The problem at hand is recognizing handwritten digits using a neural network that is trained with deep learning.

Just to get you more excited about deep learning, below is a screenshot of Google Trends for deep learning −

screenshot google trends

As you can see from the diagram, interest in deep learning has been growing steadily over the last several years. Deep learning has been applied successfully in many areas, such as computer vision, natural language processing, speech recognition, bioinformatics, drug design, and so on. This tutorial will get you quickly started on deep learning.

So keep reading!

Deep Learning with Keras - Deep Learning

As said in the introduction, deep learning is a process of training an artificial neural network with a huge amount of data. Once trained, the network will be able to give us predictions on unseen data. Before going further into what deep learning is, let us quickly go through some terms used in training a neural network.

Neural Networks

The idea of an artificial neural network is derived from the neural networks in our brain. A typical neural network consists of three layers - an input layer, a hidden layer and an output layer - as shown in the picture below.

neural networks

This is also called a shallow neural network, as it contains only one hidden layer. You can add more hidden layers to the above architecture to create a more complex one.

Deep Networks

The following diagram shows a deep network consisting of four hidden layers, an input layer and an output layer.

deep networks

As more hidden layers are added to the network, its training becomes more complex in terms of the resources required and the time it takes to fully train the network.

Network Training

After you define the network architecture, you train it for making certain kinds of predictions. Training a network is the process of finding the proper weights for each link in the network. During training, the data flows from the input layer to the output layer through the various hidden layers. As the data always moves in one direction from input to output, we call this network a feed-forward network and we call the data propagation forward propagation.

Activation Function

At each layer, we calculate the weighted sum of the inputs and feed it to an activation function. The activation function brings nonlinearity to the network; it is a mathematical function applied to the weighted sum to produce the layer's output. Some of the most commonly used activation functions are sigmoid, hyperbolic tangent (tanh), ReLU and softmax.
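
As a rough illustration (not part of the original tutorial), the following NumPy sketch computes the weighted sum of a node's inputs and passes it through the sigmoid, ReLU and softmax activations mentioned above; the input, weight and bias values are made up purely for demonstration.

import numpy as np

x = np.array([0.5, -0.2, 0.1])              # example inputs to a node
w = np.array([0.4, 0.7, -0.3])              # example weights, one per input
b = 0.1                                     # bias term

z = np.dot(w, x) + b                        # weighted sum of the inputs

sigmoid = 1 / (1 + np.exp(-z))              # squashes z into the range (0, 1)
relu = np.maximum(0, z)                     # keeps positive values, zeroes out negatives

scores = np.array([z, 0.3, -1.2])           # softmax works on a vector of scores
softmax = np.exp(scores) / np.sum(np.exp(scores))   # probabilities that sum to 1

print(sigmoid, relu, softmax)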

Backpropagation

Backpropagation is the algorithm used for the supervised training of the network. In backpropagation, the error propagates backwards from the output to the input layer. Given an error function, we calculate the gradient of the error function with respect to the weights assigned to each connection. The calculation of the gradient proceeds backwards through the network: the gradient of the final layer of weights is calculated first, and the gradient of the first layer of weights is calculated last.

At each layer, the partial computations of the gradient are reused in the computation of the gradient for the previous layer. The weights are then adjusted in the direction that reduces the error; this update procedure is known as gradient descent.
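
To make the idea concrete, here is a minimal, hand-written sketch (not taken from the tutorial) of a single gradient descent step for one weight, assuming a simple squared-error loss; Keras performs the equivalent update for every weight in the network automatically.

# one gradient descent step for a single weight w on the loss L = (w*x - y)^2
x, y = 2.0, 4.0            # a made-up training example
w = 0.5                    # current weight
learning_rate = 0.1

prediction = w * x
error = prediction - y
gradient = 2 * error * x   # dL/dw, the value that backpropagation computes

w = w - learning_rate * gradient   # move the weight against the gradient
print(w)                           # the weight is nudged towards a value that reduces the loss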

In this project-based tutorial, you will define a feed-forward deep neural network and train it with the backpropagation and gradient descent techniques. Luckily, Keras provides high-level APIs for defining the network architecture and training it using gradient descent. Next, you will learn how to do this in Keras.

Handwritten Digit Recognition System

In this mini project, you will apply the techniques described earlier. You will create a deep learning neural network that will be trained to recognize handwritten digits. In any machine learning project, the first challenge is collecting the data. Especially for deep learning networks, you need a huge amount of data. Fortunately, for the problem that we are trying to solve, somebody has already created a dataset for training. It is called mnist and is available as a part of the Keras library. The dataset consists of 28x28-pixel images of handwritten digits. You will train your model on the major portion of this dataset, and the rest of the data will be used for validating your trained model.

Project Description

The mnist dataset consists of 70000 images of handwritten digits. A few sample images are reproduced here for your reference.

mnist dataset

Each image is 28 x 28 pixels, making a total of 784 pixels of various gray-scale levels. Most of the pixels tend towards the black shade, while only a few of them are towards white. We will put the distribution of these pixels in an array or a vector. For example, the distribution of pixels for typical images of the digits 4 and 5 is shown in the figure below.

project description

Clearly, you can see that the distribution of the pixels (especially those tending towards the white tone) differs, and this distinguishes the digits they represent. We will feed this distribution of 784 pixels to our network as its input. The output of the network will consist of 10 categories representing the digits 0 through 9.

Our network will consist of 4 layers - one input layer, one output layer and two hidden layers. Each hidden layer will contain 512 nodes. Each layer is fully connected to the next layer. When we train the network, we will be computing the weights for each connection. We train the network by applying the backpropagation and gradient descent techniques that we discussed earlier.
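
For orientation, the architecture described above can be sketched in Keras roughly as follows. This is only a preview of the model that the following chapters build up step by step (the dropout layers are omitted here), so the statements given there take precedence.

from keras.models import Sequential
from keras.layers.core import Dense, Activation

# preview of the 784 -> 512 -> 512 -> 10 architecture described above
model = Sequential()
model.add(Dense(512, input_shape=(784,)))   # first hidden layer, fully connected to the 784 inputs
model.add(Activation('relu'))
model.add(Dense(512))                       # second hidden layer
model.add(Activation('relu'))
model.add(Dense(10))                        # output layer, one node per digit
model.add(Activation('softmax'))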

Deep Learning with Keras - Setting up Project

With this background, let us now start creating the project.

Setting Up Project

We will use Jupyter through the Anaconda navigator for our project. As our project uses TensorFlow and Keras, you will need to install both in your Anaconda setup. To install TensorFlow, run the following command in your console window:

>conda install -c anaconda tensorflow

To install Keras, use the following command −

>conda install -c anaconda keras

You are now ready to start Jupyter.

Starting Jupyter

When you start the Anaconda navigator, you will see the following opening screen.

starting jupyter

Click ‘Jupyter’ to start it. The screen will show the existing projects, if any, on your drive.

Starting a New Project

Start a new Python 3 project in Anaconda by selecting the following menu option −

File | New Notebook | Python 3

The screenshot of the menu selection is shown for your quick reference −

starting new project

A new blank project will show up on your screen as shown below −

digit recognition

Change the project name to DeepLearningDigitRecognition by clicking the default name “UntitledXX” and editing it.

Deep Learning with Keras - Importing Libraries

We first import the various libraries required by the code in our project.

Array Handling and Plotting

As is typical, we use numpy for array handling and matplotlib for plotting. These libraries are imported into our project using the following import statements −

import numpy as np
import matplotlib
import matplotlib.pyplot as plot

Suppressing Warnings

As both TensorFlow and Keras are continually revised, if you do not sync their appropriate versions in the project, you will see plenty of warnings at runtime. As these distract your attention from learning, we shall suppress all warnings in this project. This is done with the following lines of code −

# silence all warnings
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'          # hide TensorFlow C++ log messages
import warnings
warnings.filterwarnings('ignore')                 # hide Python warnings
from tensorflow.python.util import deprecation
deprecation._PRINT_DEPRECATION_WARNINGS = False   # hide TensorFlow deprecation notices

Keras

We use the Keras library to import the dataset. We will use the mnist dataset of handwritten digits. We import the required package using the following statement −

from keras.datasets import mnist

We will define our deep learning neural network using Keras packages. We import the Sequential, Dense, Dropout and Activation packages for defining the network architecture. We use the load_model package for saving and retrieving our model. We also use np_utils for a few utilities that we need in our project. These imports are done with the following program statements −

from keras.models import Sequential, load_model
from keras.layers.core import Dense, Dropout, Activation
from keras.utils import np_utils

When you run this code, you will see a message on the console saying that Keras uses TensorFlow as its backend. The screenshot at this stage is shown here −

keras

Now, as we have all the imports required by our project, we will proceed to define the architecture for our Deep Learning network.

Creating Deep Learning Model

Our neural network model will consist of a linear stack of layers. To define such a model, we call the Sequential constructor −

model = Sequential()

Input Layer

We define the input layer, which is the first layer in our network, using the following program statement −

model.add(Dense(512, input_shape=(784,)))

This creates a layer with 512 nodes (neurons) that takes 784 input nodes. This is depicted in the figure below −

input layer

Note that all the input nodes are fully connected to Layer 1, that is, each input node is connected to all 512 nodes of Layer 1.
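
If you want to verify this wiring, you can optionally ask Keras for a summary at any point while building the model; with the layer defined above it should report 784 × 512 weights plus 512 biases, that is 401,920 parameters, for this layer. This check is not part of the original walkthrough.

# optional sanity check: list the layers added so far and their parameter counts
model.summary()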

Next, we need to add the activation function for the output of Layer 1. We will use ReLU as our activation. The activation function is added using the following program statement −

model.add(Activation('relu'))

Next, we add a Dropout of 20% using the statement below. Dropout is a technique used to prevent the model from overfitting.

model.add(Dropout(0.2))

At this point, our input layer is fully defined. Next, we will add a hidden layer.

Hidden Layer

Our hidden layer will consist of 512 nodes. The input to the hidden layer comes from our previously defined input layer. All the nodes are fully connected as in the earlier case. The output of the hidden layer will go to the next layer in the network, which is going to be our final and output layer. We will use the same ReLU activation as for the previous layer and a dropout of 20%. The code for adding this layer is given here −

model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))

The network at this stage can be visualized as follows −

hidden layer

Next, we will add the final layer to our network, which is the output layer. Note that you may add any number of hidden layers using code similar to what you have used here. Adding more layers makes the network more complex to train; however, it often (though not always) gives the definite advantage of better results.

Output Layer

The output layer consists of just 10 nodes, as we want to classify the given images into 10 distinct digits. We add this layer using the following statement −

model.add(Dense(10))

As we want to classify the output into 10 distinct classes, we use the softmax activation, which converts the outputs into a probability distribution over the 10 digits. We add the activation using the following statement −

model.add(Activation('softmax'))

At this point, our network can be visualized as shown in the diagram below −

output layer

At this point, our network model is fully defined in the software. Run the code cell and if there are no errors, you will get a confirmation message on the screen as shown in the screenshot below −

network model

Next, we need to compile the model.

Deep Learning with Keras - Compiling the Model

The compilation is performed with a single method call, compile.

model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

The compile method requires several parameters. The loss parameter is set to 'categorical_crossentropy'. The metrics parameter is set to 'accuracy', and finally we use the adam optimizer for training the network. The output at this stage is shown below −

compile method

Now, we are ready to feed in the data to our network.

Loading Data

As said earlier, we will use the mnist dataset provided by Keras. When we load the data into our system, we split it into training and test data. The data is loaded by calling the load_data method as follows −

(X_train, y_train), (X_test, y_test) = mnist.load_data()

The output at this stage looks like the following −

loading data

Now, we shall learn the structure of the loaded dataset.

The data provided to us consists of graphic images of size 28 x 28 pixels, each containing a single digit between 0 and 9. We will display the first ten images on the console. The code for doing so is given below −

# printing first 10 images
for i in range(10):
   plot.subplot(3,5,i+1)
   plot.tight_layout()
   plot.imshow(X_train[i], cmap='gray', interpolation='none')
   plot.title("Digit: {}".format(y_train[i]))
   plot.xticks([])
   plot.yticks([])

In an iterative loop of 10 counts, we create a subplot on each iteration and show an image from the X_train vector in it. We title each image with the corresponding label from the y_train vector. Note that the y_train vector contains the actual values for the corresponding images in the X_train vector. We remove the x and y axis markings by calling the two methods xticks and yticks with empty arguments. When you run the code, you will see the following output −

examining data points

Next, we will prepare the data for feeding it into our network.

Deep Learning with Keras - Preparing Data

Before we feed the data to our network, it must be converted into the format required by the network. This is called preparing data for the network. It generally consists of converting a multi-dimensional input to a single-dimension vector and normalizing the data points.

Reshaping Input Vector

The images in our dataset consist of 28 x 28 pixels. Each must be converted into a single-dimensional vector of size 28 * 28 = 784 for feeding it into our network. We do so by calling the reshape method on the vector.

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)

Now, our training vector will consist of 60000 data points, each a single-dimension vector of size 784. Similarly, our test vector will consist of 10000 data points, each a single-dimension vector of size 784.
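
You can optionally confirm the new shapes with a quick check (an extra step, not part of the original flow) −

# verify the reshaped arrays: (60000, 784) for training, (10000, 784) for test
print(X_train.shape)
print(X_test.shape)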

Normalizing Data

The data that the input vector currently contains has discrete values between 0 and 255 - the gray-scale levels. Normalizing these pixel values to the range between 0 and 1 helps in speeding up the training. As we are going to use stochastic gradient descent, normalizing the data will also help reduce the chance of getting stuck in local optima.

To normalize the data, we represent it as float type and divide it by 255, as shown in the following code snippet −

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

Let us now look at what the normalized data looks like.

Examining Normalized Data

To view the normalized data, we will call the histogram function as shown here −

plot.hist(X_train[0])
plot.title("Digit: {}".format(y_train[0]))

Here, we plot the histogram of the first element of the X_train vector. We also print the digit represented by this data point. The output of running the above code is shown here −

normalized data

You will notice a dense cluster of points with values close to zero. These are the black pixels in the image, which obviously form the major portion of the image. The rest of the gray-scale points, which are close to the white color, represent the digit. You may check out the distribution of pixels for another digit. The code below prints the histogram of the digit at index 2 in the training dataset.

plot.hist(X_train[2])
plot.title("Digit: {}".format(y_train[2]))

The output of running the above code is shown below −

training dataset

Comparing the above two figures, you will notice that the distribution of the white pixels in the two images differs, indicating that they represent different digits - “5” and “4” in the two pictures above.

Next, we will examine the distribution of data in our full training dataset.

Examining Data Distribution

Before we train our machine learning model on our dataset, we should know the distribution of unique digits in our dataset. Our images represent 10 distinct digits ranging from 0 to 9. We would like to know the number of occurrences of the digits 0, 1, and so on in our dataset. We can get this information by using the unique method of NumPy.

Use the following command to print the number of unique values and the number of occurrences of each −

print(np.unique(y_train, return_counts=True))

When you run the above command, you will see the following output −

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8), array([5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]))

It shows that there are 10 distinct values — 0 through 9. There are 5923 occurrences of digit 0, 6742 occurrences of digit 1, and so on. The screenshot of the output is shown here −

distinct values

As a final step in data preparation, we need to encode our data.

Encoding Data

We have ten categories in our dataset. We will thus encode our output into these ten categories using one-hot encoding. We use the to_categorical method of the Keras np_utils module to perform the encoding. After the output data is encoded, each data point is converted into a single-dimensional vector of size 10. For example, the digit 5 will now be represented as [0,0,0,0,0,1,0,0,0,0].

Encode the data using the following piece of code −

n_classes = 10
Y_train = np_utils.to_categorical(y_train, n_classes)

You may check out the result of encoding by printing the first 5 elements of the categorized Y_train vector.

Use the following code to print the first 5 vectors −

for i in range(5):
   print (Y_train[i])

You will see the following output −

[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

The first element represents digit 5, the second represents digit 0, and so on.
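
If you ever need to map a one-hot vector back to its digit, np.argmax returns the index of the 1, which is the original label; this is just an optional convenience check, not a required step.

# decode a one-hot vector back into its digit label
print(np.argmax(Y_train[0]))   # prints 5, the digit encoded by the first vector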

Finally, you will have to categorize the test data too, which is done using the following statement −

Y_test = np_utils.to_categorical(y_test, n_classes)

At this stage, your data is fully prepared for feeding into the network.

Next comes the most important part, and that is training our network model.

Deep Learning with Keras - Training the Model

The model training is done in one single method call, fit, which takes a few parameters as seen in the code below −

history = model.fit(X_train, Y_train,
   batch_size=128, epochs=20,
   verbose=2,
   validation_data=(X_test, Y_test))

The first two parameters to the fit method specify the features and the output of the training dataset.

The epochs parameter is set to 20; we assume that the training will converge in at most 20 epochs (iterations). The trained model is validated on the test data, as specified in the last parameter.

The partial output of running the above command is shown here −

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
- 9s - loss: 0.2488 - acc: 0.9252 - val_loss: 0.1059 - val_acc: 0.9665
Epoch 2/20
- 9s - loss: 0.1004 - acc: 0.9688 - val_loss: 0.0850 - val_acc: 0.9715
Epoch 3/20
- 9s - loss: 0.0723 - acc: 0.9773 - val_loss: 0.0717 - val_acc: 0.9765
Epoch 4/20
- 9s - loss: 0.0532 - acc: 0.9826 - val_loss: 0.0665 - val_acc: 0.9795
Epoch 5/20
- 9s - loss: 0.0457 - acc: 0.9856 - val_loss: 0.0695 - val_acc: 0.9792

The screenshot of the output is given below for your quick reference −

epochs

Now that the model is trained on our training data, we will evaluate its performance.

Evaluating Model Performance

To evaluate the model performance, we call the evaluate method as follows −

loss_and_metrics = model.evaluate(X_test, Y_test, verbose=2)

We will print the loss and accuracy using the following two statements −

print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])

When you run the above statements, you would see the following output −

Test Loss 0.08041584826191042
Test Accuracy 0.9837

This shows a test accuracy of 98%, which should be acceptable to us. It means that in 2% of the cases, the handwritten digits would not be classified correctly. We will also plot the accuracy and loss metrics to see how the model performs on the test data.

Plotting Accuracy Metrics

We use the history recorded during our training to get a plot of the accuracy metrics. The following code will plot the accuracy at each epoch. We pick up the training data accuracy (“acc”) and the validation data accuracy (“val_acc”) for plotting.

plot.subplot(2,1,1)
plot.plot(history.history['acc'])
plot.plot(history.history['val_acc'])
plot.title('model accuracy')
plot.ylabel('accuracy')
plot.xlabel('epoch')
plot.legend(['train', 'test'], loc='lower right')

The output plot is shown below −

plotting accuracy metrics

As you can see in the diagram, the accuracy increases rapidly in the first two epochs, indicating that the network is learning fast. Afterwards, the curve flattens, indicating that not many more epochs are required to train the model further. Generally, if the training data accuracy (“acc”) keeps improving while the validation data accuracy (“val_acc”) gets worse, you are encountering overfitting; it indicates that the model is starting to memorize the data.
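
One common way to guard against such overfitting, not used in this tutorial but worth knowing, is the Keras EarlyStopping callback, which halts training when the monitored validation metric stops improving. A minimal sketch, assuming the same model and data as above −

from keras.callbacks import EarlyStopping

# stop training when validation loss has not improved for 3 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=3)

history = model.fit(X_train, Y_train,
   batch_size=128, epochs=20,
   verbose=2,
   validation_data=(X_test, Y_test),
   callbacks=[early_stop])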

We will also plot the loss metrics to check our model’s performance.

Plotting Loss Metrics

Again, we plot the loss on both the training (“loss”) and test (“val_loss”) data. This is done using the following code −

plot.subplot(2,1,2)
plot.plot(history.history['loss'])
plot.plot(history.history['val_loss'])
plot.title('model loss')
plot.ylabel('loss')
plot.xlabel('epoch')
plot.legend(['train', 'test'], loc='upper right')

The output of this code is shown below −

plotting loss metrics

As you can see in the diagram, the loss on the training set decreases rapidly for the first two epochs. For the test set, the loss does not decrease at the same rate as the training set, but remains almost flat for multiple epochs. This means our model is generalizing well to unseen data.

Now, we will use our trained model to predict the digits in our test data.

Predicting on Test Data

Predicting the digits in unseen data is very easy. You simply need to call the predict_classes method of the model, passing it a vector consisting of your unknown data points.

predictions = model.predict_classes(X_test)

The method call returns the predictions in a vector, which can then be checked against the actual values for matches and mismatches. This is done using the following two statements −

correct_predictions = np.nonzero(predictions == y_test)[0]
incorrect_predictions = np.nonzero(predictions != y_test)[0]

Finally, we will print the count of correct and incorrect predictions using the following two program statements −

print(len(correct_predictions)," classified correctly")
print(len(incorrect_predictions)," classified incorrectly")

When you run the code, you will get the following output −

9837 classified correctly
163 classified incorrectly

Now that you have satisfactorily trained the model, we will save it for future use.

Deep Learning with Keras - Saving Model

We will save the trained model on our local drive in the models folder in our current working directory. To save the model, run the following code −

directory = "./models/"
name = 'handwrittendigitrecognition.h5'
os.makedirs(directory, exist_ok=True)   # create the models folder if it does not already exist
path = os.path.join(directory, name)
model.save(path)
print('Saved trained model at %s ' % path)

The output after running the code is shown below −

saving model

Now, as you have saved a trained model, you may use it later on for processing your unknown data.

Loading Model for Predictions

To predict on unseen data, you first need to load the trained model into memory. This is done using the following command −

model = load_model('./models/handwrittendigitrecognition.h5')

Note that we are simply loading the .h5 file into memory. This sets up the entire neural network in memory along with the weights assigned to each layer.

Now, to make predictions on unseen data, load the data (it can be one or more items) into memory. Preprocess the data to meet the input requirements of our model, just as you did for your training and test data above. After preprocessing, feed it to your network; the model will output its prediction. A rough sketch of these steps is given below.
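
As a sketch of what that preprocessing looks like, assume you have a single 28 x 28 gray-scale image in a NumPy array called new_image (a name used here only for illustration); the steps mirror what was done for the training data −

import numpy as np

# new_image: a 28 x 28 array of pixel values in the range 0-255 (illustrative placeholder)
sample = new_image.reshape(1, 784)          # flatten to one row of 784 pixels
sample = sample.astype('float32') / 255     # normalize exactly as for the training data

predicted_digit = model.predict_classes(sample)
print(predicted_digit)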

Deep Learning with Keras - Conclusion

Keras provides a high-level API for creating deep neural networks. In this tutorial, you learned to create a deep neural network that was trained to recognize digits in handwritten text. A multi-layer network was created for this purpose. Keras allows you to define an activation function of your choice at each layer. Using gradient descent, the network was trained on the training data. The accuracy of the trained network in predicting unseen data was tested on the test data. You learned to plot the accuracy and error metrics. After the network was fully trained, you saved the network model for future use.