Artificial Intelligence With Python: A Concise Tutorial

AI with Python – Deep Learning

An artificial neural network (ANN) is an efficient computing system whose central theme is borrowed from the analogy of biological neural networks. Neural networks are one type of model for machine learning. Between the mid-1980s and the early 1990s, many important architectural advances were made in neural networks. In this chapter, you will learn more about deep learning, one approach to AI.

Deep learning emerged from a decade of explosive computational growth as a serious contender in the field. It is a particular kind of machine learning whose algorithms are inspired by the structure and function of the human brain.

Machine Learning v/s Deep Learning

Deep learning is among the most powerful machine learning techniques available today. Its strength is that a deep model learns the best way to represent a problem while learning how to solve it. A comparison of deep learning (DL) and machine learning (ML) is given below −

Data Dependency

The first point of difference is how the performance of DL and ML changes as the scale of the data increases. When the amount of data is large, deep learning algorithms perform very well.

Machine Dependency

Deep learning algorithms need high-end machines to work well. Machine learning algorithms, on the other hand, can also run on low-end machines.

Feature Extraction

Deep learning algorithms can extract high-level features from the data and learn from them. In machine learning, on the other hand, an expert is required to identify most of the features.

Time of Execution

Execution time depends on the number of parameters used in an algorithm. Deep learning models have more parameters than machine learning algorithms. Hence, the execution time of DL algorithms, especially the training time, is much longer than that of ML algorithms. The testing time of DL algorithms, however, is generally shorter than that of ML algorithms.

Approach to Problem Solving

Deep learning solves the problem end-to-end, while machine learning uses the traditional way of solving a problem, i.e. breaking it down into parts.

Convolutional Neural Network (CNN)

Convolutional neural networks are similar to ordinary neural networks in that they are also made up of neurons that have learnable weights and biases. Ordinary neural networks ignore the structure of the input data: all the data is converted into a 1-D array before it is fed into the network. This suits regular data, but if the data consists of images, the process can be cumbersome.

CNNs solve this problem easily. They take the 2D structure of images into account when processing them, which allows them to extract properties specific to images. The main goal of a CNN is thus to go from the raw image data in the input layer to the correct class in the output layer. The only differences between ordinary NNs and CNNs are in the treatment of the input data and in the types of layers.

Architecture Overview of CNNs

Architecturally, an ordinary neural network receives an input and transforms it through a series of hidden layers. Every layer is connected to the next layer through its neurons. The main disadvantage of ordinary neural networks is that they do not scale well to full images.

The architecture of a CNN has neurons arranged in 3 dimensions called width, height and depth. Each neuron in the current layer is connected to a small patch of the output from the previous layer. This is similar to overlaying an N×N filter on the input image. The layer uses M filters to make sure it captures all the details. These M filters are feature extractors that extract features such as edges and corners.
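The idea of overlaying an N×N filter on an image can be sketched with plain NumPy. This is a minimal illustration, not how Keras implements it: the helper name `convolve2d_valid`, the toy image, and the hand-written edge filter are all assumptions made for the example (a trained CNN learns its filter weights). As in most deep learning libraries, the "convolution" here is a sliding weighted sum without flipping the kernel.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide a square kernel over a 2D image (no padding, stride 1)."""
    n = kernel.shape[0]                      # kernel is assumed to be N x N
    out_h = image.shape[0] - n + 1
    out_w = image.shape[1] - n + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + n, j:j + n]  # small patch of the previous layer
            feature_map[i, j] = np.sum(patch * kernel)  # weighted sum over the patch
    return feature_map

# Toy 5x5 image: dark (0) on the left, bright (1) on the right.
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Hypothetical 3x3 vertical-edge filter, written by hand for illustration.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

feature_map = convolve2d_valid(image, vertical_edge)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # each row is [0. 3. 3.]: the filter responds where the edge is
```

Using M such filters would produce M feature maps, which stack along the depth dimension.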

Layers used to construct CNNs

The following layers are used to construct CNNs −

Input Layer − It takes the raw image data as it is.
Convolutional Layer − This layer is the core building block of CNNs and does most of the computation. It computes the convolutions between the neurons and the various patches in the input.
Rectified Linear Unit Layer − It applies an activation function to the output of the previous layer. It adds non-linearity to the network so that it can generalize well to any type of function.
Pooling Layer − Pooling helps us keep only the important parts as we progress through the network. The pooling layer operates independently on every depth slice of the input and resizes it spatially. It uses the MAX function.
Fully Connected Layer/Output Layer − This layer computes the output scores in the last layer. The resulting output is of size 1×1×L, where L is the number of classes in the training dataset.
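The rectified linear unit and pooling layers described above can be sketched in a few lines of NumPy. This is only an illustration on a single hand-made 4×4 depth slice; the helper names `relu` and `max_pool_2x2` are assumptions for the example, and the pooling helper assumes the height and width are even.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become 0, the rest pass through."""
    return np.maximum(x, 0)

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block (even sizes assumed)."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

# One depth slice of a hypothetical convolution output.
feature_map = np.array([[ 1, -2,  3,  0],
                        [-1,  5, -3,  2],
                        [ 0, -4,  6, -1],
                        [ 2,  1, -2, -5]], dtype=float)

activated = relu(feature_map)
pooled = max_pool_2x2(activated)
print(pooled)  # [[5. 3.]
               #  [2. 6.]]
```

Pooling halves the width and height here, so later layers see a smaller input that still carries the strongest responses.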

Installing Useful Python Packages

You can use Keras, a high-level neural networks API written in Python and capable of running on top of TensorFlow, CNTK or Theano. It is compatible with Python 2.7-3.6. You can learn more about it at https://keras.io/.

Use the following command to install Keras −

pip install keras

In a conda environment, you can use the following command −

conda install -c conda-forge keras

Building Linear Regressor using ANN

In this section, you will learn how to build a linear regressor using artificial neural networks. You can use KerasRegressor to achieve this. In this example, we use the Boston house price dataset, which has 13 numerical attributes describing properties in Boston. The Python code is shown here −

Import all the required packages as shown −

import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold

Now, load the dataset, which is saved in a local directory.

# sep = r"\s+" splits on whitespace, equivalent to the older delim_whitespace = True
dataframe = pandas.read_csv("/Users/admin/data.csv", sep = r"\s+", header = None)
dataset = dataframe.values

Now, divide the data into input and output variables, i.e. X and Y −

X = dataset[:,0:13]
Y = dataset[:,13]
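To see what this column slicing does, here is a small sketch on a made-up array with 3 feature columns instead of 13. The array `toy` and the names `X_toy` and `Y_toy` are assumptions for illustration only.

```python
import numpy as np

# Three rows: 3 feature columns followed by 1 target column (made-up values).
toy = np.array([[1.0, 2.0, 3.0, 10.0],
                [4.0, 5.0, 6.0, 20.0],
                [7.0, 8.0, 9.0, 30.0]])

n_features = 3
X_toy = toy[:, 0:n_features]   # all rows, columns 0..2 -> 2D input matrix
Y_toy = toy[:, n_features]     # all rows, column 3     -> 1D target vector

print(X_toy.shape, Y_toy.shape)  # (3, 3) (3,)
print(Y_toy)                     # [10. 20. 30.]
```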

Since we are using a baseline neural network, define a function that creates the model, compiles it and returns it −

def baseline_model():
   # Create the model: one hidden layer with 13 units and a single output unit
   model_regressor = Sequential()
   model_regressor.add(Dense(13, input_dim = 13, kernel_initializer = 'normal',
      activation = 'relu'))
   model_regressor.add(Dense(1, kernel_initializer = 'normal'))

   # Compile the model
   model_regressor.compile(loss = 'mean_squared_error', optimizer = 'adam')
   return model_regressor
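As a rough check on the size of this baseline network, the number of trainable parameters in a fully connected layer can be counted by hand: one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_layer = dense_params(13, 13)   # Dense(13, input_dim = 13)
output_layer = dense_params(13, 1)    # Dense(1)
total = hidden_layer + output_layer

print(hidden_layer, output_layer, total)  # 182 14 196
```

A network this small trains quickly; the parameter count grows fast as layers get wider, which is one reason deep models take longer to train.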

Now, fix the random seed for reproducibility as follows −

seed = 7
numpy.random.seed(seed)
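A quick way to see what fixing the seed does is to draw random numbers twice from the same seed. This sketch only demonstrates NumPy's generator; the variable names are illustrative.

```python
import numpy as np

np.random.seed(7)
first_draw = np.random.rand(3)

np.random.seed(7)               # reset to the same seed
second_draw = np.random.rand(3)

print(np.array_equal(first_draw, second_draw))  # True
```

Note that this covers NumPy's random state only. Depending on the backend, Keras may use additional random state of its own, so seeding NumPy alone may not make training fully repeatable.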

The Keras wrapper object for use in scikit-learn as a regression estimator is called KerasRegressor. In this section, we evaluate this model on the data set with 10-fold cross-validation.

estimator = KerasRegressor(build_fn = baseline_model, epochs = 100, batch_size = 5, verbose = 0)
# shuffle = True is needed in recent scikit-learn versions when random_state is set
kfold = KFold(n_splits = 10, shuffle = True, random_state = seed)
baseline_result = cross_val_score(estimator, X, Y, cv = kfold)
print("Baseline: %.2f (%.2f) MSE" % (baseline_result.mean(), baseline_result.std()))

The output of the code shown above is an estimate of the model's performance on unseen data. It is the mean squared error, reported as the average and standard deviation across all 10 folds of the cross-validation evaluation.
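The mechanics behind this evaluation can be sketched without Keras: split the data into folds, fit on all folds but one, measure the mean squared error on the held-out fold, and summarize across folds. The sketch below is an assumption-laden stand-in: the targets are made up, a simple mean predictor replaces the neural network, and `kfold_indices` is a hypothetical helper that makes consecutive, unshuffled folds.

```python
import numpy as np

def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for consecutive, unshuffled folds."""
    folds = np.array_split(np.arange(n_samples), n_splits)
    for k in range(n_splits):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[i] for i in range(n_splits) if i != k])
        yield train_idx, test_idx

# Made-up targets; the real example uses the house prices in Y.
y = np.arange(20, dtype=float)

fold_mse = []
for train_idx, test_idx in kfold_indices(len(y), n_splits=10):
    prediction = y[train_idx].mean()          # stand-in "model": predict the training mean
    fold_mse.append(np.mean((y[test_idx] - prediction) ** 2))

# Same style of summary as the tutorial: mean and standard deviation over folds.
print("Toy baseline: %.2f (%.2f) MSE" % (np.mean(fold_mse), np.std(fold_mse)))
```

Every sample lands in exactly one test fold, so each error is measured on data the model did not see while fitting.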

Image Classifier: An Application of Deep Learning

Convolutional Neural Networks (CNNs) solve the image classification problem, that is, determining which class an input image belongs to. You can use the Keras deep learning library for this. Note that we are using the training and testing data sets of cat and dog images from the following link: https://www.kaggle.com/c/dogs-vs-cats/data.

Import the important Keras classes as shown −

The following class, called Sequential, initializes the neural network as a sequential network.

from keras.models import Sequential

The following class, called Conv2D, is used to perform the convolution operation, the first step of a CNN.

from keras.layers import Conv2D

The following class, called MaxPooling2D, is used to perform the pooling operation, the second step of a CNN.

from keras.layers import MaxPooling2D

The following class, called Flatten, performs flattening, the process of converting all the resultant 2D arrays into a single long continuous linear vector.

from keras.layers import Flatten

The following class, called Dense, is used to perform the full connection of the neural network, the fourth step of a CNN.

from keras.layers import Dense

Now, create an object of the Sequential class.

S_classifier = Sequential()

The next step is to code the convolution part.

S_classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

Here, relu is the rectifier function.

The next step of the CNN is the pooling operation on the feature maps that result from the convolution part.

S_classifier.add(MaxPooling2D(pool_size = (2, 2)))

Now, convert all the pooled images into a continuous vector by flattening −

S_classifier.add(Flatten())

Next, create a fully connected layer.

S_classifier.add(Dense(units = 128, activation = 'relu'))

Here, 128 is the number of hidden units. It is common practice to choose a power of 2 for the number of hidden units.
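It can help to trace how the data shape changes through the layers added so far. The sketch below assumes the Keras defaults for these layers (no padding and stride 1 for Conv2D, stride equal to the pool size for MaxPooling2D); the helper names are hypothetical and the arithmetic is done by hand rather than by Keras.

```python
def conv2d_output(height, width, kernel, filters):
    """Output shape of a convolution with no padding and stride 1."""
    return height - kernel + 1, width - kernel + 1, filters

def max_pool_output(height, width, channels, pool):
    """Output shape of non-overlapping pooling; leftover rows/columns are dropped."""
    return height // pool, width // pool, channels

conv_shape = conv2d_output(64, 64, kernel=3, filters=32)
pool_shape = max_pool_output(*conv_shape, pool=2)
flat_size = pool_shape[0] * pool_shape[1] * pool_shape[2]
dense_parameters = flat_size * 128 + 128   # weights plus one bias per hidden unit

print(conv_shape)        # (62, 62, 32)
print(pool_shape)        # (31, 31, 32)
print(flat_size)         # 30752
print(dense_parameters)  # 3936384
```

Under these assumptions, almost all of the network's parameters sit in the fully connected layer, which is why pooling before flattening matters.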

Now, add the output layer as follows −

S_classifier.add(Dense(units = 1, activation = 'sigmoid'))

Now, compile the CNN we have built −

S_classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

Here, the optimizer parameter selects the stochastic gradient descent algorithm, the loss parameter selects the loss function, and the metrics parameter selects the performance metric.
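To make the loss and the metric concrete, here is a minimal pure-Python sketch of binary cross-entropy and accuracy on made-up labels and sigmoid outputs. It follows the textbook formulas and is not Keras's implementation; the function names, the clipping constant `eps`, and the sample values are assumptions.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == bool(t))
    return correct / len(y_true)

labels    = [1, 0, 1, 0]            # e.g. 1 = dog, 0 = cat
confident = [0.9, 0.1, 0.8, 0.2]    # made-up sigmoid outputs
unsure    = [0.6, 0.4, 0.5, 0.5]

print(binary_crossentropy(labels, confident) < binary_crossentropy(labels, unsure))  # True
print(accuracy(labels, confident))  # 1.0
```

The loss rewards confident correct predictions, while accuracy only checks which side of the threshold each prediction falls on.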

Now, perform image augmentation and then fit the images to the neural network −

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255, shear_range = 0.2,
   zoom_range = 0.2, horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory("/Users/admin/training_set",
   target_size = (64, 64), batch_size = 32, class_mode = 'binary')

test_set = test_datagen.flow_from_directory('test_set',
   target_size = (64, 64), batch_size = 32, class_mode = 'binary')
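Two of the augmentation settings above are easy to picture with NumPy: rescaling pixel values into the 0 to 1 range and flipping an image horizontally. This is an illustrative sketch on a tiny made-up image, not what ImageDataGenerator runs internally; shear and zoom are omitted because they need interpolation.

```python
import numpy as np

# Hypothetical 2x3 RGB image with 8-bit pixel values between 0 and 255.
pixels = np.arange(18, dtype=np.uint8).reshape(2, 3, 3) * 15

rescaled = pixels * (1.0 / 255)   # same idea as rescale = 1./255
flipped = rescaled[:, ::-1, :]    # horizontal flip: reverse the width axis

print(pixels.max(), round(float(rescaled.max()), 3))  # 255 1.0
print(np.array_equal(flipped[:, 0, :], rescaled[:, -1, :]))  # True
```

Augmentation produces varied versions of the same training images, which is why it is applied to the training generator only, while the test generator just rescales.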

Now, fit the data to the model we have created −

# Newer Keras versions also accept generators directly in fit()
S_classifier.fit_generator(training_set, steps_per_epoch = 8000, epochs = 25,
   validation_data = test_set, validation_steps = 2000)

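Here, steps_per_epoch is set to the number of training images. Keras interprets this value as the number of batches drawn from the generator in each epoch, so with batch_size = 32 each epoch draws 8000 batches of augmented images.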
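If the goal is for one epoch to pass over each image roughly once, the number of steps can be derived from the image count and the batch size. The sketch below assumes 8000 training images and 2000 test images, which is an inference from the values used above rather than something the dataset link confirms; `steps_for_one_pass` is a hypothetical helper.

```python
import math

def steps_for_one_pass(n_images, batch_size):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(n_images / batch_size)

train_steps = steps_for_one_pass(8000, 32)
validation_steps = steps_for_one_pass(2000, 32)

print(train_steps, validation_steps)  # 250 63
```

Using larger values, as in the call above, is still valid with an augmenting generator; it simply means more batches, and therefore more augmented variants, per epoch.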

Now that the model has been trained, we can use it for prediction as follows −

import numpy as np
from keras.preprocessing import image

test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg',
   target_size = (64, 64))
test_image = image.img_to_array(test_image)
# Add a batch dimension: (64, 64, 3) becomes (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis = 0)
result = S_classifier.predict(test_image)

# Shows which class name maps to which label index
print(training_set.class_indices)

# The sigmoid output is a probability, so compare it against 0.5 rather than 1
if result[0][0] > 0.5:
   prediction = 'dog'
else:
   prediction = 'cat'
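The two less obvious steps in the prediction code, adding a batch dimension and turning the network output into a class name, can be tried without a trained model. In this sketch the blank image, the `class_indices` mapping, and the probabilities are all made up; the real mapping is whatever `training_set.class_indices` reports for your folders.

```python
import numpy as np

# A blank stand-in for one 64x64 RGB image.
single_image = np.zeros((64, 64, 3))
batch = np.expand_dims(single_image, axis=0)   # the model expects a batch of images
print(batch.shape)  # (1, 64, 64, 3)

# Hypothetical mapping of folder names to label indices.
class_indices = {'cats': 0, 'dogs': 1}
index_to_label = {index: label for label, index in class_indices.items()}

def label_from_probability(probability, threshold=0.5):
    """Map a sigmoid output to a class name using the label indices."""
    return index_to_label[int(probability > threshold)]

print(label_from_probability(0.93))  # dogs
print(label_from_probability(0.12))  # cats
```

Looking the label up through the mapping avoids hard-coding which class is 1, which depends on how the training folders are named.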