Keras Quick Guide

Keras - Layers

As learned earlier, Keras layers are the primary building blocks of Keras models. Each layer receives input information, performs some computation and finally outputs the transformed information. The output of one layer flows into the next layer as its input. Let us learn the complete details about layers in this chapter.

Introduction

A Keras layer requires the shape of the input (input_shape) to understand the structure of the input data, an initializer to set the weight for each input, and finally activators to transform the output to make it non-linear. In between, constraints restrict the range in which the weights of the input data are to be generated, and a regularizer tries to optimize the layer (and the model) by dynamically applying penalties on the weights during the optimization process.

To summarise, a Keras layer requires the following minimum details to create a complete layer.

  1. Shape of the input data

  2. Number of neurons / units in the layer

  3. Initializers

  4. Regularizers

  5. Constraints

  6. Activations

Let us understand the basic concepts below. Before that, let us create a simple Keras layer using the Sequential model API to get an idea of how Keras models and layers work.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers
from keras import regularizers
from keras import constraints

model = Sequential()

model.add(Dense(32, input_shape = (16,), kernel_initializer = 'he_uniform',
   kernel_regularizer = None, kernel_constraint = 'MaxNorm', activation = 'relu'))
model.add(Dense(16, activation = 'relu'))
model.add(Dense(8))

where,

  1. Lines 1-5 import the necessary modules.

  2. Line 7 creates a new model using the Sequential API.

  3. Lines 9-10 create a new Dense layer and add it into the model. Dense is an entry level layer provided by Keras, which accepts the number of neurons or units (32) as its required parameter. If the layer is the first layer, then we need to provide the input shape, (16,) as well. Otherwise, the output of the previous layer will be used as the input of the next layer. All other parameters are optional: kernel_initializer represents the initializer to be used (here the he_uniform function); kernel_regularizer represents the regularizer to be used (here None); kernel_constraint represents the constraint to be used (here the MaxNorm function); and activation represents the activation to be used (here the relu function).

  4. Line 11 creates the second Dense layer with 16 units and sets relu as the activation function.

  5. Line 12 creates the final Dense layer with 8 units.
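To verify the structure we just built, we can print a summary of the model; a minimal usage sketch:

model.summary()
# Lists the three Dense layers with output shapes
# (None, 32), (None, 16) and (None, 8), plus their parameter counts.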

Basic Concept of Layers

Let us understand the basic concept of layers as well as how Keras supports each concept.

Input shape

In machine learning, all types of input data like text, images or videos will first be converted into an array of numbers and then fed into the algorithm. The input numbers may be a single dimensional array, a two dimensional array (matrix) or a multi-dimensional array. We can specify the dimensional information using shape, a tuple of integers. For example, (4,2) represents a matrix with four rows and two columns.

>>> import numpy as np
>>> shape = (4, 2)
>>> input = np.zeros(shape)
>>> print(input)
[[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]]
>>>

Similarly, (3,4,2) is a three dimensional matrix, holding three collections of 4x2 matrices (four rows and two columns each).

>>> import numpy as np
>>> shape = (3, 4, 2)
>>> input = np.zeros(shape)
>>> print(input)
[[[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]
  [0. 0.]]]
>>>

To create the first layer of the model (the input layer of the model), the shape of the input data must be specified.

Initializers

In machine learning, a weight will be assigned to every input. The Initializers module provides different functions to set these initial weights. Some of the Keras initializer functions are as follows −

Zeros

Initializes all the weights to 0.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.Zeros()
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where, kernel_initializer represents the initializer for the kernel of the model.
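Dense layers also accept a bias_initializer argument for the bias vector (its default is 'zeros'); a minimal sketch, shown as a standalone layer line reusing my_init:

model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init, bias_initializer = 'zeros'))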

Ones

Initializes all the weights to 1.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.Ones()
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

Constant

Initializes all the weights to a constant value (say, 5) specified by the user.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.Constant(value = 0)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where, value represents the constant value.

RandomNormal

Generates the initial weights using a normal distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.RandomNormal(mean = 0.0, stddev = 0.05, seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where,

  1. mean represents the mean of the random values to generate

  2. stddev represents the standard deviation of the random values to generate

  3. seed represents the seed for the random number generator, so that the same values can be reproduced across runs
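As a rough illustration only (plain NumPy, not the Keras internals), the equivalent draw for the kernel of the Dense layer above would be:

import numpy as np

# Kernel of Dense(512) fed by 784 inputs has shape (784, 512)
weights = np.random.normal(loc = 0.0, scale = 0.05, size = (784, 512))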

RandomUniform

Generates the initial weights using a uniform distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.RandomUniform(minval = -0.05, maxval = 0.05, seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where,

  1. minval represents the lower bound of the random values to generate

  2. maxval represents the upper bound of the random values to generate

TruncatedNormal

Generates the initial weights using a truncated normal distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.TruncatedNormal(mean = 0.0, stddev = 0.05, seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

VarianceScaling

Generates the initial weights based on the input shape and output shape of the layer, along with the specified scale.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.VarianceScaling(
   scale = 1.0, mode = 'fan_in', distribution = 'normal', seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where,

  1. scale represents the scaling factor

  2. mode represents any one of the fan_in, fan_out and fan_avg values

  3. distribution represents either normal or uniform

It finds the stddev value for the normal distribution using the below formula and then finds the weights using the normal distribution,

stddev = sqrt(scale / n)

where n represents,

  1. the number of input units for mode = fan_in

  2. the number of output units for mode = fan_out

  3. the average number of input and output units for mode = fan_avg

Similarly, it finds the limit for the uniform distribution using the below formula and then finds the weights using the uniform distribution,

limit = sqrt(3 * scale / n)
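As a quick worked example (a sketch assuming the Dense(512) layer above, so fan_in = 784 and fan_out = 512), these formulas can be evaluated directly:

import math

fan_in, fan_out = 784, 512
scale = 1.0

stddev = math.sqrt(scale / fan_in)        # mode = 'fan_in', normal: ~0.0357
limit = math.sqrt(3 * scale / fan_in)     # mode = 'fan_in', uniform: ~0.0619
n_avg = (fan_in + fan_out) / 2            # n used when mode = 'fan_avg'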

lecun_normal

Generates the initial weights using the LeCun normal distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.lecun_normal(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the stddev using the below formula and then applies the normal distribution

stddev = sqrt(1 / fan_in)

where, fan_in represents the number of input units.

lecun_uniform

Generates the initial weights using the LeCun uniform distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.lecun_uniform(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the limit using the below formula and then applies the uniform distribution

limit = sqrt(3 / fan_in)

where, fan_in represents the number of input units.

glorot_normal

Generates the initial weights using the Glorot normal distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.glorot_normal(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the stddev using the below formula and then applies the normal distribution

stddev = sqrt(2 / (fan_in + fan_out))

where,

  1. fan_in represents the number of input units

  2. fan_out represents the number of output units

glorot_uniform

Generates the initial weights using the Glorot uniform distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.glorot_uniform(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the limit using the below formula and then applies the uniform distribution

limit = sqrt(6 / (fan_in + fan_out))

where,

  1. fan_in represents the number of input units

  2. fan_out represents the number of output units

he_normal

Generates the initial weights using the He normal distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.he_normal(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the stddev using the below formula and then applies the normal distribution.

stddev = sqrt(2 / fan_in)

where, fan_in represents the number of input units.

he_uniform

Generates the initial weights using the He uniform distribution.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.he_uniform(seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

It finds the limit using the below formula and then applies the uniform distribution.

limit = sqrt(6 / fan_in)

where, fan_in represents the number of input units.

Orthogonal

Generates a random orthogonal matrix.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.Orthogonal(gain = 1.0, seed = None)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

where, gain represents the multiplication factor of the matrix.

Identity

Generates the identity matrix.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import initializers

my_init = initializers.Identity(gain = 1.0)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_initializer = my_init))

Constraints

In machine learning, a constraint will be set on the parameters (weights) during the optimization phase. The Constraints module provides different functions to set the constraint on the layer. Some of the constraint functions are as follows.

NonNeg

Constrains the weights to be non-negative.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import constraints

my_constrain = constraints.NonNeg()
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_constraint = my_constrain))

where, kernel_constraint represents the constraint to be used in the layer.
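Conceptually (a plain NumPy sketch of the idea, not the Keras internals), NonNeg zeroes out any weight that goes negative:

import numpy as np

w = np.array([0.5, -0.3, 1.2, -0.7])
w = w * (w >= 0)   # -> [0.5, 0.0, 1.2, 0.0]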

UnitNorm

Constrains the weights to have unit norm.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import constraints

my_constrain = constraints.UnitNorm(axis = 0)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_constraint = my_constrain))

MaxNorm

Constrains the weights to have a norm less than or equal to the given value.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import constraints

my_constrain = constraints.MaxNorm(max_value = 2, axis = 0)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_constraint = my_constrain))

where,

  1. max_value represents the upper bound

  2. axis represents the dimension in which the constraint is to be applied. e.g. for shape (2,3,4), axis 0 denotes the first dimension, 1 denotes the second dimension and 2 denotes the third dimension
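Conceptually (a plain NumPy sketch of the idea, not the Keras internals), MaxNorm rescales a weight vector whose norm exceeds max_value back down to that norm:

import numpy as np

w = np.array([3.0, 4.0])          # norm = 5.0
max_value = 2
norm = np.linalg.norm(w)
if norm > max_value:
   w = w * (max_value / norm)     # norm is now exactly 2.0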

MinMaxNorm

Constrains the weights to have a norm between the specified minimum and maximum values.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import constraints

my_constrain = constraints.MinMaxNorm(min_value = 0.0, max_value = 1.0, rate = 1.0, axis = 0)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_constraint = my_constrain))

where, rate represents the rate at which the weight constraint is applied; rate = 1.0 enforces the constraint strictly, while smaller values move the weights only part of the way toward the allowed range on each step.

Regularizers

In machine learning, regularizers are used in the optimization phase. They apply penalties on the layer parameters during optimization. The Keras regularization module provides the below functions to set penalties on the layer. Regularization is applied on a per-layer basis.

L1 Regularizer

It provides L1 based regularization.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import regularizers

my_regularizer = regularizers.l1(0.01)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_regularizer = my_regularizer))

where, kernel_regularizer represents the regularizer to be applied to the kernel weights of the layer, here an L1 penalty with factor 0.01.
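The term added to the loss is the L1 norm of the weights scaled by the factor,

penalty = l1 * sum(abs(weights))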

L2 Regularizer

It provides L2 based regularization.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import regularizers

my_regularizer = regularizers.l2(0.01)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_regularizer = my_regularizer))
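Here the term added to the loss is the sum of squared weights scaled by the factor,

penalty = l2 * sum(square(weights))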

L1 and L2 Regularizer

It provides both L1 and L2 based regularization.

from keras.models import Sequential
from keras.layers import Activation, Dense
from keras import regularizers

my_regularizer = regularizers.l1_l2(l1 = 0.01, l2 = 0.01)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,),
   kernel_regularizer = my_regularizer))

Activations

In machine learning, the activation function is a special function used to find whether a specific neuron is activated or not. Basically, the activation function does a nonlinear transformation of the input data and thus enables the neurons to learn better. The output of a neuron depends on the activation function.

As you recall from the concept of a single perceptron, the output of a perceptron (neuron) is simply the result of the activation function, which accepts the summation of all inputs multiplied by their corresponding weights, plus the overall bias, if any is available.

result = Activation(SUMOF(input * weight) + bias)
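A minimal NumPy sketch of this computation for a single neuron with a relu activation (all values are illustrative):

import numpy as np

inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.1, -0.2])
bias = 0.05

summation = np.dot(inputs, weights) + bias   # SUMOF(input * weight) + bias
result = np.maximum(0, summation)            # relu as the activation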

So, the activation function plays an important role in the successful learning of the model. Keras provides a lot of activation functions in the activations module. Let us learn about all the activations available in the module.

linear

Applies the linear (identity) function; the input passes through unchanged.

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'linear', input_shape = (784,)))

where, activation refers to the activation function of the layer. It can be specified simply by the name of the function and the layer will use the corresponding activator.
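Equivalently, the activation can be attached as a separate Activation layer (which is why Activation is imported above); a minimal sketch:

model = Sequential()
model.add(Dense(512, input_shape = (784,)))
model.add(Activation('linear'))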

elu

Applies the exponential linear unit (ELU).
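It is defined as (with alpha defaulting to 1.0),

elu(x) = x if x > 0, else alpha * (exp(x) - 1)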

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'elu', input_shape = (784,)))

selu

Applies the scaled exponential linear unit (SELU).
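It is defined as,

selu(x) = scale * elu(x, alpha), where scale ≈ 1.0507 and alpha ≈ 1.6733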

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'selu', input_shape = (784,)))

relu

Applies the rectified linear unit (ReLU).
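It is defined as,

relu(x) = max(0, x)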

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape = (784,)))

softmax

Applies the Softmax function, which converts a vector of values into a probability distribution.
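It is defined element-wise as,

softmax(x_i) = exp(x_i) / sum_j(exp(x_j))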

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'softmax', input_shape = (784,)))

softplus

Applies the Softplus function.
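It is defined as,

softplus(x) = log(1 + exp(x))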

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'softplus', input_shape = (784,)))

softsign

Applies the Softsign function.
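It is defined as,

softsign(x) = x / (1 + abs(x))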

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'softsign', input_shape = (784,)))

tanh

Applies the hyperbolic tangent function.
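It is defined as,

tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))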

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'tanh', input_shape = (784,)))

sigmoid

Applies the Sigmoid function.
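It is defined as,

sigmoid(x) = 1 / (1 + exp(-x))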

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'sigmoid', input_shape = (784,)))

hard_sigmoid

Applies the Hard Sigmoid function, a piecewise linear approximation of the sigmoid that is cheaper to compute.
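It is defined as,

hard_sigmoid(x) = max(0, min(1, 0.2 * x + 0.5))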

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'hard_sigmoid', input_shape = (784,)))

exponential

Applies the exponential function.
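It is defined as,

exponential(x) = exp(x)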

from keras.models import Sequential
from keras.layers import Activation, Dense

model = Sequential()
model.add(Dense(512, activation = 'exponential', input_shape = (784,)))

Besides the Dense layer used throughout this chapter, Keras provides many other layer types. The important ones are summarised below:

  1. Dense Layer − the regular deeply connected neural network layer.

  2. Dropout Layers − Dropout is one of the important concepts in machine learning.

  3. Flatten Layers − Flatten is used to flatten the input.

  4. Reshape Layers − Reshape is used to change the shape of the input.

  5. Permute Layers − Permute is also used to change the shape of the input using a pattern.

  6. RepeatVector Layers − RepeatVector is used to repeat the input a set number, n, of times.

  7. Lambda Layers − Lambda is used to transform the input data using an expression or function.

  8. Convolution Layers − Keras contains a lot of layers for creating convolution based ANNs, popularly called Convolutional Neural Networks (CNNs).

  9. Pooling Layer − used to perform max pooling operations on temporal data.

  10. Locally connected layer − similar to the Conv1D layer, but the difference is that Conv1D layer weights are shared, whereas here the weights are unshared.

  11. Merge Layer − used to merge a list of inputs.

  12. Embedding Layer − performs embedding operations in the input layer.