Gen-ai: A Concise Tutorial
Types of Generative Models
Generative models have become very popular in recent years. These algorithms, mainly used for unsupervised learning, learn the underlying distribution of the data and generate complex outputs, such as images, music, and natural language, that resemble the original training data.
Read this chapter to explore three prominent and widely used types of generative models: Generative Adversarial Networks (GANs), Autoencoders, and Variational Autoencoders (VAEs).
Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs) were introduced by Ian Goodfellow and his colleagues in 2014. A GAN is an approach to generative modeling based on a deep neural network architecture that generates new, complex outputs resembling the original training data. The GAN framework consists of two neural networks: the ‘Generator’ and the ‘Discriminator’.
Working of GANs
Let’s understand how a GAN model works with the help of the diagram given below −
As depicted in the diagram, a GAN has two main components: a generator network and a discriminator network (also called the discriminative network).
The process starts by providing the generator with a random seed/noise vector. The generator uses this input to create new, synthetic samples. These generated samples, along with real data samples, are then provided to the discriminative network.
The discriminative network then evaluates the realism of these samples, i.e., whether each sample is real or fake. Finally, the discriminator's verdict serves as feedback on the generator’s output: the resulting error is backpropagated to adjust the generator’s parameters.
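The feedback loop described above is commonly formalized as a two-player minimax game. The notation below is not taken from this chapter and is given only as a reference sketch: D(x) denotes the discriminator's estimated probability that x is real, G(z) the generator's output for noise z, p_data the data distribution, and p_z the noise distribution.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator tries to make both terms large (real samples scored near 1, generated samples near 0), while the generator tries to make the second term small by producing samples that the discriminator scores as real.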
The generator and discriminator then continue to learn and adapt to each other until the generator produces samples realistic enough to fool the discriminator.
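The alternating training loop described above can be sketched in plain Python. This is a minimal toy under stated assumptions, not a practical GAN: the data is one-dimensional, the generator is a linear map `G(z) = a*z + b`, the discriminator is a logistic unit `D(x) = sigmoid(w*x + c)`, gradients are written out by hand, and the generator uses the common non-saturating objective (maximize `log D(G(z))`). The function name `train_toy_gan`, the target distribution N(4, 0.5), and all hyperparameters are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=500, batch=16, lr=0.05, seed=0):
    """Train a 1-D toy GAN; returns generator (a, b) and discriminator (w, c)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        # Real samples from the assumed data distribution N(4, 0.5).
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        # Random noise vectors (scalars here) fed to the generator.
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += (1.0 - d) * x
            gc += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            gw -= d * x
            gc -= d
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient ascent on log D(G(z)).
        # The gradient flows back through the discriminator (factor w).
        ga = gb = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            ga += (1.0 - d) * w * z
            gb += (1.0 - d) * w
        a += lr * ga / batch
        b += lr * gb / batch

    return (a, b), (w, c)


generator, discriminator = train_toy_gan()
print("generator (a, b):", generator)
print("discriminator (w, c):", discriminator)
```

Note that the discriminator never edits the generator's weights directly; the generator update only uses the discriminator's output as a training signal. Real GANs replace both linear maps with deep networks and rely on an automatic differentiation library.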
Application of GANs
Generative Adversarial Networks (GANs) find applications in various domains. Text-to-image models such as OpenAI's DALL-E are often mentioned in this context, although the original DALL-E is built on a discrete variational autoencoder and a transformer rather than on adversarial training.
Some applications of GANs include the following −
- Image Generation
- Data Augmentation
- Text-to-Image Synthesis
- Video Generation and Prediction
- Anomaly Detection
- Face Aging and Rejuvenation
- Style Transfer and Image Editing
Autoencoders
Another widely used type of generative model is the autoencoder, which has had a major impact on domains ranging from computer vision to natural language processing.
An autoencoder is an Artificial Neural Network (ANN) designed to learn data encodings in an unsupervised manner. Traditional neural networks, used for supervised learning tasks such as classification and regression, map input data to corresponding output labels. Autoencoders, on the other hand, learn to reconstruct their input: they encode high-dimensional input data into a lower-dimensional representation and then decode that representation back into the original input space.
The Architecture of Autoencoders
The architecture of autoencoders consists of three main parts −
- Encoder − It compresses the information into a dense encoding by mapping the input data to a lower-dimensional representation.
- Bottleneck Layer (Latent Space) − In this layer, the latent space representation captures the essential features of the input data in a compressed form.
- Decoder − It decompresses the compressed representation back to the original input space by reconstructing the input. The main aim of this module is to minimize the reconstruction error.
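The three parts listed above can be illustrated with a deliberately tiny example. The sketch below assumes a linear autoencoder with a 2-dimensional input and a 1-dimensional bottleneck, trained by full-batch gradient descent on the squared reconstruction error; the function name `train_linear_autoencoder`, the initial weights, and the toy data set are hypothetical choices for illustration. Real autoencoders use nonlinear, multi-layer encoders and decoders.

```python
def train_linear_autoencoder(data, steps=500, lr=0.05):
    """Fit a 2 -> 1 -> 2 linear autoencoder with full-batch gradient descent."""
    enc = [0.5, 0.5]   # encoder weights: z = enc[0]*x[0] + enc[1]*x[1]
    dec = [0.5, 0.5]   # decoder weights: x_hat[i] = dec[i] * z
    history = []       # mean reconstruction error per step
    n = len(data)

    for _ in range(steps):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        loss = 0.0
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]        # encoder -> bottleneck
            x_hat = [dec[0] * z, dec[1] * z]         # decoder -> reconstruction
            r = [x_hat[0] - x[0], x_hat[1] - x[1]]   # reconstruction residual
            loss += r[0] ** 2 + r[1] ** 2
            # Gradient of the squared error with respect to the latent value z.
            dz = 2.0 * (r[0] * dec[0] + r[1] * dec[1])
            for i in range(2):
                g_dec[i] += 2.0 * r[i] * z
                g_enc[i] += dz * x[i]
        history.append(loss / n)
        for i in range(2):
            enc[i] -= lr * g_enc[i] / n
            dec[i] -= lr * g_dec[i] / n

    return enc, dec, history


# Toy data: 2-D points that all lie on the line x1 = 2 * x0,
# so a single latent number is enough to describe each point.
data = [(k / 10.0, 2.0 * k / 10.0) for k in range(-10, 11)]
enc, dec, history = train_linear_autoencoder(data)
print(f"mean reconstruction error: {history[0]:.4f} -> {history[-1]:.6f}")
```

Because the toy data has only one underlying degree of freedom, the 1-dimensional bottleneck can in principle represent it without loss; with richer data, the bottleneck forces the network to keep only the most useful features.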
Variational Autoencoders
Variational autoencoders (VAEs) are a class of generative models based on the autoencoder concept discussed above.
Traditional autoencoders learn a deterministic mapping between the input and its latent space representation. VAEs, on the other hand, output the parameters of a probability distribution in the latent space. This feature enables VAEs to capture the underlying probability distribution of the input data samples.
Architecture and Components of VAEs
Like autoencoders, the architecture of VAEs consists of two main components: an encoder and a decoder. Rather than using a deterministic mapping as in autoencoders, the encoder in a VAE maps each input to a probability distribution over the latent space.
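For reference, VAEs are usually trained by maximizing the evidence lower bound (ELBO). The notation is not from this chapter and is introduced here as an assumption: q_φ(z | x) is the encoder's distribution over latent vectors, p_θ(x | z) is the decoder's distribution over data, and p(z) is the prior on the latent space, typically a standard normal.

```latex
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

The first term rewards accurate reconstruction, and the second term keeps the encoder's distribution close to the prior so that sampling from the prior yields plausible outputs.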
Given below are the key components of VAEs −
- Encoder − It maps the input data samples to the parameters of a probability distribution in the latent space. For each data point, the encoder outputs a mean vector and a variance vector.
- Latent Space − It holds the probability distributions that the encoder has learned for the input samples; latent vectors are sampled from these distributions.
- Decoder − It reconstructs the data samples from vectors sampled from the latent space. The aim of the decoder is to match the input data distribution.
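The sampling step between the encoder and the decoder can be sketched as follows. This example assumes a diagonal Gaussian latent distribution and an encoder that outputs the log-variance rather than the variance itself (a common convention, not something stated in this chapter). The helper names `reparameterize` and `kl_to_standard_normal`, and the example encoder outputs, are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)
# Hypothetical encoder outputs for one data point (2-D latent space).
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)   # latent vector passed to the decoder
print("sampled latent vector:", z)
print("KL term:", kl_to_standard_normal(mu, log_var))
```

Writing the sample as `mu + sigma * eps` keeps the randomness in `eps`, so gradients can flow through `mu` and `log_var` during training. The KL term is the regularizer that pulls each encoded distribution toward the standard normal prior.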
Application of Variational Autoencoders (VAEs)
Like autoencoders, variational autoencoders (VAEs) are applied across various domains. Some of these applications are listed below −
- Image Generation
- Data Visualization
- Feature Learning
- Anomaly Detection
- Natural Language Processing
In the subsequent chapters, we will discuss these prominent and widely used types of generative models in detail.
Conclusion
In this chapter, we presented an overview of three of the most widely used generative models, namely Generative Adversarial Networks (GANs), Autoencoders, and Variational Autoencoders (VAEs). Their distinct capabilities have contributed to the advancement of generative modeling.
GANs, with their adversarial training framework, can generate new, complex outputs that resemble the original training data. We discussed how GANs work using a framework that consists of two neural networks: the Generator and the Discriminator.
Autoencoders, on the other hand, aim to learn data encodings in an unsupervised manner. They encode high-dimensional input data into a lower-dimensional representation and then reconstruct the input by decoding that representation.
Variational Autoencoders (VAEs) introduce probabilistic latent space representations. They bridge the gap between autoencoders and probabilistic modeling by capturing the underlying probability distribution of the input data samples.
Whether it’s generating realistic images, learning meaningful representations of data, or exploring probabilistic latent space representations, GANs, autoencoders, and VAEs are shaping the future of AI-driven generative technologies.