Gen-AI Concise Tutorial

Discriminative vs Generative Models

Machine Learning (ML) and Deep Learning (DL) techniques are inspired by the human mind, in particular by how we learn from experience to make better choices in the present and the future. These are among the most dynamic and fast-changing fields of study, and although we already apply them in many ways, the possibilities remain vast.

These advances enable machines to learn from past data and to make predictions on previously unseen inputs. To extract meaningful insights from raw data, machines depend on mathematics, models/algorithms, and data processing methods. There are two ways to improve a machine's performance: increase the volume of data, or develop new and more robust algorithms.

Getting fresh data is relatively easy, since quintillions of bytes of data are generated every day. To work with data at that scale, however, we need to either build new models/algorithms or scale up existing ones. Mathematics serves as the backbone of these models/algorithms, which can be broadly divided into two groups: Discriminative Models and Generative Models.

In this chapter, we look at discriminative and generative ML models and the core differences between them.

What are Discriminative Models?

Discriminative models are ML models that, as the name suggests, concentrate on modeling the decision boundary between the classes in a dataset, typically using probability estimates and maximum likelihood. These models are mainly used for supervised learning and are also known as conditional models.

Discriminative models are not strongly affected by outliers. Although this can make them a better choice than generative models, they can still misclassify data points, which can be a significant drawback.

From a mathematical perspective, training a classifier involves estimating either:

  1. A function represented as f : X → Y, or

  2. The probability P(Y|X).

However, discriminative classifiers:

  1. Assume a particular functional form of probability P(Y|X), and

  2. Directly estimate the parameters of probability P(Y|X) from the training dataset.
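
As one concrete instance of the two steps above, consider logistic regression: it assumes a sigmoid form for P(Y|X) and chooses the parameters that maximize the conditional log-likelihood of the training labels. The symbols w, b, and the training pairs (x_i, y_i) with y_i in {0, 1} are introduced here purely for illustration; other discriminative models assume other functional forms.

```latex
P(Y = 1 \mid X = x;\, w, b) = \sigma(w^\top x + b), \qquad \sigma(t) = \frac{1}{1 + e^{-t}}

(\hat{w}, \hat{b}) = \arg\max_{w,\, b} \sum_{i=1}^{n} \Big[ y_i \log \sigma(w^\top x_i + b) + (1 - y_i) \log\big(1 - \sigma(w^\top x_i + b)\big) \Big]
```

Note that nothing in this objective models how X itself is distributed; only the conditional P(Y|X) is estimated.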

Some widely used discriminative models are discussed below:

Logistic Regression

Logistic Regression is a statistical technique used for binary classification tasks. It models the relationship between a dependent variable and one or more independent variables using the logistic function, and it produces an output between 0 and 1 that can be read as the probability of the positive class.

Logistic regression can be applied to a variety of classification problems, such as cancer detection, diabetes prediction, and spam detection.
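
The following is a minimal sketch of logistic regression with a single feature, trained by gradient ascent on the log-likelihood using only the Python standard library. The function names, the learning rate, the number of epochs, and the toy data are all assumptions made for illustration; a real project would more likely use an established library.

```python
import math

def sigmoid(t):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))

def train_logistic_regression(xs, ys, lr=0.1, epochs=2000):
    """Fit a weight w and bias b for one feature by gradient ascent
    on the conditional log-likelihood of the labels (0 or 1)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = y - sigmoid(w * x + b)  # per-example log-likelihood gradient
            grad_w += error * x
            grad_b += error
        w += lr * grad_w / n
        b += lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated P(Y = 1 | X = x)."""
    return sigmoid(w * x + b)

# Hypothetical toy data: small feature values are class 0, large ones class 1.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(xs, ys)

print(predict_proba(1.0, w, b))  # below 0.5, so class 0
print(predict_proba(4.0, w, b))  # above 0.5, so class 1
```

Thresholding the output at 0.5 turns the probability into a class label.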

Support Vector Machines

A support vector machine (SVM) is a powerful yet flexible supervised ML algorithm with applications in both regression and classification. For classification, an SVM separates the classes in an n-dimensional data space with a hyperplane that acts as the decision boundary, chosen to maximize the margin to the nearest training points; those nearest points are called the support vectors.
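
As a rough illustration, a linear SVM for two-dimensional points can be trained by sub-gradient descent on the regularized hinge loss. This is a simplified sketch, not how production SVM libraries are implemented; the function names, hyperparameters, and toy data are assumptions.

```python
def train_linear_svm(points, labels, lr=0.01, reg=0.01, epochs=2000):
    """Minimize reg/2 * ||w||^2 + mean(hinge loss) by sub-gradient descent.
    Labels must be +1 or -1."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [reg * w[0], reg * w[1]]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:  # point lies inside the margin or is misclassified
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w[0] -= lr * grad_w[0]
        w[1] -= lr * grad_w[1]
        b -= lr * grad_b
    return w, b

def svm_predict(point, w, b):
    """Return +1 or -1 depending on which side of the hyperplane the point lies."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1

# Hypothetical, linearly separable toy data.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(points, labels)

print(svm_predict((1.0, 1.5), w, b))  # expected to fall on the -1 side
print(svm_predict((6.0, 6.0), w, b))  # expected to fall on the +1 side
```

Only points with a margin below 1 contribute to the update, which reflects the idea that the boundary is determined by the points closest to it.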

K-nearest Neighbor (KNN)

KNN is a supervised ML algorithm that uses feature similarity to predict the value of a new data point. The value assigned to a new data point depends on how closely it matches the points in the training set.
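
A minimal sketch of KNN classification, assuming Euclidean distance as the similarity measure and a majority vote among the k closest training points. The function name, the choice k = 3, and the toy data are illustrative assumptions.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2, 2)))  # A
print(knn_predict(train_points, train_labels, (6, 5)))  # B
```

KNN has no training phase in the usual sense: the training set itself is the model, and all the work happens at prediction time.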

Decision trees, neural networks, conditional random fields (CRFs), and random forests are a few other examples of commonly used discriminative models.

What are Generative Models?

Generative models are ML models that, as the name suggests, aim to capture the underlying distribution of the data and generate new data comparable to the original training data. These models are mainly used for unsupervised learning and form a class of statistical models capable of generating new data instances.

The main drawback of generative models, compared to discriminative models, is that they are more sensitive to outliers.

As discussed above, from a mathematical perspective, training a classifier involves estimating either:

  1. A function represented as f : X → Y, or

  2. The probability P(Y|X).

However, generative classifiers:

  1. Assume a particular functional form for the probabilities P(Y) and P(X|Y),

  2. Directly estimate the parameters of P(X|Y) and P(Y) from the training dataset, and

  3. Calculate the posterior probability P(Y|X) using Bayes' Theorem.
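
The three steps above can be sketched with a small generative classifier. The sketch assumes a single numeric feature and a Gaussian form for P(X|Y); the function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random

def fit_generative(xs, ys):
    """Steps 1 and 2: estimate P(Y) and a Gaussian P(X|Y) for each class."""
    params = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[label] = {"prior": len(values) / len(xs), "mean": mean, "var": var}
    return params

def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior(x, params):
    """Step 3: Bayes' Theorem, P(Y|X) = P(X|Y) P(Y) / sum over all classes."""
    joint = {label: p["prior"] * gaussian_pdf(x, p["mean"], p["var"])
             for label, p in params.items()}
    evidence = sum(joint.values())
    return {label: value / evidence for label, value in joint.items()}

def sample(label, params, rng):
    """Because P(X|Y) is modeled explicitly, new X values can be drawn for a class."""
    p = params[label]
    return rng.gauss(p["mean"], math.sqrt(p["var"]))

# Hypothetical toy data with two classes.
xs = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
ys = ["low", "low", "low", "low", "high", "high", "high", "high"]
params = fit_generative(xs, ys)

post = posterior(1.0, params)
print(max(post, key=post.get))                    # low
print(sample("high", params, random.Random(0)))   # a new synthetic value near 3
```

The `sample` function is what separates this from a purely discriminative classifier: the same fitted parameters support both classification and data generation.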

Some widely used generative models are highlighted below:

Bayesian Network

A Bayesian network, also known as a Bayes network, is a probabilistic graphical model that represents the relationships between variables using a directed acyclic graph (DAG). It has applications in fields such as healthcare, finance, and natural language processing, for tasks like decision-making, risk assessment, and prediction.
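
To make the idea concrete, here is a small sketch of a three-variable Bayesian network (Rain, Sprinkler, GrassWet) with inference by enumeration. The graph structure and every probability value below are hypothetical numbers chosen for illustration.

```python
from itertools import product

# Hypothetical conditional probability tables for the DAG
#   Rain -> Sprinkler, Rain -> GrassWet, Sprinkler -> GrassWet
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}   # P(Sprinkler=True | Rain)
P_WET_GIVEN_SPRINKLER_RAIN = {                      # P(GrassWet=True | Sprinkler, Rain)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """Joint probability, factorized along the edges of the DAG."""
    p_s = P_SPRINKLER_GIVEN_RAIN[rain]
    p_w = P_WET_GIVEN_SPRINKLER_RAIN[(sprinkler, rain)]
    return (P_RAIN[rain]
            * (p_s if sprinkler else 1 - p_s)
            * (p_w if wet else 1 - p_w))

def prob_rain_given_wet():
    """P(Rain=True | GrassWet=True), summing out the Sprinkler variable."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence

print(round(prob_rain_given_wet(), 4))  # 0.3577
```

Enumeration is only practical for tiny networks like this one, since the number of terms grows exponentially with the number of variables.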

Generative Adversarial Network (GAN)

GANs are based on a deep neural network architecture with two main components: a generator and a discriminator. The generator learns to create new data instances, while the discriminator evaluates each instance and classifies it as real or fake.
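
The competition between the two components is commonly written as a two-player minimax objective. In the notation below, which is introduced here for illustration, G is the generator, D is the discriminator, p_data is the distribution of real data, and p_z is the noise distribution from which the generator's input z is drawn.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make D(x) large for real data and D(G(z)) small for generated data, while the generator tries to push D(G(z)) towards 1.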

Variational Autoencoders (VAEs)

VAEs are a type of autoencoder trained to learn a probabilistic latent representation of the input data. A VAE generates new samples that resemble the input data by sampling from the learned probability distribution. VAEs are useful for tasks such as image generation, representation learning, and anomaly detection. Well-known systems such as DALL-E 3 (images from text descriptions) and ChatGPT (human-like text responses) are also generative models, although ChatGPT is an autoregressive language model rather than a VAE.
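
Two ingredients usually found in a VAE can be sketched without any neural network: sampling a latent vector through the reparameterization trick, and the closed-form KL divergence that pulls the latent distribution towards a standard normal. The sketch assumes a diagonal Gaussian latent distribution; the function names and the example encoder outputs are hypothetical.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), for each latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, -0.5]  # hypothetical encoder outputs
z = reparameterize(mu, log_var, rng)    # latent sample passed on to a decoder
kl = kl_to_standard_normal(mu, log_var)

print(len(z))        # 2
print(round(kl, 4))  # 0.6783
```

In a full VAE, `mu` and `log_var` would come from an encoder network, `z` would be fed to a decoder network, and the KL term would be added to a reconstruction loss.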

Autoregressive models, Naïve Bayes, Markov random fields, Hidden Markov models (HMMs), and Latent Dirichlet Allocation (LDA) are a few other examples of commonly used generative models.


Difference Between Discriminative and Generative Models

Data scientists and machine learning experts need to understand the differences between these two types of models in order to choose the most suitable one for a particular task.

The table below summarizes the core differences between discriminative and generative models:

| Characteristic | Discriminative Models | Generative Models |
| --- | --- | --- |
| Objective | Focus on learning the boundary between different classes directly from the data. Their primary objective is to classify input data accurately based on the learned decision boundary. | Aim to understand the underlying data distribution and generate new data points that resemble the training data. They focus on modeling the process of data generation, allowing them to create synthetic data instances. |
| Probability Distribution | Estimate the parameters of the probability P(Y\|X) from the training dataset. | Calculate the posterior probability P(Y\|X) using Bayes' Theorem. |
| Handling Outliers | Relatively robust to outliers | Prone to outliers |
| Property | They do not possess generative properties. | They possess discriminative properties. |
| Applications | Commonly used in classification tasks, such as image recognition and sentiment analysis. | Commonly used in tasks like data generation, anomaly detection, and data augmentation, beyond traditional classification tasks. |
| Examples | Logistic regression, SVM, KNN, decision trees, neural networks, CRF, random forest | Bayesian networks, GANs, VAEs, autoregressive models, Naïve Bayes, Markov random fields, HMM, LDA |

Conclusion

Discriminative models learn the boundaries between classes, which makes them well suited to classification tasks. In contrast, generative models learn the underlying data distribution and generate new samples, which makes them suitable for tasks such as data generation and anomaly detection.

We also explained some core differences between discriminative and generative models. Understanding these differences helps data scientists and machine learning experts choose the most suitable approach for a specific task and improve the effectiveness of machine learning systems.