Gen-ai Concise Tutorial
Generative AI Models - Maximum Likelihood Estimation
Maximum Likelihood Estimation (MLE) is a statistical method that provides a principled approach to estimating the parameters of the probability distribution that best describes a given dataset. MLE assumes that the data were generated by a specified distribution. In simple terms, MLE finds the most likely values for the unknown parameters of a model, such as the mean or spread of a set of data points. It is a bit like guessing the missing numbers in a sequence so that they fit the pattern of the numbers we already know.
In the field of generative AI, MLE is widely used, especially in connection with generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). For example, when generating images of handwritten digits (0-9), we want the model to produce images that resemble those in our dataset (such as MNIST). We can achieve this by maximizing the likelihood of observing the training data given the parameters of the model.
maximize Σ log P(x | θ)
We will cover this in detail later, when we create our first GAN model in Python. Read this chapter to understand the concept of Maximum Likelihood Estimation, its role in generative modeling, its applications in generative modeling, and its Python implementation.
Understanding Maximum Likelihood Estimation (MLE)
Maximum Likelihood Estimation (MLE) is a powerful statistical method for estimating the parameters of a probability distribution from observed data. Let's look at it in more detail, starting with its mathematical foundations:
Mathematical Foundation of MLE
At the heart of MLE lies the likelihood function $\mathrm{L(\theta | x)}$. Here, $\mathrm{\theta}$ represents the parameters of the distribution, and $\mathrm{x}$ denotes the observed data.
The likelihood function quantifies how probable the observed data are for specific parameter values. Mathematically, it is the joint probability density function (PDF) or probability mass function (PMF) of the observed data, viewed as a function of the parameters:
\mathrm{L(\theta | x) \: = \: f(x | \theta)}
To keep the computation simple, we usually work with the log-likelihood function $\mathrm{l(\theta | x)}$, which is the natural logarithm of the likelihood function:
\mathrm{l(\theta | x) \: = \: \log L(\theta | x)}
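One practical reason for preferring the log-likelihood can be sketched numerically. The snippet below is only an illustration under assumed conditions: the dataset, the seed, and the helper `normal_pdf` are hypothetical and not part of the original tutorial. It multiplies 2000 Gaussian densities directly and compares that with summing their logarithms.

```python
import math
import random

random.seed(0)
# Hypothetical dataset: 2000 draws from a standard normal distribution.
data = [random.gauss(0.0, 1.0) for _ in range(2000)]

def normal_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Likelihood: product of per-point densities. Every factor is below 0.4,
# so the running product shrinks rapidly and underflows in floating point.
likelihood = 1.0
for x in data:
    likelihood *= normal_pdf(x, 0.0, 1.0)

# Log-likelihood: sum of per-point log densities, which stays finite.
log_likelihood = sum(math.log(normal_pdf(x, 0.0, 1.0)) for x in data)

print("likelihood:", likelihood)
print("log-likelihood:", log_likelihood)
```

Because the logarithm turns the product into a sum, the log-likelihood remains a usable number even when the raw likelihood is too small to represent.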
The goal of MLE is to find the parameter values $\mathrm{\hat{\theta}}$ that maximize the likelihood function $\mathrm{L(\theta | x)}$ or, equivalently, the log-likelihood function $\mathrm{l(\theta | x)}$ (the two are equivalent because the logarithm is strictly increasing):
\mathrm{\hat{\theta} \: = \: argmax_{\theta} L(\theta | x)}
Or,
\mathrm{\hat{\theta} \: = \: argmax_{\theta} l(\theta | x)}
To obtain the maximum likelihood estimate $\mathrm{\hat{\theta}}$, we differentiate the log-likelihood function $\mathrm{l(\theta | x)}$ with respect to the parameters $\mathrm{\theta}$ and set the derivatives equal to zero:
\mathrm{\frac{\partial \: l(\theta | x)}{\partial \: \theta} \: = \: 0}
Solving the above equation gives the MLE $\mathrm{\hat{\theta}}$. When no closed-form solution exists, the maximum is found numerically.
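As a worked illustration of this recipe, consider the Gaussian case. The derivation below is a sketch that assumes $n$ independent observations $x_1, \dots, x_n$ from a Gaussian with unknown mean $\mu$ and variance $\sigma^2$; these symbols are introduced here for the example only.

```latex
\begin{aligned}
l(\mu, \sigma^2 \mid x)
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 \\
\frac{\partial l}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0
  \;\Rightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \\
\frac{\partial l}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0
  \;\Rightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2
\end{aligned}
```

So, under these assumptions, the MLE of the mean is the sample mean, and the MLE of the variance is the average squared deviation from it (dividing by $n$, not $n - 1$).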
MLE in Generative Modeling
Generative modeling, as discussed earlier, involves capturing the underlying distribution of the data and generating new data comparable to the original training data. When training generative models, MLE plays a crucial role by estimating the parameters of that underlying probability distribution.
Let's see how MLE is applied in generative modeling:
Model Selection
We first need to choose a probabilistic model that can capture the underlying data distribution. Common choices include Gaussian distributions, mixture models, and neural networks.
Likelihood Function
Next, we need to define the likelihood function, which measures the probability of observing the given data. For example, for a dataset $\mathrm{D \: = \: \lbrace x_{1},x_{2},x_{3},\: \dots \: x_{n} \rbrace}$ of independent observations, the likelihood function $\mathrm{L(\theta | D)}$ depends on the model parameters $\mathrm{\theta}$ and is the product of the probabilities of observing each data point:
\mathrm{L(\theta | D) \: = \: \prod_{i=1}^{n} p(x_{i} | \theta)}
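The product above can be evaluated in log form for any candidate parameters. The following is a minimal sketch assuming a Gaussian model; the dataset `D`, the helper names `log_p` and `log_likelihood`, and the two candidate parameter settings are all hypothetical choices made for illustration.

```python
import math
import random

random.seed(1)
# Hypothetical dataset D: 500 draws from a Gaussian with mean 2 and std 1.
D = [random.gauss(2.0, 1.0) for _ in range(500)]

def log_p(x, mu, sigma):
    """Log density log p(x | theta) for a Gaussian with theta = (mu, sigma)."""
    z = (x - mu) / sigma
    return -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def log_likelihood(data, mu, sigma):
    """log L(theta | D): the product over data points becomes a sum of logs."""
    return sum(log_p(x, mu, sigma) for x in data)

ll_good = log_likelihood(D, mu=2.0, sigma=1.0)  # close to how D was generated
ll_bad = log_likelihood(D, mu=0.0, sigma=1.0)   # a poor guess for the mean

print("log-likelihood at mu=2:", ll_good)
print("log-likelihood at mu=0:", ll_bad)
```

Parameters that describe the data well receive a higher log-likelihood than parameters that do not, which is exactly what the maximization step exploits.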
Maximization
Now we need to maximize the likelihood function with respect to the model parameters $\mathrm{\theta}$. Maximization means finding the values of $\mathrm{\theta}$ that make the observed data most likely under the model.
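When no closed form is available, the maximization is typically done iteratively. The sketch below uses plain gradient ascent on the average Gaussian log-likelihood. The learning rate, the number of steps, the starting point, and the choice to optimize log sigma (which keeps sigma positive) are assumptions made for this example, not prescriptions from the tutorial.

```python
import math
import random

random.seed(2)
# Hypothetical dataset: 2000 draws from a Gaussian with mean 2 and std 1.
data = [random.gauss(2.0, 1.0) for _ in range(2000)]
n = len(data)

def gradient(mu, log_sigma):
    """Gradient of the average Gaussian log-likelihood w.r.t. (mu, log sigma)."""
    sigma = math.exp(log_sigma)
    z = [(x - mu) / sigma for x in data]
    d_mu = sum(z) / (n * sigma)
    d_log_sigma = sum(v * v for v in z) / n - 1.0
    return d_mu, d_log_sigma

mu, log_sigma = 0.0, 0.0  # arbitrary starting guess
learning_rate = 0.1       # assumed step size
for _ in range(500):
    d_mu, d_log_sigma = gradient(mu, log_sigma)
    mu += learning_rate * d_mu                # ascend, since we maximize
    log_sigma += learning_rate * d_log_sigma
sigma = math.exp(log_sigma)

# Closed-form Gaussian MLE for comparison.
mean_closed = sum(data) / n
std_closed = math.sqrt(sum((x - mean_closed) ** 2 for x in data) / n)

print("gradient ascent:", mu, sigma)
print("closed form:    ", mean_closed, std_closed)
```

For a Gaussian the iterative result should agree with the closed-form estimates; the same loop structure carries over to models, such as neural networks, where only the numerical route is available.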
Parameter Estimation
Finally, once the likelihood function has been maximized, the resulting parameter values are used as the parameter estimates of the generative model. These estimates define the learned distribution, which can then be used to generate new data points comparable to the observed data.
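The last step, generating new data from the learned distribution, can be sketched as follows. This assumes a simple Gaussian model fitted with the closed-form MLE; the variable names and sample sizes are illustrative only.

```python
import random
import statistics

random.seed(3)
# Hypothetical observed data: 5000 draws from a Gaussian with mean 2 and std 1.
observed = [random.gauss(2.0, 1.0) for _ in range(5000)]

# Parameter estimation: closed-form Gaussian MLE.
mu_hat = statistics.fmean(observed)
sigma_hat = statistics.pstdev(observed)  # population std (divides by n) is the MLE

# Generation: draw new points from the learned distribution.
generated = [random.gauss(mu_hat, sigma_hat) for _ in range(5000)]

print("estimated parameters:", mu_hat, sigma_hat)
print("generated sample stats:", statistics.fmean(generated), statistics.pstdev(generated))
```

The generated points are new values, not copies of the observed ones, yet their summary statistics are close to those of the training data.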
Applications of MLE in Generative Modeling
MLE has a wide range of applications across generative modeling. Some of the significant ones are listed below:
- Gaussian Mixture Models (GMMs): MLE is used to estimate the parameters of the Gaussian components in a GMM. These parameters make it possible to model complex data distributions with multiple modes.
- Variational Autoencoders (VAEs): In VAEs, MLE (in practice, maximizing a lower bound on the log-likelihood) is used to learn the parameters of the latent variable distribution. This allows the model to generate new data samples by sampling from the learned distribution.
- Generative Adversarial Networks (GANs): GANs do not directly optimize the likelihood function, but MLE-related ideas can be used during GAN training to guide the learning process and improve sample quality.
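To make the GMM case concrete, here is a small sketch that fits a two-component, one-dimensional mixture. It uses the expectation-maximization (EM) algorithm, a common way to maximize a mixture likelihood that the text above does not name. The dataset, the initial guesses, and the fixed 100 iterations are assumptions chosen for illustration.

```python
import math
import random

random.seed(4)
# Hypothetical bimodal dataset: two Gaussian clusters centred at -2 and 3.
data = ([random.gauss(-2.0, 0.7) for _ in range(600)]
        + [random.gauss(3.0, 1.0) for _ in range(400)])

def normal_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Initial guesses for the mixture weights, means, and standard deviations.
weights = [0.5, 0.5]
means = [min(data), max(data)]
sigmas = [1.0, 1.0]

for _ in range(100):
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
        total = sum(p)
        resp.append([pk / total for pk in p])
    # M-step: responsibility-weighted maximum likelihood updates.
    for k in range(2):
        nk = sum(r[k] for r in resp)
        weights[k] = nk / len(data)
        means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
        sigmas[k] = math.sqrt(var)

print("weights:", weights)
print("means:  ", means)
print("sigmas: ", sigmas)
```

Each M-step is the Gaussian MLE from earlier, applied with soft assignments instead of hard ones, which is how a mixture captures more than one mode.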
Implementing Maximum Likelihood Estimation using Python
We can implement MLE in Python and visualize the result with libraries such as Matplotlib. Below is a simple example that uses MLE to estimate the parameters of a Gaussian distribution from a given dataset:
Example
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Sample dataset (you can replace this with your own data)
data = np.random.normal(loc=2, scale=1, size=2000)
# Maximum Likelihood Estimation for a Gaussian distribution
def maximum_likelihood_estimation(data):
    # For a Gaussian, the MLE has a closed form: the sample mean and the
    # standard deviation computed with a divisor of n (np.std's default, ddof=0)
    mu = np.mean(data)
    sigma = np.std(data)
    return mu, sigma
# Perform Maximum Likelihood Estimation
estimated_mu, estimated_sigma = maximum_likelihood_estimation(data)
# Generate x values for plotting
x = np.linspace(min(data), max(data), 1000)
# Plot histogram of the data
plt.figure(figsize=(7.2, 5.5))
plt.hist(data, bins=30, density=True, alpha=0.6, color='blue', label='Data Histogram')
# Plot the true Gaussian distribution
plt.plot(x, norm.pdf(x, loc=2, scale=1), color='red', linestyle='--', label='True Gaussian Distribution')
# Plot the estimated Gaussian distribution using MLE
plt.plot(x, norm.pdf(x, loc=estimated_mu, scale=estimated_sigma), color='green', linestyle='-', label='Estimated Gaussian Distribution (MLE)')
plt.xlabel('Value')
plt.ylabel('Probability Density')
plt.title('Maximum Likelihood Estimation for Gaussian Distribution')
plt.legend()
plt.grid(True)
plt.show()
The above code produces a plot showing the histogram of the data, the true Gaussian distribution, and the estimated Gaussian distribution obtained using Maximum Likelihood Estimation (MLE).
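One detail of the example is worth a closer look: `np.std` with its default `ddof=0` divides by n, which matches the Gaussian MLE, whereas many statistics texts divide by n - 1. The sketch below shows the difference on a deliberately small, hypothetical sample, using the standard library's `statistics.pstdev` and `statistics.stdev` as stand-ins for the two conventions.

```python
import math
import random
import statistics

random.seed(5)
# Hypothetical small sample, where the two conventions differ visibly.
small_sample = [random.gauss(2.0, 1.0) for _ in range(10)]
n = len(small_sample)

sigma_mle = statistics.pstdev(small_sample)    # divides by n: the Gaussian MLE
sigma_sample = statistics.stdev(small_sample)  # divides by n - 1 (Bessel's correction)

print("MLE std:   ", sigma_mle)
print("sample std:", sigma_sample)
# The two differ by a fixed factor that approaches 1 as n grows.
print("ratio:", sigma_sample / sigma_mle, "expected:", math.sqrt(n / (n - 1)))
```

With 2000 points, as in the example above, the two values are nearly identical, so the choice matters mainly for small datasets.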
Conclusion
In this chapter, we emphasized the significance of MLE in generative modeling, where it serves as the backbone for learning data distributions and generating new samples.
The steps for applying MLE in generative modeling are model selection, defining the likelihood function, maximization, and parameter estimation.