Seaborn: A Concise Tutorial
Seaborn - Plotting Categorical Data
In the previous chapters, we learned about scatter plots, hexbin plots, and KDE plots, which are used to analyze continuous variables. These plots are not suitable when the variable under study is categorical.
When one or both of the variables under study are categorical, we use plots such as stripplot() and swarmplot(). Seaborn provides an interface for drawing them.
Categorical Scatter Plots
In this section, we will learn about categorical scatter plots.
stripplot()
stripplot() is used when one of the variables under study is categorical. It lays the data out along one axis, with one position per category, in category order.
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.stripplot(x = "species", y = "petal_length", data = df)
plt.show()
Output

In the above plot, we can clearly see how petal_length differs between species. However, the major problem with this scatter plot is that the points overlap. We use the jitter parameter to handle this. (Recent Seaborn releases enable jitter by default, so the points may already appear spread out.)
Jitter adds a small amount of random noise to the data. The parameter adjusts the positions of the points along the categorical axis only, so the measured values are unchanged.
Example
import pandas as pd
import seaborn as sb
from matplotlib import pyplot as plt
df = sb.load_dataset('iris')
sb.stripplot(x = "species", y = "petal_length", data = df, jitter = True)
plt.show()
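To make the idea of jitter concrete, here is a minimal sketch in plain Python of what adding noise along the categorical axis amounts to. It is not Seaborn's actual implementation. The helper name jitter_positions, the width parameter, and the choice of uniform noise are assumptions made for illustration.

```python
import random

def jitter_positions(categories, width=0.1, seed=0):
    """Hypothetical helper: return one x position per observation.

    Each category gets an integer slot (0, 1, 2, ...) in order of first
    appearance, and every observation is shifted by uniform noise in
    [-width, width]. Only the categorical axis is perturbed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    slots = {}
    for c in categories:
        slots.setdefault(c, len(slots))
    return [slots[c] + rng.uniform(-width, width) for c in categories]

species = ["setosa", "setosa", "versicolor", "virginica", "virginica"]
xs = jitter_positions(species, width=0.1)
for name, x in zip(species, xs):
    print(f"{name:<12} x = {x:.3f}")
```

With width set to 0, every point in a category lands on the same x position, which is the overlapping case described above. A larger width spreads the points further apart without changing their y values.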
swarmplot()
The swarmplot() function is an alternative to jitter. It positions each point of the scatter plot along the categorical axis so that the points do not overlap.