Plotly Concise Tutorial

Plotly - Box Plot, Violin Plot and Contour Plot

This chapter explains several plot types in detail: the box plot, violin plot, contour plot and quiver plot. We begin with the box plot.

Box Plot

A box plot displays a summary of a set of data: the minimum, first quartile, median, third quartile, and maximum. In a box plot, we draw a box from the first quartile to the third quartile, and a line goes through the box at the median. The lines extending from the box, which indicate variability outside the upper and lower quartiles, are called whiskers. Hence, a box plot is also known as a box and whisker plot. In the simplest form, the whiskers go from each quartile to the minimum or maximum.
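As a rough illustration of the five values a box plot summarises, the sketch below computes them with Python's standard statistics module for a small list of sales figures. The helper name five_number_summary is made up for this illustration, and quartile conventions vary between tools, so the quartiles Plotly draws may differ slightly from these.

```python
import statistics

def five_number_summary(values):
    """Return min, Q1, median, Q3 and max of a sequence of numbers."""
    # statistics.quantiles sorts the data and, with n=4, returns the three quartiles
    # (using its default 'exclusive' method).
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}

sales = [1140, 1460, 489, 594, 502, 508, 370, 200]
summary = five_number_summary(sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The box spans q1 to q3, the line inside it sits at the median, and the whiskers reach towards min and max.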

[Figure: box plot]

To draw a box plot, use the go.Box() function. The data series can be assigned to either the x or the y parameter, and the box is drawn horizontally or vertically accordingly. In the following example, the sales figures of a company's branches are drawn as a vertical box plot, because the data is passed as y. The plot shows the median together with the minimum and maximum values.

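import plotly.graph_objects as go
from plotly.offline import iplot
# Sales figures assigned to y, so the box is drawn vertically
trace1 = go.Box(y = [1140,1460,489,594,502,508,370,200])
data = [trace1]
fig = go.Figure(data)
iplot(fig)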

The output is as follows −

[Figure: boxpoints parameter]

The go.Box() function accepts various other parameters that control the appearance and behaviour of the box plot. One of these is the boxmean parameter.

When the boxmean parameter is set to True, the mean of the boxes' underlying distribution is drawn as a dashed line inside the boxes. If it is set to "sd", the standard deviation of the distribution is also drawn. The mean is not drawn unless boxmean is set.

The boxpoints parameter is by default equal to "outliers", so only the sample points lying outside the whiskers are shown. If "suspectedoutliers", the outlier points are shown and points either less than 4*Q1-3*Q3 or greater than 4*Q3-3*Q1 are highlighted. If "all", all sample points are shown. If False, only the box(es) are shown with no sample points.
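The two highlighting thresholds can be rewritten as Q1 - 3*IQR and Q3 + 3*IQR, where IQR = Q3 - Q1, which makes them easier to compare with the common 1.5*IQR whisker rule. The sketch below applies both rules to a small data set. It uses statistics.quantiles, whose quartile convention may differ from the one Plotly uses, so the flagged points are illustrative only; the helper name outside is hypothetical.

```python
import statistics

values = [
    0.75, 5.25, 5.5, 6, 6.2, 6.6, 6.80, 7.0, 7.2, 7.5, 7.5, 7.75, 8.15,
    8.15, 8.65, 8.93, 9.2, 9.5, 10, 10.25, 11.5, 12, 16, 20.90, 22.3, 23.25,
]

q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

def outside(data, k):
    """Return the points lying more than k * IQR beyond the quartiles."""
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in data if v < low or v > high]

outliers = outside(values, 1.5)     # beyond the usual whisker range
highlighted = outside(values, 3.0)  # below 4*Q1 - 3*Q3 or above 4*Q3 - 3*Q1

print("outliers:", outliers)
print("highlighted:", highlighted)
```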

In the following example, the box trace is drawn with the standard deviation and with suspected outlier points highlighted.

trc = go.Box(
   y = [
      0.75, 5.25, 5.5, 6, 6.2, 6.6, 6.80, 7.0, 7.2, 7.5, 7.5, 7.75, 8.15,
      8.15, 8.65, 8.93, 9.2, 9.5, 10, 10.25, 11.5, 12, 16, 20.90, 22.3, 23.25
   ],
   boxpoints = 'suspectedoutliers', boxmean = 'sd'
)
data = [trc]
fig = go.Figure(data)
iplot(fig)

The output is shown below −

[Figure: box trace]

Violin Plot

Violin plots are similar to box plots, except that they also show the probability density of the data at different values. As in a standard box plot, a violin plot includes a marker for the median of the data and a box indicating the interquartile range. Overlaid on this box plot is a kernel density estimate. Like box plots, violin plots are used to compare the distribution of a variable (or of a sample) across different "categories".
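To make the kernel density estimate concrete, here is a minimal sketch of a Gaussian KDE in plain Python. The function name gaussian_kde and the normal-reference bandwidth (1.06 * standard deviation * n^(-1/5)) are choices made for this illustration; the curve Plotly draws for a violin trace need not match this one exactly.

```python
import math
import random
import statistics

def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the probability density of `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

random.seed(10)
samples = [random.gauss(100, 10) for _ in range(200)]
density = gaussian_kde(samples)

# Evaluate the density on a grid from 50 to 150 in steps of 0.5.
step = 0.5
grid = [50 + i * step for i in range(201)]
curve = [density(x) for x in grid]
```

A violin is essentially this curve mirrored around the category axis, so wide regions mark values where many samples fall.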

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as the mean/median and the interquartile range, a violin plot shows the full distribution of the data.

The go.Violin() function in the graph_objects module returns a violin trace object. To display the underlying box plot, set the box_visible attribute to True. Similarly, setting the meanline_visible attribute to True draws a line corresponding to the sample's mean inside the violin.

The following example demonstrates how a violin plot is displayed with Plotly.

import numpy as np
np.random.seed(10)
c1 = np.random.normal(100, 10, 200)
c2 = np.random.normal(80, 30, 200)
trace1 = go.Violin(y = c1, meanline_visible = True)
trace2 = go.Violin(y = c2, box_visible = True)
data = [trace1, trace2]
fig = go.Figure(data = data)
iplot(fig)

The output is as follows −

[Figure: violin plot]

Contour Plot

A 2D contour plot shows the contour lines of a 2D numerical array z, i.e. interpolated lines of isovalues of z. A contour line of a function of two variables is a curve along which the function has a constant value, so that the curve joins points of equal value.

A contour plot is appropriate if you want to see how some value Z changes as a function of two inputs, X and Y, such that Z = f(X,Y). A contour line is also called an isoline.

The independent variables x and y are usually restricted to a regular grid, called a meshgrid. The numpy.meshgrid function creates a rectangular grid out of an array of x values and an array of y values.
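A small sketch of what numpy.meshgrid returns may help here. The array values are arbitrary and chosen only to keep the output readable.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])   # 3 x values
y = np.array([10.0, 20.0])      # 2 y values

X, Y = np.meshgrid(x, y)

# Both outputs have shape (len(y), len(x)): rows follow y, columns follow x.
print(X.shape, Y.shape)
print(X)   # every row is a copy of x
print(Y)   # every column is a copy of y
```

Because X and Y have the same shape, an expression such as X**2 + Y**2 evaluates the function at every grid point in one step.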

Let us first create data values for x and y using the linspace() function from the NumPy library. We then create a meshgrid from the x and y values and obtain the z array as the square root of x² + y².

The graph_objects module provides the go.Contour() function, which takes x, y and z attributes. The following code snippet displays a contour plot of the x, y and z values computed as described above.

import numpy as np
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
trace = go.Contour(x = xlist, y = ylist, z = Z)
data = [trace]
fig = go.Figure(data)
iplot(fig)

The output is as follows −

[Figure: contour plot]

The contour plot can be customized with one or more of the following parameters −

  1. transpose (boolean) − Transposes the z data.

  2. xtype / ytype − If xtype (or ytype) equals "array", the x (or y) coordinates are given by x (or y). If "scaled", the x coordinates are given by x0 and dx, and the y coordinates by y0 and dy.

  3. connectgaps − Determines whether or not gaps in the z data are filled in.

  4. ncontours − Default value is 15. The actual number of contours is chosen automatically to be less than or equal to the value of ncontours. Has an effect only if autocontour is True.

  5. contours type − Default is "levels", so the data is represented as a contour plot with multiple levels displayed. If "constraint", the data is represented as constraints, with the invalid region shaded as specified by the operation and value parameters.

  6. showlines − Determines whether or not the contour lines are drawn.

  7. zauto − True by default. Determines whether the color domain is computed from the input data (here z) or from the bounds set in zmin and zmax. Defaults to False when zmin and zmax are set by the user.

Quiver Plot

A quiver plot is also known as a velocity plot. It displays velocity vectors as arrows with components (u,v) at the points (x,y). To draw a quiver plot, we use the create_quiver() function defined in Plotly's figure_factory module.

Plotly's Python API contains a figure factory module, which includes many wrapper functions that create unique chart types not yet included in plotly.js, Plotly's open-source graphing library.

The create_quiver() function accepts the following parameters −

  1. x − x coordinates of the arrow locations

  2. y − y coordinates of the arrow locations

  3. u − x components of the arrow vectors

  4. v − y components of the arrow vectors

  5. scale − scales size of the arrows

  6. arrow_scale − length of arrowhead.

  7. angle − angle of arrowhead.

The following code renders a simple quiver plot in a Jupyter notebook −

import plotly.figure_factory as ff
import numpy as np
x,y = np.meshgrid(np.arange(-2, 2, .2), np.arange(-2, 2, .25))
z = x*np.exp(-x**2 - y**2)
# np.gradient takes spacings in axis order: y (rows, step .25), then x (columns, step .2)
v, u = np.gradient(z, .25, .2)

# Create quiver figure
fig = ff.create_quiver(x, y, u, v,
   scale = .25, arrow_scale = .4,
   name = 'quiver', line = dict(width = 1))
iplot(fig)

The output of the code is as follows −

[Figure: quiver plot]
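When arrow components are derived with np.gradient, it helps to remember that the function returns one array per axis, in axis order, and accepts one spacing per axis in the same order. The check below uses a plane with known slopes to confirm which output belongs to x and which to y; the variable names are illustrative.

```python
import numpy as np

dx, dy = 0.2, 0.25
x, y = np.meshgrid(np.arange(-2, 2, dx), np.arange(-2, 2, dy))

# A plane with known slopes: dz/dx = 2 and dz/dy = 3.
z = 2 * x + 3 * y

# Axis 0 of z runs along y and axis 1 along x, so the y spacing comes first
# and the first array returned is the derivative with respect to y.
dz_dy, dz_dx = np.gradient(z, dy, dx)

print(dz_dx[0, 0], dz_dy[0, 0])
```

This is why the x component u is taken from the second returned array and the y component v from the first.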