Plotly - Quick Guide

Plotly - Introduction

Plotly is a Montreal-based technical computing company that develops data analytics and visualization tools such as Dash and Chart Studio. It has also developed open source graphing Application Programming Interface (API) libraries for Python, R, MATLAB, JavaScript and other programming languages.

Some of the important features of Plotly are as follows −

  1. It produces interactive graphs.

  2. The graphs are stored in JavaScript Object Notation (JSON) data format so that they can be read using scripts of other programming languages such as R, Julia, MATLAB etc.

  3. Graphs can be exported in various raster as well as vector image formats.
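
Because a graph is stored as JSON, any language with a JSON parser can read it. The sketch below uses only Python's standard json module. The figure dictionary is a hand-written illustration of the data/layout structure used later in this guide, not output captured from Plotly.

```python
import json

# A hand-written figure description: a list of traces under "data"
# and appearance settings under "layout" (illustrative values only).
figure = {
    "data": [{"type": "scatter", "x": [0, 1, 2], "y": [0, 1, 4]}],
    "layout": {"title": "Squares"},
}

# Serialize to JSON text, as it could be stored or handed to another tool.
figure_json = json.dumps(figure)

# Any JSON-aware program can load it back and inspect the traces.
restored = json.loads(figure_json)
trace_types = [trace["type"] for trace in restored["data"]]
print(trace_types)
```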

Plotly - Environment Setup

This chapter focuses on how to set up a Python environment for working with Plotly.

Installation of Python package

It is always recommended to use Python's virtual environment feature when installing a new package. The following command creates a virtual environment in the specified folder.

python -m venv myenv

To activate the newly created virtual environment, run the activate script in its bin subfolder as shown below.

source myenv/bin/activate

Now we can install plotly's Python package using the pip utility as shown below.

pip install plotly

You may also want to install the Jupyter notebook app, which is a web-based interface to the IPython interpreter.

pip install jupyter notebook

First, you need to create an account on the website at https://plot.ly. You can sign up using the link https://plot.ly/api_signup and then log in.

sign in page

Next, obtain the API key from the settings page of your dashboard.

settings page

Use your username and API key to set up credentials in a Python interpreter session.

import plotly
plotly.tools.set_credentials_file(username='test',
api_key='********************')

A special file named credentials is created in the .plotly subfolder under your home directory. It looks similar to the following −

{
   "username": "test",
   "api_key": "********************",
   "proxy_username": "",
   "proxy_password": "",
   "stream_ids": []
}
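
Since the credentials file is plain JSON, it can be inspected with the standard json module. The sketch below writes a sample file to a temporary folder so that it does not touch your real credentials. The helper load_credentials is hypothetical and is not part of the plotly package.

```python
import json
import os
import tempfile

def load_credentials(path):
    """Return the credentials stored in a JSON file as a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

# Sample content mirroring the structure shown above (placeholder key).
sample = {
    "username": "test",
    "api_key": "********************",
    "proxy_username": "",
    "proxy_password": "",
    "stream_ids": [],
}

with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "credentials")
    with open(sample_path, "w", encoding="utf-8") as handle:
        json.dump(sample, handle)
    creds = load_credentials(sample_path)

print(creds["username"])
```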

In order to generate plots, we need to import the following modules from the plotly package −

import plotly.plotly as py
import plotly.graph_objs as go

The plotly.plotly module contains the functions that help us communicate with the Plotly servers. The classes in the plotly.graph_objs module generate graph objects.

Plotly - Online and Offline Plotting

This chapter deals with the settings for online and offline plotting. Let us first study the settings for online plotting.

Settings for online plotting

The data and graph of an online plot are saved in your plot.ly account. Online plots are generated by two methods, both of which create a unique URL for the plot and save it in your Plotly account.

  1. py.plot() − returns the unique URL and optionally opens it.

  2. py.iplot() − displays the plot in the notebook when working in a Jupyter Notebook.

We shall now display a simple plot of an angle in radians vs. its sine value. First, obtain an ndarray object of angles between 0 and 2π using the arange() function from the numpy library. This ndarray object serves as the values on the x axis of the graph. The corresponding sine values, to be displayed on the y axis, are obtained by the following statements −

import numpy as np
import math #needed for definition of pi
xpoints = np.arange(0, math.pi*2, 0.05)
ypoints = np.sin(xpoints)
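
To see what these two statements produce, the sketch below rebuilds the same values with the standard math module only. It assumes the usual arange() behaviour of starting at 0, advancing by the step, and stopping before the end value.

```python
import math

step = 0.05
stop = 2 * math.pi

# arange(0, stop, step) yields 0, step, 2*step, ... while the value is below stop.
count = math.ceil(stop / step)
xs = [i * step for i in range(count)]
ys = [math.sin(x) for x in xs]

print(count, xs[-1] < stop)
```

The x values therefore stop just short of 2π, so the plotted curve covers slightly less than one full period.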

Next, create a scatter trace using the Scatter() class in the graph_objs module.

trace0 = go.Scatter(
   x = xpoints,
   y = ypoints
)
data = [trace0]

Use the above list object as the argument to the plot() function.

py.plot(data, filename = 'Sine wave', auto_open=True)

Save the following script as plotly1.py

import plotly
plotly.tools.set_credentials_file(username='lathkar', api_key='********************')
import plotly.plotly as py
import plotly.graph_objs as go
import numpy as np
import math #needed for definition of pi

xpoints = np.arange(0, math.pi*2, 0.05)
ypoints = np.sin(xpoints)
trace0 = go.Scatter(
   x = xpoints, y = ypoints
)
data = [trace0]
py.plot(data, filename = 'Sine wave', auto_open=True)

Execute the above script from the command line. The resulting plot will be displayed in the browser at the URL reported below.

$ python plotly1.py
High five! You successfully sent some data to your account on plotly.
View your plot in your browser at https://plot.ly/~lathkar/0
plot graph

Just above the displayed graph, you will find the tabs Plot, Data, Python & R, and Forking History.

Currently, the Plot tab is selected. The Data tab shows a grid containing the x and y data points. From the Python & R tab, you can view the code corresponding to the current plot in Python, R, JSON, MATLAB etc. The following snapshot shows the Python code for the plot generated above −

python code

Settings for Offline Plotting

Plotly allows you to generate graphs offline and save them on your local machine. The plotly.offline.plot() function creates a standalone HTML file that is saved locally and opened in your web browser.

Use plotly.offline.iplot() when working offline in a Jupyter Notebook to display the plot in the notebook.

Note − Plotly version 1.9.4+ is needed for offline plotting.

Change the plot() function statement in the script and run it. An HTML file named temp-plot.html will be created locally and opened in the web browser.

plotly.offline.plot(
   { "data": data,"layout": go.Layout(title = "hello world")}, auto_open = True)
offline plotting

Plotly - Plotting Inline with Jupyter Notebook

In this chapter, we will study how to do inline plotting with the Jupyter Notebook.

In order to display the plot inside the notebook, you need to initiate plotly's notebook mode as follows −

from plotly.offline import init_notebook_mode
init_notebook_mode(connected = True)

Keep the rest of the script as it is and run the notebook cell by pressing Shift+Enter. The graph will be displayed offline inside the notebook itself.

import plotly
plotly.tools.set_credentials_file(username = 'lathkar', api_key = '************')
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)

import plotly.graph_objs as go
import numpy as np
import math #needed for definition of pi

xpoints = np.arange(0, math.pi*2, 0.05)
ypoints = np.sin(xpoints)
trace0 = go.Scatter(
   x = xpoints, y = ypoints
)
data = [trace0]
plotly.offline.iplot({ "data": data,"layout": go.Layout(title="Sine wave")})

The Jupyter notebook output will be as shown below −

jupyter notebook

The plot output shows a tool bar at the top right. It contains buttons for download as png, zoom in and out, box and lasso select, and hover.

tool bar

Plotly - Package Structure

The Plotly Python package has three main modules, which are given below −

  1. plotly.plotly

  2. plotly.graph_objs

  3. plotly.tools

The plotly.plotly module contains functions that require a response from Plotly's servers. Functions in this module are the interface between your local machine and Plotly.

The plotly.graph_objs module is the most important module. It contains all of the class definitions for the objects that make up the plots you see. The following graph objects are defined −

  1. Figure,

  2. Data,

  3. Layout,

  4. Different graph traces like Scatter, Box, Histogram etc.

plotly module

All graph objects are dictionary- and list-like objects used to generate and/or modify every feature of a Plotly plot.

The plotly.tools module contains many helpful functions that facilitate and enhance the Plotly experience. Functions for subplot generation, embedding Plotly plots in IPython notebooks, and saving and retrieving your credentials are defined in this module.

A plot is represented by a Figure object, an instance of the Figure class defined in the plotly.graph_objs module. Its constructor takes the following parameters −

import plotly.graph_objs as go
fig = go.Figure(data, layout, frames)

The data parameter is a list object in Python. It is a list of all the traces that you wish to plot. A trace is just the name we give to a collection of data which is to be plotted. A trace object is named according to how you want the data displayed on the plotting surface.

Plotly provides a number of trace objects such as scatter, bar, pie, heatmap etc., and each is returned by the respective class in the graph_objs module. For example, go.Scatter() returns a scatter trace.

import numpy as np
import math #needed for definition of pi

xpoints=np.arange(0, math.pi*2, 0.05)
ypoints=np.sin(xpoints)

trace0 = go.Scatter(
   x = xpoints, y = ypoints
)
data = [trace0]

The layout parameter defines the appearance of the plot and the plot features that are unrelated to the data. With it we can change things like the title, axis titles, annotations, legends, spacing and font, and even draw shapes on top of the plot.

layout = go.Layout(title = "Sine wave", xaxis = {'title':'angle'}, yaxis = {'title':'sine'})

A plot can have a plot title as well as axis titles. It may also have annotations to indicate other descriptions.

Finally, there is a Figure object created by go.Figure(). It is a dictionary-like object that contains both the data object and the layout object. The figure object is eventually plotted.

fig = go.Figure(data = data, layout = layout)
py.iplot(fig)

Plotly - Exporting to Static Images

Outputs of offline graphs can be exported to various raster and vector image formats. For that purpose, we need to install two dependencies − orca and psutil.

Orca

Orca stands for Open-source Report Creator App. It is an Electron app that generates images and reports of plotly graphs, dash apps and dashboards from the command line. Orca is the backbone of Plotly's Image Server.

psutil

psutil (python system and process utilities) is a cross-platform library for retrieving information on running processes and system utilization in Python. It implements many functionalities offered by UNIX command line tools such as ps, top, netstat, ifconfig, who, etc. psutil supports all major operating systems such as Linux, Windows and macOS.

Installation of Orca and psutil

If you are using the Anaconda distribution of Python, orca and psutil are easily installed with the conda package manager as follows −

conda install -c plotly plotly-orca psutil

Orca is not available in the PyPI repository, so you can use the npm utility to install it instead.

npm install -g electron@1.8.4 orca

Use pip to install psutil.

pip install psutil

If you are not able to use npm or conda, prebuilt binaries of orca can also be downloaded from https://github.com/plotly/orca/releases.

To export a Figure object to png, jpg or WebP format, first import the plotly.io module.

import plotly.io as pio

Now, we can call the write_image() function as follows −

pio.write_image(fig, 'sinewave.png')
pio.write_image(fig, 'sinewave.jpeg')
pio.write_image(fig, 'sinewave.webp')

The orca tool also supports exporting plotly figures to svg, pdf and eps formats.

pio.write_image(fig, 'sinewave.svg')
pio.write_image(fig, 'sinewave.pdf')

In a Jupyter notebook, the image object obtained by the pio.to_image() function can be displayed inline as follows −

jupyter notebook image

Plotly - Legends

By default, a Plotly chart with multiple traces shows a legend automatically. If it has only one trace, the legend is not displayed automatically. To display it, set the showlegend parameter of the Layout object to True.

layout = go.Layout(showlegend = True)

The default labels of a legend are the trace object names. To set a legend label explicitly, set the name property of the trace.
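
The legend rule described above can be sketched in plain Python. The helper legend_is_shown is hypothetical and only models the behaviour stated in this chapter. Traces and layouts are written as plain dictionaries.

```python
def legend_is_shown(traces, layout):
    """Model the rule: an explicit showlegend wins, otherwise a legend
    appears only when there is more than one trace."""
    if "showlegend" in layout:
        return bool(layout["showlegend"])
    return len(traces) > 1

sine = {"type": "scatter", "name": "Sine"}
cosine = {"type": "scatter", "name": "cos"}

print(legend_is_shown([sine], {}))                    # single trace, no setting
print(legend_is_shown([sine], {"showlegend": True}))  # single trace, forced on
print(legend_is_shown([sine, cosine], {}))            # two traces
```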

In the following example, two scatter traces with the name property set are plotted.

import numpy as np
import math #needed for definition of pi

xpoints = np.arange(0, math.pi*2, 0.05)
y1 = np.sin(xpoints)
y2 = np.cos(xpoints)
trace0 = go.Scatter(
   x = xpoints,
   y = y1,
   name='Sine'
)
trace1 = go.Scatter(
   x = xpoints,
   y = y2,
   name = 'cos'
)
data = [trace0, trace1]
layout = go.Layout(title = "Sine and cos", xaxis = {'title':'angle'}, yaxis = {'title':'value'})
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The plot appears as below −

legends trace object

Plotly - Format Axis and Ticks

You can configure the appearance of each axis by specifying the line width and color. It is also possible to define the grid width and grid color. Let us learn about these in detail in this chapter.

Plot with Axis and Tick

In the axis properties of the Layout object, setting showticklabels to True displays the tick labels. The tickfont property is a dict object specifying font family, size, color, etc. Besides the default automatic placement, the tickmode property can be set to linear or array. If it is linear, the position of the starting tick is determined by tick0 and the step between ticks by dtick.

If tickmode is set to array, you have to provide the lists of values and labels as the tickvals and ticktext properties.

Each axis also has an exponentformat attribute. Setting it to 'e' causes tick values to be displayed in scientific notation. You also need to set the showexponent property to 'all'.
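
The linear and array tick modes described above can be sketched in plain Python. The helper linear_ticks is hypothetical. It only illustrates how tick0 and dtick determine tick positions on an axis whose visible range is assumed to be known.

```python
import math

def linear_ticks(tick0, dtick, low, high):
    """Tick positions tick0 + k*dtick that fall inside [low, high]."""
    first = math.ceil((low - tick0) / dtick)
    last = math.floor((high - tick0) / dtick)
    return [tick0 + k * dtick for k in range(first, last + 1)]

# Linear mode: ticks every 0.25 starting from 0, on an axis spanning -1..1.
ticks = linear_ticks(tick0=0.0, dtick=0.25, low=-1.0, high=1.0)
print(ticks)

# Array mode: positions and labels are listed explicitly instead.
array_axis = {
    "tickmode": "array",
    "tickvals": [-1, 0, 1],
    "ticktext": ["min", "zero", "max"],
}
```

In array mode the two lists are expected to have the same length, one label per tick position.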

We now format the Layout object of the above example to configure the x and y axes by specifying line, grid and title font properties as well as tick mode, values and font.

layout = go.Layout(
   title = "Sine and cos",
   xaxis = dict(
      title = 'angle',
      showgrid = True,
      zeroline = True,
      showline = True,
      showticklabels = True,
      gridwidth = 1
   ),
   yaxis = dict(
      showgrid = True,
      zeroline = True,
      showline = True,
      gridcolor = '#bdbdbd',
      gridwidth = 2,
      zerolinecolor = '#969696',
      zerolinewidth = 2,
      linecolor = '#636363',
      linewidth = 2,
      title = 'VALUE',
      titlefont = dict(
         family = 'Arial, sans-serif',
         size = 18,
         color = 'lightgrey'
      ),
      showticklabels = True,
      tickangle = 45,
      tickfont = dict(
         family = 'Old Standard TT, serif',
         size = 14,
         color = 'black'
      ),
      tickmode = 'linear',
      tick0 = 0.0,
      dtick = 0.25
   )
)
plot with axis and tick

Plot with Multiple Axes

Sometimes it is useful to have dual x or y axes in a figure; for example, when plotting curves with different units together. Matplotlib supports this with the twinx and twiny functions. In Plotly, a second axis is declared in the layout and a trace is assigned to it. In the following example, the plot has dual y axes, one showing exp(x) and the other showing log(x).

x = np.arange(1,11)
y1 = np.exp(x)
y2 = np.log(x)
trace1 = go.Scatter(
   x = x,
   y = y1,
   name = 'exp'
)
trace2 = go.Scatter(
   x = x,
   y = y2,
   name = 'log',
   yaxis = 'y2'
)
data = [trace1, trace2]
layout = go.Layout(
   title = 'Double Y Axis Example',
   yaxis = dict(
      title = 'exp',zeroline=True,
      showline = True
   ),
   yaxis2 = dict(
      title = 'log',
      zeroline = True,
      showline = True,
      overlaying = 'y',
      side = 'right'
   )
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)

Here, the additional y axis is configured as yaxis2 and appears on the right side, with 'log' as its title. The resulting plot is as follows −

plot with multiple axes

Plotly - Subplots and Inset Plots

Here, we will understand the concept of subplots and inset plots in Plotly.

Making Subplots

Sometimes it is helpful to compare different views of data side by side. Plotly supports this through subplots: the make_subplots() function in the plotly.tools module returns a Figure object laid out as a grid.

The following statement creates two subplots in one row.

fig = tools.make_subplots(rows = 1, cols = 2)

We can now add two different traces (the exp and log traces in the example above) to the figure.

fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 1, 2)

The layout of the figure is further configured by specifying title, width, height, etc. using the update() method.

fig['layout'].update(height = 600, width = 800, title = 'subplots')

Here is the complete script −

from plotly import tools
import plotly.plotly as py
import plotly.graph_objs as go
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)
import numpy as np
x = np.arange(1,11)
y1 = np.exp(x)
y2 = np.log(x)
trace1 = go.Scatter(
   x = x,
   y = y1,
   name = 'exp'
)
trace2 = go.Scatter(
   x = x,
   y = y2,
   name = 'log'
)
fig = tools.make_subplots(rows = 1, cols = 2)
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 1, 2)
fig['layout'].update(height = 600, width = 800, title = 'subplot')
iplot(fig)

When the script runs, make_subplots() reports the format of the plot grid:

This is the format of your plot grid: [ (1,1) x1,y1 ] [ (1,2) x2,y2 ]

making subplots
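
The grid format reported for the subplots pairs each (row, column) cell with an x and y axis. The sketch below reproduces that pairing in plain Python. The helper grid_axes is hypothetical, and the row-by-row numbering for grids with more than one row is an assumption that goes beyond the 1 x 2 case shown here.

```python
def grid_axes(rows, cols):
    """Map each (row, col) cell to axis labels, numbered row by row."""
    mapping = {}
    for row in range(1, rows + 1):
        for col in range(1, cols + 1):
            index = (row - 1) * cols + col
            mapping[(row, col)] = ("x%d" % index, "y%d" % index)
    return mapping

layout_1x2 = grid_axes(1, 2)
for cell, axes in layout_1x2.items():
    print(cell, axes)
```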

Inset Plots

To display a subplot as an inset, we need to configure its trace object. First, set the xaxis and yaxis properties of the inset trace to 'x2' and 'y2' respectively. The following statement puts the 'log' trace in the inset.

trace2 = go.Scatter(
   x = x,
   y = y2,
   xaxis = 'x2',
   yaxis = 'y2',
   name = 'log'
)

Secondly, configure the Layout object, where the location of the x and y axes of the inset is defined by the domain property, which specifies their position relative to the main axes.

xaxis2=dict(
   domain = [0.1, 0.5],
   anchor = 'y2'
),
yaxis2 = dict(
   domain = [0.5, 0.9],
   anchor = 'x2'
)
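
A domain is a pair of fractions along the plotting area, so the inset's extent can be worked out with simple arithmetic. The sketch below is illustrative only. The helper domain_to_span and the 800 x 600 plotting area are assumptions, not values taken from Plotly.

```python
def domain_to_span(domain, size):
    """Convert a [start, end] fraction pair into positions along `size`."""
    start, end = domain
    return round(start * size, 6), round(end * size, 6)

# Inset domains from the layout fragment above, on a hypothetical
# 800 x 600 plotting area.
x_span = domain_to_span([0.1, 0.5], 800)
y_span = domain_to_span([0.5, 0.9], 600)
print(x_span, y_span)
```

With these domains the inset covers the left half of the width (10% to 50%) and the band between 50% and 90% of the height.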

The complete script to display the log trace in the inset and the exp trace on the main axes is given below −

trace1 = go.Scatter(
   x = x,
   y = y1,
   name = 'exp'
)
trace2 = go.Scatter(
   x = x,
   y = y2,
   xaxis = 'x2',
   yaxis = 'y2',
   name = 'log'
)
data = [trace1, trace2]
layout = go.Layout(
   yaxis = dict(showline = True),
   xaxis2 = dict(
      domain = [0.1, 0.5],
      anchor = 'y2'
   ),
   yaxis2 = dict(
      showline = True,
      domain = [0.5, 0.9],
      anchor = 'x2'
   )
)
fig = go.Figure(data=data, layout=layout)
iplot(fig)

The output is shown below −

inset plots

Plotly - Bar Chart and Pie Chart

In this chapter, we will learn how to make bar and pie charts with the help of Plotly. Let us begin with the bar chart.

Bar Chart

A bar chart presents categorical data with rectangular bars whose heights or lengths are proportional to the values that they represent. Bars can be displayed vertically or horizontally. A bar chart helps to show comparisons among discrete categories. One axis of the chart shows the specific categories being compared, and the other axis represents a measured value.

The following example plots a simple bar chart of the number of students enrolled in different courses. The go.Bar() function returns a bar trace with the x coordinates set to the list of subjects and the y coordinates to the number of students.

import plotly.graph_objs as go
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
data = [go.Bar(
   x = langs,
   y = students
)]
fig = go.Figure(data=data)
iplot(fig)

The output will be as shown below −

bar chart

To display a grouped bar chart, the barmode property of the Layout object must be set to group. In the following code, multiple traces representing the students in each year are plotted against branches and shown as a grouped bar chart.

branches = ['CSE', 'Mech', 'Electronics']
fy = [23,17,35]
sy = [20, 23, 30]
ty = [30,20,15]
trace1 = go.Bar(
   x = branches,
   y = fy,
   name = 'FY'
)
trace2 = go.Bar(
   x = branches,
   y = sy,
   name = 'SY'
)
trace3 = go.Bar(
   x = branches,
   y = ty,
   name = 'TY'
)
data = [trace1, trace2, trace3]
layout = go.Layout(barmode = 'group')
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The output is as follows −

grouped bar chart

The barmode property determines how bars at the same location coordinate are displayed on the graph. Defined values are "stack" (bars stacked on top of one another), "relative" (bars stacked on top of one another, with negative values below the axis and positive values above), and "group" (bars plotted next to one another).
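
The stacking arithmetic can be illustrated with the year-wise figures from the grouped bar chart example. The sketch below computes where each bar would start when stacked in the "relative" fashion; with all-positive data this matches "stack". The helper stack_bases is hypothetical and shows the arithmetic only, not Plotly's internal implementation.

```python
def stack_bases(series):
    """For each series, return the base at which every bar starts:
    positive values pile upwards, negative values pile downwards."""
    n = len(series[0])
    top = [0] * n      # current top of the positive stack per category
    bottom = [0] * n   # current bottom of the negative stack per category
    bases = []
    for values in series:
        row = []
        for i, value in enumerate(values):
            if value >= 0:
                row.append(top[i])
                top[i] += value
            else:
                row.append(bottom[i])
                bottom[i] += value
        bases.append(row)
    return bases

fy = [23, 17, 35]
sy = [20, 23, 30]
ty = [30, 20, 15]
print(stack_bases([fy, sy, ty]))
```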

By changing the barmode property to 'stack', the plotted graph appears as below −

stack plotted graph

Pie Chart

A pie chart displays only one series of data. It shows the size of each item (called a wedge) in the data series in proportion to the sum of the items. Data points are shown as a percentage of the whole pie.
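
The percentage shown for each wedge is its value divided by the total of the series. As a small worked example, the sketch below applies this to the course enrolment figures used in the bar chart example, using plain Python only.

```python
langs = ["C", "C++", "Java", "Python", "PHP"]
students = [23, 17, 35, 29, 12]

total = sum(students)
# Each wedge's share of the whole pie, in percent.
shares = [100 * count / total for count in students]

for lang, share in zip(langs, shares):
    print("%-6s %5.1f%%" % (lang, share))
```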

The Pie class of the graph_objs module, go.Pie(), returns a pie trace. Its two required arguments are labels and values. Let us plot a simple pie chart of language courses vs. number of students, as in the following example.

import plotly
plotly.tools.set_credentials_file(
   username = 'lathkar', api_key = '********************'
)
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)
import plotly.graph_objs as go
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
trace = go.Pie(labels = langs, values = students)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The following output is displayed in the Jupyter notebook −

pie chart

A donut chart is a pie chart with a round hole in the center, which makes it look like a donut. In the following example, two donut charts are displayed in a 1x2 grid layout. While the labels are the same for both pie traces, the row and column destination of each subplot is decided by the domain property.

For this purpose, we use the data of party-wise seats and vote share in the 2019 parliamentary elections. Enter the following code in a Jupyter notebook cell −

parties = ['BJP', 'CONGRESS', 'DMK', 'TMC', 'YSRC', 'SS', 'JDU','BJD', 'BSP','OTH']
seats = [303,52,23,22,22,18,16,12,10, 65]
percent = [37.36, 19.49, 2.26, 4.07, 2.53, 2.10, 1.46, 1.66, 3.63, 25.44]
import plotly.graph_objs as go
data1 = {
   "values": seats,
   "labels": parties,
   "domain": {"column": 0},
   "name": "seats",
   "hoverinfo":"label+percent+name",
   "hole": .4,
   "type": "pie"
}
data2 = {
   "values": percent,
   "labels": parties,
   "domain": {"column": 1},
   "name": "vote share",
   "hoverinfo":"label+percent+name",
   "hole": .4,
   "type": "pie"
}
data = [data1,data2]
layout = go.Layout(
   {
      "title":"Parliamentary Election 2019",
      "grid": {"rows": 1, "columns": 2},
      "annotations": [
         {
            "font": {
               "size": 20
            },
            "showarrow": False,
            "text": "seats",
            "x": 0.20,
            "y": 0.5
         },
         {
            "font": {
               "size": 20
            },
            "showarrow": False,
            "text": "votes",
            "x": 0.8,
            "y": 0.5
         }
      ]
   }
)
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The output is given below −

donut chart

Plotly - Scatter Plot, Scattergl Plot and Bubble Charts

This chapter covers the details of scatter plots, Scattergl plots and bubble charts. First, let us study the scatter plot.

Scatter Plot

Scatter plots are used to plot data points on a horizontal and a vertical axis to show how one variable affects another. Each row in the data table is represented by a marker whose position depends on its values in the columns set on the X and Y axes.

The Scatter class of the graph_objs module (go.Scatter) produces a scatter trace. Here, the mode property decides the appearance of the data points. The default value of mode is lines (for traces with fewer than 20 points, Plotly defaults to lines+markers), which displays a continuous line connecting the data points. If set to markers, only the data points, represented by small filled circles, are displayed. When mode is set to 'lines+markers', both circles and lines are displayed.

The following example plots scatter traces of three sets of randomly generated points in the Cartesian coordinate system. Each trace is displayed with a different mode property.

import numpy as np
N = 100
x_vals = np.linspace(0, 1, N)
y1 = np.random.randn(N) + 5
y2 = np.random.randn(N)
y3 = np.random.randn(N) - 5
trace0 = go.Scatter(
   x = x_vals,
   y = y1,
   mode = 'markers',
   name = 'markers'
)
trace1 = go.Scatter(
   x = x_vals,
   y = y2,
   mode = 'lines+markers',
   name = 'line+markers'
)
trace2 = go.Scatter(
   x = x_vals,
   y = y3,
   mode = 'lines',
   name = 'line'
)
data = [trace0, trace1, trace2]
fig = go.Figure(data = data)
iplot(fig)

The output of the Jupyter notebook cell is as given below −

jupyter notebook cell

Scattergl Plot

WebGL (Web Graphics Library) is a JavaScript API for rendering interactive 2D and 3D graphics within any compatible web browser without the use of plug-ins. WebGL is fully integrated with other web standards, allowing Graphics Processing Unit (GPU) accelerated image processing.

In Plotly, you can use WebGL by substituting Scattergl() for Scatter(), which gives increased speed, improved interactivity, and the ability to plot even more data. go.Scattergl() gives better performance when a large number of data points are involved.

import numpy as np
N = 100000
x = np.random.randn(N)
y = np.random.randn(N)
trace0 = go.Scattergl(
   x = x, y = y, mode = 'markers'
)
data = [trace0]
layout = go.Layout(title = "scattergl plot ")
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The output is shown below −

scattergl plot

Bubble charts

A bubble chart displays three dimensions of data. Each entity with its three dimensions of associated data is plotted as a disk (bubble) that expresses two of the dimensions through the disk's xy location and the third through its size. The sizes of the bubbles are determined by the values in the third data series.

A bubble chart is a variation of the scatter plot in which the data points are replaced with bubbles. If your data has three dimensions as shown below, creating a bubble chart is a good choice.

Company   Products   Sale   Share
A         13         2354   23
B         6          5423   47
C         23         2451   30

A bubble chart is produced with a go.Scatter() trace. Two of the above data series are given as the x and y properties. The third dimension is shown by the marker, whose size represents the third data series. In the above case, we use products and sale as the x and y properties and market share as the marker size.

Enter the following code in a Jupyter notebook.

company = ['A','B','C']
products = [13,6,23]
sale = [2354,5423,4251]
share = [23,47,30]
fig = go.Figure(data = [go.Scatter(
   x = products, y = sale,
   text = [
      'company:' + c + ' share:' + str(s) + '%'
      for c, s in zip(company, share)   # pair each company with its share
   ],
   mode = 'markers',
   marker_size = share, marker_color = ['blue','red','yellow'])
])
iplot(fig)

The output would be as shown below −

bubble chart
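Raw data values used directly as marker sizes can produce bubbles that are too small or too large. The following is a minimal sketch of one way to scale them, assuming the sizeref rule suggested in Plotly's bubble chart documentation (sizeref = 2 * max(size) / desired_max_px ** 2, used together with sizemode = 'area'). The helper names and the 40 pixel target are illustrative choices, not part of the example above.

```python
def bubble_sizeref(sizes, max_marker_px=40):
    """Return a sizeref so the largest bubble is roughly max_marker_px wide."""
    return 2.0 * max(sizes) / (max_marker_px ** 2)


def build_bubble_figure(products, sale, share):
    # Imported lazily so that bubble_sizeref() works without Plotly installed.
    import plotly.graph_objs as go

    return go.Figure(data=[go.Scatter(
        x=products, y=sale, mode='markers',
        marker=dict(
            size=share,
            sizemode='area',                 # bubble area follows the value
            sizeref=bubble_sizeref(share),   # scale factor computed above
        ),
    )])


share = [23, 47, 30]
sizeref = bubble_sizeref(share)
print(sizeref)
```

In a notebook, iplot(build_bubble_figure(products, sale, share)) would display the scaled chart.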

Plotly - Dot Plots and Table

Here, we will learn about dot plots and the table function in Plotly. Let us start with dot plots.

Dot Plots

A dot plot displays points on a very simple scale. It is only suitable for a small amount of data, as a large number of points will make it look very cluttered. Dot plots are also known as Cleveland dot plots. They show changes between two (or more) points in time or between two (or more) conditions.

Dot plots are similar to horizontal bar charts. However, they can be less cluttered and allow an easier comparison between conditions. The figure plots a scatter trace with the mode attribute set to markers.

The following example compares the literacy rate of men and women as recorded in each census after the independence of India. The two traces in the graph represent the literacy percentage of men and women in each census from 1951 to 2011.

import plotly.graph_objs as go
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)
census = [1951,1961,1971,1981,1991,2001, 2011]
x1 = [8.86, 15.35, 21.97, 29.76, 39.29, 53.67, 64.63]
x2 = [27.15, 40.40, 45.96, 56.38,64.13, 75.26, 80.88]
traceA = go.Scatter(
   x = x1,
   y = census,
   marker = dict(color = "crimson", size = 12),
   mode = "markers",
   name = "Women"
)
traceB = go.Scatter(
   x = x2,
   y = census,
   marker = dict(color = "gold", size = 12),
   mode = "markers",
   name = "Men"
)
data = [traceA, traceB]
layout = go.Layout(
   title = "Trend in Literacy rate in Post independent India",
   xaxis_title = "percentage",
   yaxis_title = "census"
)
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The output would be as shown below −

cleveland dot plots

Table in Plotly

Plotly's Table object is returned by the go.Table() function. A table trace is a graph object useful for viewing detailed data in a grid of rows and columns. The table uses column-major order, i.e. the grid is represented as a vector of column vectors.

Two important parameters of the go.Table() function are header, which is the first row of the table, and cells, which forms the rest of the rows. Both parameters are dictionary objects. The values attribute of header is a list of column headings, and the values attribute of cells is a list of lists, each corresponding to one column.

Further styling is done with line_color, fill_color, font and other attributes.

The following code displays the points table of the round robin stage of the recently concluded Cricket World Cup 2019.

trace = go.Table(
   header = dict(
      values = ['Teams','Mat','Won','Lost','Tied','NR','Pts','NRR'],
      line_color = 'gray',
      fill_color = 'lightskyblue',
      align = 'left'
   ),
   cells = dict(
      values =
      [
         [
            'India',
            'Australia',
            'England',
            'New Zealand',
            'Pakistan',
            'Sri Lanka',
            'South Africa',
            'Bangladesh',
            'West Indies',
            'Afghanistan'
         ],
         [9,9,9,9,9,9,9,9,9,9],
         [7,7,6,5,5,3,3,3,2,0],
         [1,2,3,3,3,4,5,5,6,9],
         [0,0,0,0,0,0,0,0,0,0],
         [1,0,0,1,1,2,1,1,1,0],
         [15,14,12,11,11,8,7,7,5,0],
         [0.809,0.868,1.152,0.175,-0.43,-0.919,-0.03,-0.41,-0.225,-1.322]
      ],
      line_color='gray',
      fill_color='lightcyan',
      align='left'
   )
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)
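Because cells expects one list per column, data that is held as one record per row has to be transposed first. The sketch below shows one way to do that with zip(); the three-team subset and the build_table() helper are illustrative assumptions rather than part of the example above.

```python
# Row-oriented records: one tuple per team (a subset of the table above).
rows = [
    ('India', 9, 7, 15),
    ('Australia', 9, 7, 14),
    ('England', 9, 6, 12),
]
header = ['Teams', 'Mat', 'Won', 'Pts']

# zip(*rows) transposes the records into one list per column,
# which is the column-major layout used for cells['values'].
columns = [list(col) for col in zip(*rows)]
print(columns)


def build_table(header, columns):
    # Imported lazily so the transposition can be tried without Plotly.
    import plotly.graph_objs as go

    return go.Table(header=dict(values=header), cells=dict(values=columns))
```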

The output is shown below −

Table data can also be populated from a Pandas dataframe. Let us create a comma-separated file (points-table.csv) as below −

Teams          Mat   Won   Lost   Tied   NR   Pts   NRR
India          9     7     1      0      1    15    0.809
Australia      9     7     2      0      0    14    0.868
England        9     6     3      0      0    12    1.152
New Zealand    9     5     3      0      1    11    0.175
Pakistan       9     5     3      0      1    11    -0.43
Sri Lanka      9     3     4      0      2    8     -0.919
South Africa   9     3     5      0      1    7     -0.03
Bangladesh     9     3     5      0      1    7     -0.41

Teams,Matches,Won,Lost,Tie,NR,Points,NRR
India,9,7,1,0,1,15,0.809
Australia,9,7,2,0,0,14,0.868
England,9,6,3,0,0,12,1.152
New Zealand,9,5,3,0,1,11,0.175
Pakistan,9,5,3,0,1,11,-0.43
Sri Lanka,9,3,4,0,2,8,-0.919
South Africa,9,3,5,0,1,7,-0.03
Bangladesh,9,3,5,0,1,7,-0.41
West Indies,9,2,6,0,1,5,-0.225
Afghanistan,9,0,9,0,0,0,-1.322

We now construct a dataframe object from this csv file and use it to construct the table trace as below −

import pandas as pd
df = pd.read_csv('points-table.csv')
trace = go.Table(
   header = dict(values = list(df.columns)),
   cells = dict(
      values = [
         df.Teams,
         df.Matches,
         df.Won,
         df.Lost,
         df.Tie,
         df.NR,
         df.Points,
         df.NRR
      ]
   )
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

Plotly - Histogram

Introduced by Karl Pearson, a histogram is a representation of the distribution of numerical data and serves as an estimate of the probability distribution of a continuous variable. It looks similar to a bar graph, but a bar graph relates two variables, whereas a histogram relates only one.

A histogram requires bins (or buckets), which divide the entire range of values into a series of intervals; the number of values falling into each interval is then counted. The bins are usually specified as consecutive, non-overlapping intervals of a variable. They must be adjacent and are often of equal size. A rectangle is erected over each bin with height proportional to the frequency, i.e. the number of cases in that bin.

A histogram trace object is returned by the go.Histogram() function. It is customized through various arguments or attributes. One essential argument is x or y, set to a list, numpy array or Pandas dataframe column whose values are to be distributed in bins.

By default, Plotly distributes the data points in automatically sized bins. However, you can define a custom bin size. To do so, set the xbins (or ybins) attribute, which takes the start value, end value and size of the bins; alternatively, nbinsx (or nbinsy) sets the maximum number of bins.

The following code generates a simple histogram showing the distribution of marks of students in a class in bins (sized automatically) −

import numpy as np
x1 = np.array([22,87,5,43,56,73,55,54,11,20,51,5,79,31,27])
data = [go.Histogram(x = x1)]
fig = go.Figure(data)
iplot(fig)

The output is as shown below −

histnorm
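To illustrate custom bin sizes, here is a minimal sketch that assumes the xbins attribute of go.Histogram with start, end and size keys. The bin edges (0 to 100 in steps of 20) are an arbitrary choice, and bin_counts() is a hypothetical pure-Python helper that only mirrors what such binning counts.

```python
marks = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]
start, end, size = 0, 100, 20


def bin_counts(values, start, end, size):
    """Count the values falling in each half-open interval of width size."""
    counts = [0] * ((end - start) // size)
    for v in values:
        if start <= v < end:
            counts[(v - start) // size] += 1
    return counts


counts = bin_counts(marks, start, end, size)
print(counts)  # one count per 20-mark interval


def build_histogram(values):
    # Imported lazily so bin_counts() can be tried without Plotly installed.
    import plotly.graph_objs as go

    return go.Figure([go.Histogram(
        x=values,
        xbins=dict(start=start, end=end, size=size),  # fixed-width custom bins
    )])
```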

The go.Histogram() function accepts histnorm, which specifies the type of normalization used for this histogram trace. The default is "", in which case the span (height) of each bar corresponds to the number of occurrences, i.e. the number of data points lying inside the bin. If it is set to "percent" / "probability", the span of each bar corresponds to the percentage / fraction of occurrences with respect to the total number of sample points. If it is set to "density", the span of each bar corresponds to the number of occurrences in a bin divided by the size of the bin interval.
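The arithmetic behind these normalization options can be worked through by hand. The sketch below assumes five bins of width 20 holding the counts shown; the numbers are illustrative and the code only mimics the definitions given above, it does not call Plotly.

```python
counts = [3, 4, 5, 2, 1]   # occurrences per bin, as with histnorm = ""
bin_size = 20              # assumed width of every bin
total = sum(counts)

probability = [c / total for c in counts]   # histnorm = "probability"
percent = [100 * p for p in probability]    # histnorm = "percent"
density = [c / bin_size for c in counts]    # histnorm = "density"

print(probability)
print(percent)
print(density)
```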

There is also the histfunc parameter, whose default value is count. As a result, the height of the rectangle over a bin corresponds to the count of data points. It can also be set to sum, avg, min or max.

The go.Histogram() function can be set to display the cumulative distribution of values in successive bins. For that, you need to set the cumulative_enabled property to True. The result can be seen below −

data=[go.Histogram(x = x1, cumulative_enabled = True)]
fig = go.Figure(data)
iplot(fig)

The output is shown below −

cumulative property
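A cumulative histogram is simply a running total of the per-bin counts. As a small sketch, assuming five bins with the counts shown (illustrative numbers only):

```python
from itertools import accumulate

counts = [3, 4, 5, 2, 1]               # per-bin counts (assumed)
cumulative = list(accumulate(counts))  # running total across successive bins
print(cumulative)                      # [3, 7, 12, 14, 15]
```

The last value always equals the total number of data points.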

Plotly - Box Plot Violin Plot and Contour Plot

This chapter focuses on a detailed understanding of various plots including the box plot, violin plot, contour plot and quiver plot. We begin with the box plot.

Box Plot

A box plot displays a summary of a set of data containing the minimum, first quartile, median, third quartile, and maximum. In a box plot, we draw a box from the first quartile to the third quartile. A line goes through the box at the median. The lines extending from the box, indicating variability outside the upper and lower quartiles, are called whiskers. Hence, the box plot is also known as the box and whisker plot. The whiskers go from each quartile to the minimum or maximum.

box plot

To draw a box plot, we use the go.Box() function. The data series can be assigned to the x or y parameter. Accordingly, the box plot will be drawn horizontally or vertically. In the following example, sales figures of a certain company in its various branches are assigned to y, so they are shown as a vertical box plot. It shows the median along with the minimum and maximum values.

trace1 = go.Box(y = [1140,1460,489,594,502,508,370,200])
data = [trace1]
fig = go.Figure(data)
iplot(fig)

The output will be as follows −

boxpoints parameter

The go.Box() function can be given various other parameters to control the appearance and behaviour of the box plot. One such parameter is boxmean.

If the boxmean parameter is set to True, the mean of the box's underlying distribution is drawn as a dashed line inside the box. If it is set to 'sd', the standard deviation of the distribution is also drawn.

The boxpoints parameter is by default equal to "outliers", so only the sample points lying outside the whiskers are shown. If it is "suspectedoutliers", the outlier points are shown, and points either less than 4*Q1 - 3*Q3 or greater than 4*Q3 - 3*Q1 are highlighted. If it is False, only the box(es) are shown, with no sample points.

In the following example, the box trace is drawn with standard deviation and outlier points.

trc = go.Box(
   y = [
      0.75, 5.25, 5.5, 6, 6.2, 6.6, 6.80, 7.0, 7.2, 7.5, 7.5, 7.75, 8.15,
      8.15, 8.65, 8.93, 9.2, 9.5, 10, 10.25, 11.5, 12, 16, 20.90, 22.3, 23.25
   ],
   boxpoints = 'suspectedoutliers', boxmean = 'sd'
)
data = [trc]
fig = go.Figure(data)
iplot(fig)

The output is shown below −

box trace
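The statistics behind such a box trace can be reproduced by hand. The sketch below is only an approximation under stated assumptions: quartiles are taken by linear interpolation with the standard library (Plotly's own quartile method may give slightly different values), and whiskers are assumed to reach 1.5 times the interquartile range from the box.

```python
import statistics

data = [
    0.75, 5.25, 5.5, 6, 6.2, 6.6, 6.80, 7.0, 7.2, 7.5, 7.5, 7.75, 8.15,
    8.15, 8.65, 8.93, 9.2, 9.5, 10, 10.25, 11.5, 12, 16, 20.90, 22.3, 23.25
]

# Quartiles by linear interpolation (an assumption, see the note above).
q1, median, q3 = statistics.quantiles(data, n=4, method='inclusive')
iqr = q3 - q1

# Assumed whisker rule: points beyond 1.5 * IQR from the box are outliers.
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

# Thresholds for "suspectedoutliers": 4*Q1 - 3*Q3 and 4*Q3 - 3*Q1,
# which is the same as 3 * IQR below Q1 and 3 * IQR above Q3.
far_low, far_high = 4 * q1 - 3 * q3, 4 * q3 - 3 * q1
highlighted = [v for v in data if v < far_low or v > far_high]

print(q1, median, q3)
print(outliers)
print(highlighted)
```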

Violin Plot

Violin plots are similar to box plots, except that they also show the probability density of the data at different values. As in standard box plots, violin plots include a marker for the median of the data and a box indicating the interquartile range. Overlaid on this box plot is a kernel density estimation. Like box plots, violin plots are used to compare the distribution of a variable (or sample distribution) across different "categories".

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as the mean/median and interquartile ranges, the violin plot shows the full distribution of the data.

A violin trace object is returned by the go.Violin() function in the graph_objects module. To display the underlying box plot, the box_visible attribute is set to True. Similarly, by setting the meanline_visible property to True, a line corresponding to the sample's mean is shown inside the violins.

The following example demonstrates how a violin plot is displayed using Plotly's functionality.

import numpy as np
np.random.seed(10)
c1 = np.random.normal(100, 10, 200)
c2 = np.random.normal(80, 30, 200)
trace1 = go.Violin(y = c1, meanline_visible = True)
trace2 = go.Violin(y = c2, box_visible = True)
data = [trace1, trace2]
fig = go.Figure(data = data)
iplot(fig)

The output is as follows −

violin plot

Contour plot

A 2D contour plot shows the contour lines of a 2D numerical array z, i.e. interpolated lines of isovalues of z. A contour line (or isoline) of a function of two variables is a curve along which the function has a constant value, so that the curve joins points of equal value.

A contour plot is appropriate if you want to see how some value Z changes as a function of two inputs, X and Y, such that Z = f(X,Y).

The independent variables x and y are usually restricted to a regular grid called a meshgrid. numpy.meshgrid creates a rectangular grid out of an array of x values and an array of y values.

Let us first create data values for x, y and z using the linspace() function from the Numpy library. We create a meshgrid from the x and y values and obtain the z array consisting of the square root of x² + y².

The graph_objects module has a go.Contour() function which takes x, y and z attributes. The following code snippet displays the contour plot of the x, y and z values computed as above.

import numpy as np
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
trace = go.Contour(x = xlist, y = ylist, z = Z)
data = [trace]
fig = go.Figure(data)
iplot(fig)

The output is as follows −

contour plot
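To make the grid layout concrete, here is a small pure-Python sketch of what a meshgrid holds, using a tiny 3 by 2 grid chosen purely for illustration. It mirrors the default layout of numpy.meshgrid, where each row corresponds to one y value and each column to one x value.

```python
import math

xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 2.0]

# Equivalent of X, Y = numpy.meshgrid(xs, ys) for this small grid.
X = [[x for x in xs] for _ in ys]
Y = [[y for _ in xs] for y in ys]

# Z[i][j] is the function value at (xs[j], ys[i]).
Z = [[math.sqrt(x * x + y * y) for x in xs] for y in ys]

print(X)
print(Y)
print(Z)
```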

The contour plot can be customized by one or more of the following parameters −

  1. transpose (boolean) − Transposes the z data.

  2. xtype (or ytype) − If equal to "array", the x/y coordinates are given by "x"/"y". If "scaled", the x coordinates are given by "x0" and "dx" (and the y coordinates by "y0" and "dy").

  3. connectgaps − Determines whether or not gaps in the z data are filled in.

  4. ncontours − The default value is 15. The actual number of contours is chosen automatically to be less than or equal to the value of ncontours. Has an effect only if autocontour is True.

  5. contours type − By default "levels", so the data is represented as a contour plot with multiple levels displayed. If "constraint", the data is represented as constraints, with the invalid region shaded as specified by the operation and value parameters.

  6. showlines − Determines whether or not the contour lines are drawn.

  7. zauto − True by default; determines whether the color domain is computed with respect to the input data (here in z) or the bounds set in zmin and zmax. Defaults to False when zmin and zmax are set by the user.
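Several of these parameters can be combined in one trace. The following is a sketch only: the option values are arbitrary illustrations, and build_contour() is a hypothetical helper that passes them to go.Contour as keyword arguments.

```python
# Keyword arguments for go.Contour; the specific values are illustrative.
contour_options = dict(
    transpose=False,                  # keep z as given
    connectgaps=True,                 # fill gaps (missing values) in z
    autocontour=True,                 # needed for ncontours to take effect
    ncontours=10,                     # upper bound on the number of contours
    contours=dict(showlines=False),   # hide the contour lines, keep the fill
    zauto=False, zmin=0.0, zmax=4.0,  # fix the color domain explicitly
)


def build_contour(x, y, z):
    # Imported lazily; calling this function requires Plotly.
    import plotly.graph_objs as go

    return go.Figure([go.Contour(x=x, y=y, z=z, **contour_options)])
```

With the arrays from the earlier contour example, iplot(build_contour(xlist, ylist, Z)) would render the customized plot.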

Quiver plot

A quiver plot is also known as a velocity plot. It displays velocity vectors as arrows with components (u,v) at the points (x,y). To draw a quiver plot, we use the create_quiver() function defined in the figure_factory module of Plotly.

Plotly's Python API contains a figure factory module which includes many wrapper functions that create unique chart types not yet included in plotly.js, Plotly's open-source graphing library.

The create_quiver() function accepts the following parameters −

  1. x − x coordinates of the arrow locations

  2. y − y coordinates of the arrow locations

  3. u − x components of the arrow vectors

  4. v − y components of the arrow vectors

  5. scale − scales size of the arrows

  6. arrow_scale − length of arrowhead.

  7. angle − angle of arrowhead.

The following code renders a simple quiver plot in Jupyter notebook −

import plotly.figure_factory as ff
import numpy as np
x,y = np.meshgrid(np.arange(-2, 2, .2), np.arange(-2, 2, .25))
z = x*np.exp(-x**2 - y**2)
# np.gradient takes one spacing per axis: rows follow y (step .25), columns follow x (step .2)
v, u = np.gradient(z, .25, .2)

# Create quiver figure
fig = ff.create_quiver(
   x, y, u, v,
   scale = .25, arrow_scale = .4,
   name = 'quiver', line = dict(width = 1)
)
iplot(fig)

The output of the code is as follows −

quiver plot

Plotly - Distplots Density Plot and Error Bar Plot

In this chapter, we will learn about distplots, density plots and error bar plots in detail. Let us begin with distplots.

Distplots

The distplot figure factory displays a combination of statistical representations of numerical data, such as a histogram, a kernel density estimation or normal curve, and a rug plot.

The distplot can be composed of all or any combination of the following 3 components −

  1. histogram

  2. curve: (a) kernel density estimation or (b) normal curve, and

  3. rug plot

The figure_factory module has the create_distplot() function, which needs a mandatory parameter called hist_data.

The following code creates a basic distplot consisting of a histogram, a kde plot and a rug plot.

x = np.random.randn(1000)
hist_data = [x]
group_labels = ['distplot']
fig = ff.create_distplot(hist_data, group_labels)
iplot(fig)

The output of the code mentioned above is as follows −

distplots

Density Plot

A density plot is a smoothed, continuous version of a histogram estimated from the data. The most common form of estimation is known as kernel density estimation (KDE). In this method, a continuous curve (the kernel) is drawn at every individual data point, and all of these curves are then added together to make a single smooth density estimate.

The create_2d_density() function of the figure_factory module returns a figure object for a 2D density plot.

The following code produces a 2D density plot over histogram data.

t = np.linspace(-1, 1.2, 2000)
x = (t**3) + (0.3 * np.random.randn(2000))
y = (t**6) + (0.3 * np.random.randn(2000))
fig = ff.create_2d_density( x, y)
iplot(fig)

The output of the above code is shown below.

density plot
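The kernel density idea described above can be sketched in one dimension with plain Python. This is a simplified illustration that assumes a Gaussian kernel and a hand-picked bandwidth; it is not how create_2d_density() is implemented. Dividing the summed kernels by the number of samples makes the estimate integrate to about 1.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density from Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # One kernel is centred on every sample; their contributions are summed.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


samples = [1.0, 1.5, 2.0, 4.0]
density = gaussian_kde(samples, bandwidth=0.5)

# Riemann-sum check over [-5, 10): the estimate should integrate to about 1.
step = 0.01
area = sum(density(-5 + i * step) * step for i in range(1500))
print(round(area, 3))
```

The estimate is higher near the cluster of samples around 1.5 than in the gap around 3.0.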

Error Bar Plot

Error bars are graphical representations of the error or uncertainty in data, and they assist correct interpretation. For scientific purposes, reporting of errors is crucial in understanding the given data.

Error bars are useful to problem solvers because they show the confidence or precision in a set of measurements or calculated values.

Error bars mostly represent the range and standard deviation of a dataset. They can help visualize how the data is spread around the mean value. Error bars can be generated on a variety of plots such as bar plots, line plots, scatter plots etc.

The go.Scatter() function has error_x and error_y properties that control how error bars are generated.

  1. visible (boolean) − Determines whether or not this set of error bars is visible.

  2. type − Possible values are "percent" | "constant" | "sqrt" | "data". It sets the rule used to generate the error bars. If "percent", the bar lengths correspond to a percentage of the underlying data; set this percentage in value. If "constant", the bar lengths are a constant value; set this constant in value. If "sqrt", the bar lengths correspond to the square root of the underlying data. If "data", the bar lengths are set with the data set array.

  3. symmetric − Can be True or False. Accordingly, the error bars will or will not have the same length in both directions (top/bottom for vertical bars, left/right for horizontal bars).

  4. array − Sets the data corresponding to the length of each error bar. Values are plotted relative to the underlying data.

  5. arrayminus − Sets the data corresponding to the length of each error bar in the bottom (left) direction for vertical (horizontal) bars. Values are plotted relative to the underlying data.

以下代码在散点图上显示对称错误线 −

Following code displays symmetric error bars on a scatter plot −

trace = go.Scatter(
   x = [0, 1, 2], y = [6, 10, 2],
   error_y = dict(
      type = 'data', # value of error bar given in data coordinates
      array = [1, 2, 3],
      visible = True
   )
)
data = [trace]
layout = go.Layout(title = 'Symmetric Error Bar')
fig = go.Figure(data = data, layout = layout)
iplot(fig)

Given below is the output of the above code.

error bar plot

An asymmetric error plot is rendered by the following script −

trace = go.Scatter(
   x = [1, 2, 3, 4],
   y =[ 2, 1, 3, 4],
   error_y = dict(
      type = 'data',
      symmetric = False,
      array = [0.1, 0.2, 0.1, 0.1],
      arrayminus = [0.2, 0.4, 1, 0.2]
   )
)
data = [trace]
layout = go.Layout(title = 'Asymmetric Error Bar')
fig = go.Figure(data = data, layout = layout)
iplot(fig)

The output is as given below −

asymmetric error bar
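The rules selected by the type property of error_x and error_y can be mimicked in a few lines. The helper below is hypothetical and only illustrates the arithmetic for non-negative data; it is not Plotly's implementation.

```python
import math


def error_bar_lengths(y, rule, value=None, array=None):
    """Return the bar length for each point under the given rule."""
    if rule == 'percent':
        return [v * value / 100.0 for v in y]   # value is a percentage of y
    if rule == 'constant':
        return [value for _ in y]               # same length for every point
    if rule == 'sqrt':
        return [math.sqrt(v) for v in y]        # square root of the data
    if rule == 'data':
        return list(array)                      # lengths supplied explicitly
    raise ValueError('unknown rule: ' + rule)


y = [4, 9, 16]
print(error_bar_lengths(y, 'percent', value=10))
print(error_bar_lengths(y, 'constant', value=1.5))
print(error_bar_lengths(y, 'sqrt'))
print(error_bar_lengths(y, 'data', array=[1, 2, 3]))
```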

Plotly - Heatmap

A heat map (or heatmap) is a graphical representation of data where the individual values contained in a matrix are represented as colors. The primary purpose of heat maps is to better visualize the volume of locations/events within a dataset and to direct viewers towards the areas of a data visualization that matter most.

Because of their reliance on color to communicate values, heat maps are perhaps most commonly used to display a more generalized view of numeric values. Heat maps are extremely versatile and efficient in drawing attention to trends, and for these reasons they have become increasingly popular within the analytics community.

Heat maps are largely self-explanatory. The darker the shade, the greater the quantity (the higher the value, the tighter the dispersion, etc.), although the exact mapping depends on the color scale chosen. Plotly's graph_objects module contains the Heatmap() function. It needs x, y and z attributes. Their values can be a list, numpy array or Pandas dataframe.

In the following example, we have a 2D list or array which defines the data (harvest by different farmers in tons/year) to color code. We also need two lists: the names of the farmers and the vegetables cultivated by them.

vegetables = [
   "cucumber",
   "tomato",
   "lettuce",
   "asparagus",
   "potato",
   "wheat",
   "barley"
]
farmers = [
   "Farmer Joe",
   "Upland Bros.",
   "Smith Gardening",
   "Agrifun",
   "Organiculture",
   "BioGoods Ltd.",
   "Cornylee Corp."
]
harvest = np.array(
   [
      [0.8, 2.4, 2.5, 3.9, 0.0, 4.0, 0.0],
      [2.4, 0.0, 4.0, 1.0, 2.7, 0.0, 0.0],
      [1.1, 2.4, 0.8, 4.3, 1.9, 4.4, 0.0],
      [0.6, 0.0, 0.3, 0.0, 3.1, 0.0, 0.0],
      [0.7, 1.7, 0.6, 2.6, 2.2, 6.2, 0.0],
      [1.3, 1.2, 0.0, 0.0, 0.0, 3.2, 5.1],
      [0.1, 2.0, 0.0, 1.4, 0.0, 1.9, 6.3]
   ]
)
trace = go.Heatmap(
   x = vegetables,
   y = farmers,
   z = harvest,
   type = 'heatmap',
   colorscale = 'Viridis'
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The output of the above code is as follows −

heatmap

Plotly - Polar Chart and Radar Chart

In this chapter, we will learn how polar charts and radar charts can be made with the help of Plotly.

First of all, let us study the polar chart.

Polar Chart

A polar chart is a common variation of circular graphs. It is useful when relationships between data points can be visualized most easily in terms of radii and angles.

In polar charts, a series is represented by a closed curve that connects points in the polar coordinate system. Each data point is determined by the distance from the pole (the radial coordinate) and the angle from the fixed direction (the angular coordinate).

A polar chart represents data along radial and angular axes. The radial and angular coordinates are given with the r and theta arguments of the go.Scatterpolar() function. The theta data can be categorical, but numerical data is also possible and is most commonly used.

The following code produces a basic polar chart. In addition to the r and theta arguments, we set mode to lines (it can also be set to markers, in which case only the data points are displayed).

trace = go.Scatterpolar(
   r = [0.5,1,2,2.5,3,4],
   theta = [35,70,120,155,205,240],
   mode = 'lines',
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The output is given below −

polar chart

In the following example, data from a comma-separated values (CSV) file is used to generate a polar chart. The first few rows of polar.csv are as follows −

y,x1,x2,x3,x4,x5,
0,1,1,1,1,1,
6,0.995,0.997,0.996,0.998,0.997,
12,0.978,0.989,0.984,0.993,0.986,
18,0.951,0.976,0.963,0.985,0.969,
24,0.914,0.957,0.935,0.974,0.946,
30,0.866,0.933,0.9,0.96,0.916,
36,0.809,0.905,0.857,0.943,0.88,
42,0.743,0.872,0.807,0.923,0.838,
48,0.669,0.835,0.752,0.901,0.792,
54,0.588,0.794,0.691,0.876,0.74,
60,0.5,0.75,0.625,0.85,0.685,

Enter the following script in the notebook's input cell to generate the polar chart as below −

import pandas as pd
df = pd.read_csv("polar.csv")
t1 = go.Scatterpolar(
   r = df['x1'], theta = df['y'], mode = 'lines', name = 't1'
)
t2 = go.Scatterpolar(
   r = df['x2'], theta = df['y'], mode = 'lines', name = 't2'
)
t3 = go.Scatterpolar(
   r = df['x3'], theta = df['y'], mode = 'lines', name = 't3'
)
data = [t1,t2,t3]
fig = go.Figure(data = data)
iplot(fig)

Given below is the output of the above code −

generate polar chart

Radar chart

A radar chart (also known as a spider plot or star plot) displays multivariate data in the form of a two-dimensional chart of quantitative variables represented on axes originating from the center. The relative position and angle of the axes are typically uninformative.

For a radar chart, in the general case, use a polar chart with categorical angular variables in the go.Scatterpolar() function.

The following code renders a basic radar chart with the Scatterpolar() function −

radar = go.Scatterpolar(
   r = [1, 5, 2, 2, 3],
   theta = [
      'processing cost',
      'mechanical properties',
      'chemical stability',
      'thermal stability',
      'device integration'
   ],
   fill = 'toself'
)
data = [radar]
fig = go.Figure(data = data)
iplot(fig)

The output below is the result of the above code −

radar chart
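When a radar chart is drawn as an outline only, the line from the last category back to the first may be missing. One common idiom, sketched here with a hypothetical close_loop() helper, is to repeat the first point at the end of both lists before passing them to go.Scatterpolar().

```python
def close_loop(r, theta):
    """Repeat the first point at the end so the outline forms a closed polygon."""
    return list(r) + [r[0]], list(theta) + [theta[0]]


r = [1, 5, 2, 2, 3]
theta = [
    'processing cost',
    'mechanical properties',
    'chemical stability',
    'thermal stability',
    'device integration',
]

closed_r, closed_theta = close_loop(r, theta)
print(closed_r)
print(closed_theta[-1])
```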

OHLC Chart, Waterfall Chart and Funnel Chart

本章重点介绍了其他三种类型的图表,包括 OHLC、瀑布图和漏斗图,这些图表可以在 Plotly 的帮助下制作。

This chapter focusses on other three types of charts including OHLC, Waterfall and Funnel Chart which can be made with the help of Plotly.

OHLC Chart

一种 open-high-low-close 图表(也称作 OHLC)是一种 bar chart ,通常用于说明诸如股票等金融工具价格的变动。OHLC 图表很有用,因为它们显示了一段期间内的四个主要数据点。这类图表之所以有用,是因为它可以显示上升或下降的动能。高点和低点数据对于评估波动很有用。

An open-high-low-close chart (also OHLC) is a type of bar chart typically used to illustrate movements in the price of a financial instrument such as shares. OHLC charts are useful since they show the four major data points over a period. The chart type is useful because it can show increasing or decreasing momentum. The high and low data points are useful in assessing volatility.

图表上的每条垂直线均显示一个时间单位(如一天或一小时)内的价格范围(最高价和最低价)。标号从每条线的两侧突出显示,左侧表示开盘价(例如,对于日线图,这将是该天的开盘价),右侧表示该时间段的收盘价。

Each vertical line on the chart shows the price range (the highest and lowest prices) over one unit of time, such as day or hour. Tick marks project from each side of the line indicating the opening price (e.g., for a daily bar chart this would be the starting price for that day) on the left, and the closing price for that time period on the right.

Sample data for demonstrating the OHLC chart is shown below. It consists of list objects holding the open, high, low and close values for the corresponding date strings. Each date string is converted to a date object with the strptime() function from the datetime module.

open_data = [33.0, 33.3, 33.5, 33.0, 34.1]
high_data = [33.1, 33.3, 33.6, 33.2, 34.8]
low_data = [32.7, 32.7, 32.8, 32.6, 32.8]
close_data = [33.0, 32.9, 33.3, 33.1, 33.1]
date_data = ['10-10-2013', '11-10-2013', '12-10-2013','01-10-2014','02-10-2014']
import datetime
dates = [
   datetime.datetime.strptime(date_str, '%m-%d-%Y').date()
   for date_str in date_data
]

We pass the dates object above as the x parameter, and the other lists as the open, high, low and close parameters of the go.Ohlc() function, which returns an OHLC trace.

trace = go.Ohlc(
   x = dates,
   open = open_data,
   high = high_data,
   low = low_data,
   close = close_data
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The output of the code is given below −

ohlc chart

Candlestick Chart

The candlestick chart is similar to the OHLC chart. It is like a combination of a line chart and a bar chart. The boxes represent the spread between the open and close values, and the lines represent the spread between the low and high values. Sample points where the close value is higher (lower) than the open value are called increasing (decreasing).
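
As a small illustration of the increasing/decreasing distinction, the sketch below classifies each period from the sample open and close lists used for the OHLC chart. The direction() helper is hypothetical, and the "unchanged" label for equal values is an assumption of this sketch rather than Plotly terminology.

```python
open_data = [33.0, 33.3, 33.5, 33.0, 34.1]
close_data = [33.0, 32.9, 33.3, 33.1, 33.1]

def direction(open_value, close_value):
    """Label one period by comparing its close with its open."""
    if close_value > open_value:
        return "increasing"
    if close_value < open_value:
        return "decreasing"
    return "unchanged"  # equal open and close; label chosen for this sketch

directions = [direction(o, c) for o, c in zip(open_data, close_data)]
print(directions)
```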

A candlestick trace is returned by the go.Candlestick() function. We use the same data as for the OHLC chart to render the candlestick chart, as given below −

trace = go.Candlestick(
   x = dates,
   open = open_data,
   high = high_data,
   low = low_data,
   close = close_data
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The output of the above code is shown below −

candlestick chart

Waterfall Chart

A waterfall chart (also known as a flying bricks chart or Mario chart) helps in understanding the cumulative effect of sequentially introduced positive or negative values, which can be either time based or category based.

Initial and final values are shown as columns, with the individual negative and positive adjustments depicted as floating steps. Some waterfall charts draw connecting lines between the columns to make the chart look like a bridge.

The go.Waterfall() function returns a Waterfall trace. This object can be customized with various named arguments or attributes. Here, the x and y attributes set the data for the x and y coordinates of the graph. Each can be a Python list, a numpy array or a Pandas series, holding numbers, strings or datetime objects.

Another attribute is measure, an array containing the type of each value. By default, values are treated as relative. Set an entry to 'total' to compute the running sum at that point. An entry of 'absolute' resets the computed total, or declares an initial value where needed. The 'base' attribute sets where the bar base is drawn (in position axis units).
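
The effect of the measure values can be sketched in plain Python. The waterfall_levels() helper below is a hypothetical illustration of the running-total logic described above, not Plotly's implementation, and it ignores the base attribute.

```python
def waterfall_levels(values, measures):
    """Return the level reached after each step of a waterfall."""
    running = 0
    levels = []
    for value, measure in zip(values, measures):
        if measure == "relative":
            running += value   # a floating step up or down
        elif measure == "absolute":
            running = value    # reset the running total to this value
        elif measure == "total":
            pass               # value is ignored; the bar shows the running total
        else:
            raise ValueError(f"unknown measure: {measure!r}")
        levels.append(running)
    return levels

values = [60, 80, 0, -40, -20, 0]
measures = ["relative", "relative", "total", "relative", "relative", "total"]
levels = waterfall_levels(values, measures)
print(levels)
```

The y value supplied for a 'total' entry does not matter in this sketch, which is why such entries are commonly given as 0.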

The following code renders a waterfall chart −

s1=[
   "Sales",
   "Consulting",
   "Net revenue",
   "Purchases",
   "Other expenses",
   "Profit before tax"
]
s2 = [60, 80, 0, -40, -20, 0]
trace = go.Waterfall(
   x = s1,
   y = s2,
   base = 200,
   measure = [
      "relative",
      "relative",
      "total",
      "relative",
      "relative",
      "total"
   ]
)
data = [trace]
fig = go.Figure(data = data)
iplot(fig)

The output of the above code is shown below.

waterfall chart

Funnel Chart

Funnel charts represent data at different stages of a business process. They are an important mechanism in Business Intelligence for identifying potential problem areas of a process. A funnel chart visualizes how data reduces progressively as it passes from one phase to the next. Data in each phase is represented as a portion of 100% (the whole).

Like the pie chart, the funnel chart does not use any axes. It can also be treated as similar to a stacked percent bar chart. Any funnel consists of an upper part called the head (or base) and a lower part referred to as the neck. The most common use of the funnel chart is visualizing sales conversion data.
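
To illustrate how each phase relates to the whole, the sketch below computes every stage as a percentage of the first stage and of the stage before it. The stage names and counts are illustrative sales-conversion values, and the variable names are hypothetical.

```python
stages = ["Website visit", "Downloads", "Potential customers",
          "Requested price", "Invoice sent"]
counts = [39, 27.4, 20.6, 11, 2]

# Share of the initial stage that survives to each stage
percent_of_first = [round(100 * c / counts[0], 1) for c in counts]

# Share of the previous stage that survives to each stage
percent_of_previous = [100.0] + [
    round(100 * cur / prev, 1) for prev, cur in zip(counts, counts[1:])
]

for stage, p_first, p_prev in zip(stages, percent_of_first, percent_of_previous):
    print(f"{stage}: {p_first}% of first, {p_prev}% of previous")
```

A sharp drop in the "of previous" column points at the stage where the process loses most of its volume.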

Plotly’s go.Funnel() function produces a Funnel trace. The essential attributes to provide are x and y. Each is assigned a Python list or an array.

from plotly import graph_objects as go
fig = go.Figure(
   go.Funnel(
      y = [
         "Website visit",
         "Downloads",
         "Potential customers",
         "Requested price",
         "invoice sent"
      ],
      x = [39, 27.4, 20.6, 11, 2]
   )
)
fig.show()

The output is shown below −

funnel chart

Plotly - 3D Scatter and Surface Plot

This chapter describes the three-dimensional (3D) scatter plot and the 3D surface plot, and how to make them with Plotly.

3D Scatter Plot

A three-dimensional (3D) scatter plot is like a scatter plot, but with three variables: x, y, and z (or f(x, y)), all real numbers. The graph can be represented as dots in a three-dimensional Cartesian coordinate system. It is typically drawn on a two-dimensional page or screen using a projection method (isometric or perspective), so that one of the dimensions appears to be coming out of the page.

3D scatter plots are used to plot data points on three axes in an attempt to show the relationship between three variables. Each row in the data table is represented by a marker whose position depends on its values in the columns assigned to the X, Y, and Z axes.

A fourth variable can be mapped to the color or size of the markers, adding yet another dimension to the plot. The relationship between different variables is called correlation.
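
One way to map a fourth variable onto marker size is to rescale it linearly into a pixel range. The sketch below does this in plain Python and stores the result in a plain-dict trace specification; the scale_to_sizes() helper, the size range and the data values are all hypothetical.

```python
def scale_to_sizes(values, smallest=4, largest=20):
    """Linearly rescale values into marker sizes between smallest and largest."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: use a single mid-range size
        return [(smallest + largest) / 2 for _ in values]
    span = largest - smallest
    return [smallest + span * (v - lo) / (hi - lo) for v in values]

fourth_variable = [0.0, 2.5, 5.0, 10.0]  # hypothetical measurements
sizes = scale_to_sizes(fourth_variable)

# Plain-dict description of a 3D scatter trace using the fourth variable
trace_spec = {
    "type": "scatter3d",
    "mode": "markers",
    "x": [1, 2, 3, 4],
    "y": [4, 3, 2, 1],
    "z": [0, 1, 0, 1],
    "marker": {"size": sizes, "color": fourth_variable, "colorscale": "Viridis"},
}
print(sizes)
```

A dict of this shape can be passed in the data list of go.Figure() in place of a trace object.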

A Scatter3d trace is a graph object returned by the go.Scatter3d() function. The mandatory arguments to this function are x, y and z, each of which is a list or array object.

For example −

import plotly.graph_objs as go
import numpy as np
z = np.linspace(0, 10, 50)
x = np.cos(z)
y = np.sin(z)
trace = go.Scatter3d(
   x = x, y = y, z = z,
   mode = 'markers',
   marker = dict(
      size = 12,
      color = z, # set color to an array/list of desired values
      colorscale = 'Viridis'
   )
)
layout = go.Layout(title = '3D Scatter plot')
fig = go.Figure(data = [trace], layout = layout)
iplot(fig)

The output of the code is given below −

3d scatter plot

3D Surface Plot

Surface plots are diagrams of three-dimensional data. In a surface plot, each point is defined by three values: its latitude, longitude, and altitude (X, Y and Z). Rather than showing the individual data points, surface plots show a functional relationship between a designated dependent variable (Z) and two independent variables (X and Y). This plot is a companion to the contour plot.

Here is a Python script that renders a simple surface plot, where the y array is the transpose of x and z is calculated as cos(x² + y²) −

import numpy as np
import plotly.graph_objs as go

x = np.outer(np.linspace(-2, 2, 30), np.ones(30))
y = x.copy().T # transpose
z = np.cos(x ** 2 + y ** 2)
trace = go.Surface(x = x, y = y, z = z)
data = [trace]
layout = go.Layout(title = '3D Surface plot')
fig = go.Figure(data = data, layout = layout) # pass the layout so the title is shown
iplot(fig)

The output of the above code is shown below −

3d surface plot
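
The grid construction behind a surface plot can be sketched without numpy. The snippet below builds a small 5 x 5 grid with list comprehensions, where x is constant along each row, y is the transpose of x, and z = cos(x² + y²); the grid size and variable names are chosen only for illustration.

```python
import math

n = 5
# Evenly spaced values from -2 to 2, similar to np.linspace(-2, 2, n)
ticks = [-2 + 4 * i / (n - 1) for i in range(n)]

# x varies by row, y varies by column (y is the transpose of x)
x = [[ticks[i] for _ in range(n)] for i in range(n)]
y = [[ticks[j] for j in range(n)] for _ in range(n)]

# Height of the surface at every grid point
z = [[math.cos(x[i][j] ** 2 + y[i][j] ** 2) for j in range(n)] for i in range(n)]

print(z[2][2])  # height at the grid centre, where x = y = 0
```

Because z depends only on x² + y², the resulting surface is symmetric about the diagonal of the grid.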

Plotly - Adding Buttons Dropdown

Plotly provides a high degree of interactivity through controls on the plotting area, such as buttons, dropdowns and sliders. These controls are configured through the updatemenus attribute of the plot layout. You add a button and its behaviour by specifying the method to be called.

Four methods can be associated with a button, as follows −

  1. restyle − modify data or data attributes

  2. relayout − modify layout attributes

  3. update − modify data and layout attributes

  4. animate − start or pause an animation
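
The four methods can be sketched as plain dictionaries, one button per method, collected into an updatemenus entry. The labels and argument values below are hypothetical; only the method names come from the list above.

```python
buttons = [
    # restyle: change a trace attribute (here, the trace type)
    {"label": "Violin", "method": "restyle", "args": ["type", "violin"]},
    # relayout: change a layout attribute (here, the title text)
    {"label": "Retitle", "method": "relayout", "args": ["title.text", "New title"]},
    # update: change trace attributes and layout attributes together
    {"label": "Only first trace", "method": "update",
     "args": [{"visible": [True, False]}, {"title": "First trace"}]},
    # animate: play the figure's frames
    {"label": "Play", "method": "animate", "args": [None]},
]

layout_spec = {
    "updatemenus": [
        {"type": "buttons", "direction": "left", "buttons": buttons}
    ]
}
print([button["method"] for button in buttons])
```

A layout dict of this shape can be applied with fig.update_layout(layout_spec) or passed as the layout argument of go.Figure().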

The restyle method should be used when modifying the data and data attributes of the graph. In the following example, an Updatemenu() object adds two buttons to the layout, both using the restyle method.

go.layout.Updatemenu(
   type = "buttons",
   direction = "left",
   buttons = list([
      dict(args = ["type", "box"], label = "Box", method = "restyle"),
      dict(args = ["type", "violin"], label = "Violin", method = "restyle")
   ])
)

The type property controls how the menu is rendered. The example sets it to buttons so that the buttons appear side by side. To render a dropdown list instead, change type to dropdown, which is also what Plotly uses when type is omitted. A Box trace is added to the Figure object before its layout is updated as above. The complete code, which renders a box plot or a violin plot depending on the button clicked, is as follows −

import plotly.graph_objs as go
fig = go.Figure()
fig.add_trace(go.Box(y = [1140,1460,489,594,502,508,370,200]))
fig.layout.update(
   updatemenus = [
      go.layout.Updatemenu(
         type = "buttons", direction = "left", buttons=list(
            [
               dict(args = ["type", "box"], label = "Box", method = "restyle"),
               dict(args = ["type", "violin"], label = "Violin", method = "restyle")
            ]
         ),
         pad = {"r": 2, "t": 2},
         showactive = True,
         x = 0.11,
         xanchor = "left",
         y = 1.1,
         yanchor = "top"
      ),
   ]
)
iplot(fig)

The output of the code is given below −

violin button

Click the Violin button to display the corresponding violin plot.

dropdown list button

As mentioned above, setting the type key of Updatemenu() to dropdown displays the buttons as a dropdown list. The plot appears as below −

update method

The update method should be used when modifying both the data and layout sections of the graph. The following example demonstrates how to update which traces are displayed while simultaneously updating layout attributes such as the chart title. Two Scatter traces, corresponding to a sine and a cos wave, are added to the Figure object. Traces whose visible attribute is True are displayed on the plot, and the other traces are hidden.

import numpy as np
import math #needed for definition of pi

xpoints = np.arange(0, math.pi*2, 0.05)
y1 = np.sin(xpoints)
y2 = np.cos(xpoints)
fig = go.Figure()
# Add Traces
fig.add_trace(
   go.Scatter(
      x = xpoints, y = y1, name = 'Sine'
   )
)
fig.add_trace(
   go.Scatter(
      x = xpoints, y = y2, name = 'cos'
   )
)
fig.layout.update(
   updatemenus = [
      go.layout.Updatemenu(
         type = "buttons", direction = "right", active = 0, x = 0.1, y = 1.2,
         buttons = list(
            [
               dict(
                  label = "first", method = "update",
                  args = [{"visible": [True, False]},{"title": "Sine"} ]
               ),
               dict(
                  label = "second", method = "update",
                  args = [{"visible": [False, True]},{"title": "Cos"}]
               )
            ]
         )
      )
   ]
)
iplot(fig)

Before any button is clicked, both traces are drawn. Clicking the first button shows only the Sine curve; clicking the second button shows only the cos trace.

Note that the chart title also updates accordingly.

sine curve

To use the animate method, we need to add one or more frames to the Figure object. Along with data and layout, frames can be added as a key in a figure object. The frames key points to a list of figures, each of which is cycled through when the animation is triggered.

You can add play and pause buttons that drive the animation by adding an updatemenus array to the layout.

"updatemenus": [{
   "type": "buttons", "buttons": [{
      "label": "Your Label", "method": "animate", "args": [frames]
   }]
}]
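
As a sketch of a play/pause pair, the dictionaries below follow the pattern commonly seen in Plotly's animation examples: Play passes None to run all frames, while Pause passes [None] together with immediate-mode options to interrupt the running animation. The labels and the exact option values are assumptions of this sketch.

```python
play_button = {
    "label": "Play",
    "method": "animate",
    "args": [None],  # None: animate through all frames
}

pause_button = {
    "label": "Pause",
    "method": "animate",
    "args": [
        [None],  # a list holding None: interrupt the animation in progress
        {
            "frame": {"duration": 0, "redraw": False},
            "mode": "immediate",
            "transition": {"duration": 0},
        },
    ],
}

updatemenus = [{"type": "buttons", "buttons": [play_button, pause_button]}]
print([button["label"] for button in updatemenus[0]["buttons"]])
```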

In the following example, a scatter curve trace is plotted first. Frames are then added as a list of 50 Frame objects, each representing a red marker at one position on the curve. Note that the args attribute of the button is set to [None], which causes all frames to be animated.

import numpy as np
t = np.linspace(-1, 1, 100)
x = t + t ** 2
y = t - t ** 2
xm = np.min(x) - 1.5
xM = np.max(x) + 1.5
ym = np.min(y) - 1.5
yM = np.max(y) + 1.5
N = 50
s = np.linspace(-1, 1, N)
#s = np.arange(0, math.pi*2, 0.1)
xx = s + s ** 2
yy = s - s ** 2
fig = go.Figure(
   data = [
      go.Scatter(x = x, y = y, mode = "lines", line = dict(width = 2, color = "blue")),
      go.Scatter(x = x, y = y, mode = "lines", line = dict(width = 2, color = "blue"))
   ],
   layout = go.Layout(
      xaxis=dict(range=[xm, xM], autorange=False, zeroline=False),
      yaxis=dict(range=[ym, yM], autorange=False, zeroline=False),
      title_text="Moving marker on curve",
      updatemenus=[
         dict(type="buttons", buttons=[dict(label="Play", method="animate", args=[None])])
      ]
   ),
   frames = [go.Frame(
      data = [
            go.Scatter(
            x = [xx[k]], y = [yy[k]], mode = "markers", marker = dict(
               color = "red", size = 10
            )
         )
      ]
   )
   for k in range(N)]
)
iplot(fig)

The output of the code is stated below −

play button

The red marker starts moving along the curve when the Play button is clicked.

Plotly - Slider Control

Plotly has a convenient Slider that can be used to change the data or style of a plot by sliding a knob on a control placed at the bottom of the rendered plot.

The Slider control is made up of the following properties −

  1. The steps property is required; it defines the positions the knob can slide to on the control.

  2. The method property takes one of restyle | relayout | animate | update | skip; the default is restyle.

  3. The args property sets the argument values passed to the Plotly method named in method when the knob slides to that step.

We now deploy a simple slider control on a scatter plot, which varies the frequency of a sine wave as the knob slides along the control. The slider is configured to have 50 steps. First add 50 sine wave traces with increasing frequency, all hidden except the trace at index 10.

Then, we configure each step with the restyle method. Each step makes its own trace visible and sets the visibility of all other traces to False. Finally, update the Figure object’s layout by setting its sliders property.

import numpy as np
import plotly.graph_objs as go

fig = go.Figure()

# Add traces, one for each slider step
for step in np.arange(0, 5, 0.1):
   fig.add_trace(
      go.Scatter(
         visible = False,
         line = dict(color = "blue", width = 2),
         name = "𝜈 = " + str(step),
         x = np.arange(0, 10, 0.01),
         y = np.sin(step * np.arange(0, 10, 0.01))
      )
   )
# Show the trace at index 10 initially
fig.data[10].visible = True

# Create and add slider
steps = []
for i in range(len(fig.data)):
   step = dict(
      method = "restyle",
      args = ["visible", [False] * len(fig.data)],
   )
   step["args"][1][i] = True # Toggle i'th trace to "visible"
   steps.append(step)
sliders = [dict(active = 10, steps = steps)]
fig.layout.update(sliders = sliders)
iplot(fig)

To begin with, the sine wave trace at index 10 is visible. Try sliding the knob across the horizontal control at the bottom. You will see the frequency changing as shown below.

sine wave trace
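
Slider steps can also carry a label, and the slider can show the current value with a prefix. The sketch below builds such a sliders entry as plain dictionaries for 50 frequencies; the label format and the prefix text are hypothetical choices.

```python
# 50 frequencies: 0.0, 0.1, ..., 4.9
frequencies = [round(0.1 * i, 1) for i in range(50)]

steps = []
for i, freq in enumerate(frequencies):
    visible = [False] * len(frequencies)
    visible[i] = True  # show only the trace that belongs to this step
    steps.append({
        "method": "restyle",
        "args": ["visible", visible],
        "label": str(freq),  # text shown under the slider tick
    })

sliders = [{
    "active": 10,
    "currentvalue": {"prefix": "Frequency: "},
    "steps": steps,
}]
print(steps[10]["label"])
```

A list of this shape can be assigned to the layout's sliders property in the same way as the steps built from dict() calls.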

Plotly - FigureWidget Class

Plotly 3.0.0 introduced a new Jupyter widget class: plotly.graph_objs.FigureWidget. It has the same call signature as the existing Figure class, and it is made specifically for Jupyter Notebook and JupyterLab environments.

The go.FigureWidget() function returns an empty FigureWidget object with default x and y axes.

f = go.FigureWidget()
iplot(f)

Given below is the output of the code −

figure widget graph

The most important feature of FigureWidget is that the resulting Plotly figure updates dynamically as we add data and layout attributes to it.

For example, add the following traces one by one and watch the originally empty figure update dynamically. This means we don’t have to call the iplot() function again and again, because the plot refreshes automatically. The final appearance of the FigureWidget is shown below −

f.add_scatter(y = [2, 1, 4, 3])
f.add_bar(y = [1, 4, 3, 2])
f.layout.title = 'Hello FigureWidget'

figure widget

This widget supports event listeners for hovering, clicking and selecting points, and for zooming into regions.

In the following example, the FigureWidget is programmed to respond to click events on the plot area. The widget contains a simple scatter plot with markers. A clicked point is marked with a different color and size.

import numpy as np
import plotly.graph_objs as go

x = np.random.rand(100)
y = np.random.rand(100)
f = go.FigureWidget([go.Scatter(x=x, y=y, mode='markers')])

scatter = f.data[0]
colors = ['#a3a7e4'] * 100

scatter.marker.color = colors
scatter.marker.size = [10] * 100
f.layout.hovermode = 'closest'

# Callback: recolor and enlarge the clicked points
def update_point(trace, points, selector):
   c = list(scatter.marker.color)
   s = list(scatter.marker.size)
   for i in points.point_inds:
      c[i] = 'red'
      s[i] = 20
   scatter.marker.color = c
   scatter.marker.size = s

scatter.on_click(update_point)
f

Run the above code in a Jupyter notebook. A scatter plot is displayed. Click on a point in the plot area and it will be marked in red.

location
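
The core of such a click callback is a pure list update, which can be checked outside a notebook. The highlight() helper below is a hypothetical stand-in for the body of the callback: it takes the current colors and sizes and the indices of the clicked points, and returns updated copies.

```python
def highlight(colors, sizes, point_inds, color="red", size=20):
    """Return new color and size lists with the clicked points highlighted."""
    new_colors, new_sizes = list(colors), list(sizes)
    for i in point_inds:
        new_colors[i] = color
        new_sizes[i] = size
    return new_colors, new_sizes

colors = ["#a3a7e4"] * 5
sizes = [10] * 5

# Simulate a click that hits the points at index 1 and 3
new_colors, new_sizes = highlight(colors, sizes, point_inds=[1, 3])
print(new_colors)
print(new_sizes)
```

Working on copies and assigning the whole list back matches how the callback reassigns scatter.marker.color and scatter.marker.size.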

Plotly’s FigureWidget object can also make use of IPython’s own widgets. Here, we use the interact control defined in the ipywidgets module. We first construct a FigureWidget and add an empty scatter plot.

from ipywidgets import interact
fig = go.FigureWidget()
fig.add_scatter()
scatt = fig.data[0] # the trace just added; add_scatter() may return the figure rather than the trace
fig

We now define an update function that takes the frequency and phase as inputs and sets the x and y properties of the scatter trace defined above. The function is decorated with the @interact decorator from the ipywidgets package, which creates a simple set of widgets to control the parameters of the plot. The decorator arguments specify the ranges of the parameters that we want to sweep over.

xs = np.linspace(0, 6, 100)

@interact(a = (1.0, 4.0, 0.01), b = (0, 10.0, 0.01), color = ['red', 'green', 'blue'])
def update(a = 3.6, b = 4.3, color = 'blue'):
   with fig.batch_update():
      scatt.x = xs
      scatt.y = np.sin(a*xs-b)
      scatt.line.color = color

The empty FigureWidget is now populated with a blue sine curve, with a and b set to 3.6 and 4.3 respectively. Below the current notebook cell, you will get a group of sliders for selecting the values of a and b. There is also a dropdown for selecting the trace color. These parameters are defined in the @interact decorator.

interact decorator

Pandas is a very popular Python library for data analysis. It also has its own plotting support. However, Pandas plots are not interactive. Thankfully, plotly’s interactive and dynamic plots can be built from Pandas dataframe objects.

We start by building a dataframe from simple list objects.

import pandas as pd
data = [['Ravi',21,67],['Kiran',24,61],['Anita',18,46],['Smita',20,78],['Sunil',17,90]]
df = pd.DataFrame(data, columns = ['name','age','marks'])

The dataframe columns are used as data values for the x and y properties of graph object traces. Here, we generate a bar trace using the name and marks columns.

trace = go.Bar(x = df.name, y = df.marks)
fig = go.Figure(data = [trace])
iplot(fig)

A simple bar plot is displayed in the Jupyter notebook, as below −

pandas dataframes

Plotly is built on top of d3.js and is specifically a charting library. It can be used directly with Pandas dataframes through another library named Cufflinks.

If it is not already available, install the cufflinks package with your favourite package manager, such as pip or conda, as given below −

pip install cufflinks
or
conda install -c conda-forge cufflinks-py

First, import cufflinks along with other libraries such as Pandas and numpy, then configure it for offline use.

import cufflinks as cf
cf.go_offline()

Now you can use a Pandas dataframe directly to display various kinds of plots, without the trace and figure objects from the graph_objs module that we have been using so far.

df.iplot(kind = 'bar', x = 'name', y = 'marks')

A bar plot very similar to the earlier one is displayed, as given below −

pandas dataframe cufflinks

Pandas dataframes from databases

Instead of building a dataframe from Python lists, you can populate it from other data sources. For example, data from a CSV file, an SQLite database table or a MySQL database table can be fetched into a Pandas dataframe, which is then plotted with plotly using either a Figure object or the Cufflinks interface.

To fetch data from a CSV file, we can use the read_csv() function from the Pandas library.

import pandas as pd
df = pd.read_csv('sample-data.csv')
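
To show what ends up in the columns, the sketch below parses CSV text with Python's standard csv module and builds a plain-dict bar trace from two columns. The in-memory text stands in for a file, and its contents are illustrative.

```python
import csv
import io

# In-memory stand-in for a CSV file with a header row
csv_text = "name,age,marks\nRavi,21,67\nKiran,24,61\nAnita,18,46\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

names = [row["name"] for row in rows]
marks = [float(row["marks"]) for row in rows]  # csv yields strings, so convert

# Plain-dict description of a bar trace built from the two columns
bar_spec = {"type": "bar", "x": names, "y": marks}
print(bar_spec)
```

With Pandas, read_csv() performs the parsing and type conversion in one step, and the columns are then available as df.name and df.marks.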

If the data is available in an SQLite database table, it can be retrieved using the SQLAlchemy library as follows −

import pandas as pd
from sqlalchemy import create_engine
disk_engine = create_engine('sqlite:///mydb.db')
# 'students' is a placeholder table name; substitute the table in your database
df = pd.read_sql_query('SELECT name,age,marks FROM students', disk_engine)
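
The same kind of query can be tried with only the standard library. The sketch below creates an in-memory SQLite database, fills a hypothetical students table with illustrative rows, and splits the query result into columns ready for plotting.

```python
import sqlite3

# In-memory database with a hypothetical 'students' table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ravi", 21, 67), ("Kiran", 24, 61), ("Anita", 18, 46)],
)

rows = conn.execute(
    "SELECT name, age, marks FROM students ORDER BY name"
).fetchall()
conn.close()

# Transpose the row tuples into one list per column
names, ages, marks = (list(column) for column in zip(*rows))
print(names, marks)
```

The names and marks lists can then serve as the x and y values of a bar trace.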

Data from a MySQL database, on the other hand, is retrieved into a Pandas dataframe as follows −

import pymysql
import pandas as pd
conn = pymysql.connect(host = "localhost", user = "root", passwd = "xxxx", db = "mydb")
cursor = conn.cursor()
cursor.execute('select name,age,marks from students') # 'students' is a placeholder table name
rows = cursor.fetchall()
df = pd.DataFrame([list(row) for row in rows])
df.rename(columns = {0: 'name', 1: 'age', 2: 'marks'}, inplace = True)

Plotly with Matplotlib and Chart Studio

This chapter deals with the data visualization library Matplotlib and the online plot maker Chart Studio.

Matplotlib

Matplotlib is a popular Python data visualization library capable of producing production-ready but static plots. You can convert static matplotlib figures into interactive plots with the mpl_to_plotly() function in the plotly.tools module.

The following script produces a sine wave line plot using Matplotlib’s PyPlot API.

from matplotlib import pyplot as plt
import numpy as np
import math
#needed for definition of pi
x = np.arange(0, math.pi*2, 0.05)
y = np.sin(x)
plt.plot(x,y)
plt.xlabel("angle")
plt.ylabel("sine")
plt.title('sine wave')
plt.show()

Now we convert it into a plotly figure as follows −

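import plotly.tools as tls
from plotly.offline import iplot

# Grab the current figure; in a script, do this before plt.show(), which may discard it
fig = plt.gcf()
plotly_fig = tls.mpl_to_plotly(fig)
iplot(plotly_fig)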

The output of the code is as given below −

matplotlib

Chart Studio

Chart Studio is an online plot maker provided by Plotly. It offers a graphical user interface for importing data into a grid, analyzing it and applying stats tools. Graphs can be embedded or downloaded. It is mainly used to create graphs faster and more efficiently.

After logging in to your plotly account, start the Chart Studio app by visiting https://plot.ly/create. The web page offers a blank worksheet below the plot area. Chart Studio lets you add plot traces by pushing the + trace button.

chart studio

Various plot structure elements such as annotations and style, as well as facilities to save, export and share plots, are available in the menu.

Let us add data to the worksheet and choose a bar plot trace from the trace types.

choose bar

Click in the type text box and select bar plot.

select bar

Then, provide data columns for the x and y axes and enter a plot title.

data columns