Plotly Concise Tutorial
Plotly with Pandas and Cufflinks
Pandas is a very popular Python library for data analysis. It has its own plotting support, but Pandas plots are not interactive. Fortunately, plotly's interactive, dynamic plots can be built from Pandas DataFrame objects.
We start by building a DataFrame from simple list objects.
import pandas as pd
data = [['Ravi',21,67],['Kiran',24,61],['Anita',18,46],['Smita',20,78],['Sunil',17,90]]
df = pd.DataFrame(data,columns = ['name','age','marks'])
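The name column holds strings, so a single float dtype cannot describe every column, and depending on the pandas version a request to apply one may raise an error. A minimal sketch, assuming you want age and marks stored as floats, casts only those two columns with astype:

```python
import pandas as pd

data = [['Ravi', 21, 67], ['Kiran', 24, 61], ['Anita', 18, 46],
        ['Smita', 20, 78], ['Sunil', 17, 90]]
df = pd.DataFrame(data, columns=['name', 'age', 'marks'])

# Cast only the numeric columns; 'name' stays as text.
df = df.astype({'age': float, 'marks': float})

# Inspect the resulting column types.
print(df.dtypes)
```

Whether the cast is needed at all is a judgment call: integer columns plot just as well as float ones.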
The DataFrame columns supply the data values for the x and y properties of graph object traces. Here, we generate a bar trace using the name and marks columns.
import plotly.graph_objs as go
from plotly.offline import iplot
trace = go.Bar(x = df.name, y = df.marks)
fig = go.Figure(data = [trace])
iplot(fig)
A simple bar plot is displayed in the Jupyter notebook.
Plotly is a charting library built on top of d3.js. Another library, Cufflinks, lets you use it directly with Pandas DataFrames.
If it is not already installed, install the cufflinks package with your preferred package manager, such as pip:
pip install cufflinks
or
conda install -c conda-forge cufflinks-py
First, import cufflinks, along with other libraries such as Pandas and numpy as needed, and configure it for offline use.
import cufflinks as cf
cf.go_offline()
Now you can display various kinds of plots directly from a Pandas DataFrame, without the trace and figure objects from the graph_objs module that we have used so far.
df.iplot(kind = 'bar', x = 'name', y = 'marks')
A bar plot very similar to the earlier one is displayed.
Pandas dataframes from databases
Instead of constructing a DataFrame from Python lists, you can populate it from various data sources. For example, data from a CSV file, a SQLite database table, or a MySQL database table can be fetched into a Pandas DataFrame and then plotted with plotly using either a Figure object or the Cufflinks interface.
To fetch data from a CSV file, use the read_csv() function from the Pandas library.
import pandas as pd
df = pd.read_csv('sample-data.csv')
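The contents of sample-data.csv are not shown here. As a rough sketch, the following uses an in-memory string as a stand-in for the file, with assumed column names and rows, and picks out the columns that would feed a bar trace.

```python
import io
import pandas as pd

# Stand-in for 'sample-data.csv'; these columns and rows are assumed
# for illustration only.
csv_text = """name,age,marks
Ravi,21,67
Kiran,24,61
Anita,18,46
"""

# read_csv accepts a file-like object as well as a path.
df = pd.read_csv(io.StringIO(csv_text))

# Columns that would supply the x and y properties of a bar trace.
x_values = df['name'].tolist()
y_values = df['marks'].tolist()
print(x_values, y_values)
```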
If the data is in a SQLite database table, it can be retrieved using the SQLAlchemy library as follows:
import pandas as pd
from sqlalchemy import create_engine
disk_engine = create_engine('sqlite:///mydb.db')
# 'students' is an assumed table name; use the table that holds your data.
df = pd.read_sql_query('SELECT name,age,marks FROM students', disk_engine)
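A SELECT query needs a FROM clause that names the table. To try the round trip without an existing database file, here is a minimal sketch that uses Python's built-in sqlite3 module in place of SQLAlchemy. The students table and its rows are hypothetical.

```python
import sqlite3
import pandas as pd

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(':memory:')

# Hypothetical table and rows, created only for this illustration.
conn.execute('CREATE TABLE students (name TEXT, age INTEGER, marks INTEGER)')
conn.executemany(
    'INSERT INTO students VALUES (?, ?, ?)',
    [('Ravi', 21, 67), ('Kiran', 24, 61), ('Anita', 18, 46)],
)
conn.commit()

# The query names its table in the FROM clause.
df = pd.read_sql_query('SELECT name, age, marks FROM students', conn)
conn.close()
print(df)
```

read_sql_query accepts a sqlite3 connection directly. For other databases, a SQLAlchemy engine is the usual route.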
Data from a MySQL database can be fetched into a Pandas DataFrame as follows:
import pymysql
import pandas as pd
conn = pymysql.connect(host = "localhost", user = "root", password = "xxxx", database = "mydb")
cursor = conn.cursor()
# 'students' is an assumed table name; use the table that holds your data.
cursor.execute('SELECT name,age,marks FROM students')
rows = cursor.fetchall()
# Build the DataFrame from the fetched rows and name the columns in one step.
df = pd.DataFrame(list(rows), columns = ['name','age','marks'])
conn.close()
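Python DB-API cursors expose the result column names through cursor.description, so the names need not be typed by hand. A minimal sketch follows, using the built-in sqlite3 module as a stand-in for a MySQL connection so that it runs without a server. The students table and its rows are hypothetical.

```python
import sqlite3
import pandas as pd

# sqlite3 stands in for pymysql here; both follow the DB-API cursor interface.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE students (name TEXT, age INTEGER, marks INTEGER)')
conn.executemany('INSERT INTO students VALUES (?, ?, ?)',
                 [('Smita', 20, 78), ('Sunil', 17, 90)])

cursor = conn.cursor()
cursor.execute('SELECT name, age, marks FROM students')
rows = cursor.fetchall()

# cursor.description has one entry per result column; the name comes first.
columns = [entry[0] for entry in cursor.description]
df = pd.DataFrame(rows, columns=columns)
conn.close()
print(df)
```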