Table Visualization
This section demonstrates visualization of tabular data using the Styler class. For information on visualization with charting, please see Chart Visualization. This document is written as a Jupyter Notebook, and can be viewed or downloaded here.
Styler Object and Customising the Display
Styling and output display customisation should be performed after the data in a DataFrame has been processed. The Styler is not dynamically updated if further changes to the DataFrame are made. The DataFrame.style attribute is a property that returns a Styler object. It has a _repr_html_ method defined on it, so it is rendered automatically in Jupyter Notebook.
The Styler, which can be used for large data but is primarily designed for small data, currently has the ability to output to these formats:
- HTML
- LaTeX
- String (and CSV by extension)
- Excel
- (JSON is not currently available)
The first three of these have display customisation methods designed to format and customise the output. These include:
- Formatting values, the index and columns headers, using .format() and .format_index(),
- Renaming the index or column header labels, using .relabel_index()
- Hiding certain columns, the index and/or column headers, or index names, using .hide()
- Concatenating similar DataFrames, using .concat()
Formatting the Display
Formatting Values
The Styler distinguishes the display value from the actual value, in both data values and index or columns headers. To control the display value, the text is printed in each cell as a string, and we can use the .format() and .format_index() methods to manipulate this according to a format spec string or a callable that takes a single value and returns a string. It is possible to define this for the whole table, or index, or for individual columns, or MultiIndex levels. We can also overwrite index names.
Additionally, the format function has a precision argument to specifically help format floats, decimal and thousands separators to support other locales, an na_rep argument to display missing data, and escape and hyperlinks arguments to help display safe-HTML or safe-LaTeX. The default formatter is configured to adopt pandas' global options, such as the styler.format.precision option, which can be controlled using with pd.option_context('styler.format.precision', 2):
[2]:
import pandas as pd
import numpy as np
import matplotlib as mpl
df = pd.DataFrame({
"strings": ["Adam", "Mike"],
"ints": [1, 3],
"floats": [1.123, 1000.23]
})
df.style \
.format(precision=3, thousands=".", decimal=",") \
.format_index(str.upper, axis=1) \
.relabel_index(["row 1", "row 2"], axis=0)
[2]:
Using Styler to manipulate the display is a useful feature because maintaining the indexing and data values for other purposes gives greater control. You do not have to overwrite your DataFrame to display it how you like. Here is a more comprehensive example of using the formatting functions whilst still relying on the underlying data for indexing and calculations.
[3]:
weather_df = pd.DataFrame(np.random.rand(10,2)*5,
                          index=pd.date_range(start="2021-01-01", periods=10),
                          columns=["Tokyo", "Beijing"])

def rain_condition(v):
    if v < 1.75:
        return "Dry"
    elif v < 2.75:
        return "Rain"
    return "Heavy Rain"

def make_pretty(styler):
    styler.set_caption("Weather Conditions")
    styler.format(rain_condition)
    styler.format_index(lambda v: v.strftime("%A"))
    styler.background_gradient(axis=None, vmin=1, vmax=5, cmap="YlGnBu")
    return styler

weather_df
[3]:
[4]:
weather_df.loc["2021-01-04":"2021-01-08"].style.pipe(make_pretty)
[4]:
Hiding Data
The index and column headers can be completely hidden, as well as any rows or columns that one wishes to exclude. Both of these options are performed using the same method.
The index can be hidden from rendering by calling .hide() without any arguments, which might be useful if your index is integer based. Similarly, column headers can be hidden by calling .hide(axis="columns") without any further arguments.
Specific rows or columns can be hidden from rendering by calling the same .hide() method and passing in a row/column label, a list-like or a slice of row/column labels for the subset argument.
Hiding does not change the integer arrangement of CSS classes, e.g. hiding the first two columns of a DataFrame means the column class indexing will still start at col2, since col0 and col1 are simply ignored.
[5]:
df = pd.DataFrame(np.random.randn(5, 5))
df.style \
.hide(subset=[0, 2, 4], axis=0) \
.hide(subset=[0, 2, 4], axis=1)
[5]:
To invert the function into a "show" functionality, it is best practice to compose a list of the items to hide.
[6]:
show = [0, 2, 4]
df.style \
.hide([row for row in df.index if row not in show], axis=0) \
.hide([col for col in df.columns if col not in show], axis=1)
[6]:
Concatenating DataFrame Outputs
Two or more Stylers can be concatenated together provided they share the same columns. This is very useful for showing summary statistics for a DataFrame, and is often used in combination with DataFrame.agg.
Since the objects concatenated are Stylers, they can be styled independently, as shown below, and their concatenation preserves those styles.
[7]:
summary_styler = df.agg(["sum", "mean"]).style \
.format(precision=3) \
.relabel_index(["Sum", "Average"])
df.style.format(precision=1).concat(summary_styler)
[7]:
Styler Object and HTML
The Styler was originally constructed to support the wide array of HTML formatting options. Its HTML output creates an HTML <table> and leverages the CSS styling language to manipulate many parameters including colors, fonts, borders, background, etc. See here for more information on styling HTML tables. This allows a lot of flexibility out of the box, and even enables web developers to integrate DataFrames into their existing user interface designs.
Below we demonstrate the default output, which looks very similar to the standard DataFrame HTML representation. But the HTML here has already attached some CSS classes to each cell, even if we haven't yet created any styles. We can view these by calling the .to_html() method, which returns the raw HTML as a string, which is useful for further processing or adding to a file; read on in More about CSS and HTML. This section will also provide a walkthrough of how to convert this default output into a DataFrame output that is more communicative. For example, how we can build s:
[8]:
df = pd.DataFrame([[38.0, 2.0, 18.0, 22.0, 21, np.nan],[19, 439, 6, 452, 226,232]],
index=pd.Index(['Tumour (Positive)', 'Non-Tumour (Negative)'], name='Actual Label:'),
columns=pd.MultiIndex.from_product([['Decision Tree', 'Regression', 'Random'],['Tumour', 'Non-Tumour']], names=['Model:', 'Predicted:']))
df.style
[8]:
[10]:
s
[10]:
The first step we have taken is to create the Styler object from the DataFrame and then select the range of interest by hiding unwanted columns with .hide().
[11]:
s = df.style.format('{:.0f}').hide([('Random', 'Tumour'), ('Random', 'Non-Tumour')], axis="columns")
s
[11]:
Methods to Add Styles
There are 3 primary methods of adding custom CSS styles to Styler:
- Using .set_table_styles() to control broader areas of the table with specified internal CSS. Although table styles allow the flexibility to add CSS selectors and properties controlling all individual parts of the table, they are unwieldy for individual cell specifications. Also, note that table styles cannot be exported to Excel.
- Using .set_td_classes() to directly link either external CSS classes to your data cells or link the internal CSS classes created by .set_table_styles(). See here. These cannot be used on column header rows or indexes, and also won't export to Excel.
- Using the .apply() and .map() functions to add direct internal CSS to specific data cells. See here. As of v1.4.0 there are also methods that work directly on column header rows or indexes: .apply_index() and .map_index(). Note that only these methods add styles that will export to Excel. These methods work in a similar way to DataFrame.apply() and DataFrame.map().
Table Styles
Table styles are flexible enough to control all individual parts of the table, including column headers and indexes. However, they can be unwieldy to type for individual data cells or for any kind of conditional formatting, so we recommend that table styles are used for broad styling, such as entire rows or columns at a time.
Table styles are also used to control features which can apply to the whole table at once such as creating a generic hover functionality. The :hover pseudo-selector, as well as other pseudo-selectors, can only be used this way.
To replicate the normal format of CSS selectors and properties (attribute value pairs), e.g.
tr:hover {
  background-color: #ffff99;
}
the necessary format to pass styles to .set_table_styles() is a list of dicts, each with a CSS-selector tag and CSS-properties. Properties can either be a list of 2-tuples, or a regular CSS-string, for example:
[13]:
cell_hover = {  # for row hover use <tr> instead of <td>
    'selector': 'td:hover',
    'props': [('background-color', '#ffffb3')]
}
index_names = {
    'selector': '.index_name',
    'props': 'font-style: italic; color: darkgrey; font-weight:normal;'
}
headers = {
    'selector': 'th:not(.index_name)',
    'props': 'background-color: #000066; color: white;'
}
s.set_table_styles([cell_hover, index_names, headers])
[13]:
Next we add a couple more styling artifacts targeting specific parts of the table. Be careful here: since we are chaining methods, we need to explicitly instruct the method not to overwrite the existing styles.
[15]:
s.set_table_styles([
{'selector': 'th.col_heading', 'props': 'text-align: center;'},
{'selector': 'th.col_heading.level0', 'props': 'font-size: 1.5em;'},
{'selector': 'td', 'props': 'text-align: center; font-weight: bold;'},
], overwrite=False)
[15]:
As a convenience (since version 1.2.0) we can also pass a dict to .set_table_styles() which contains row or column keys. Behind the scenes Styler just indexes the keys and adds the relevant .col<m> or .row<n> classes as necessary to the given CSS selectors.
[17]:
s.set_table_styles({
('Regression', 'Tumour'): [{'selector': 'th', 'props': 'border-left: 1px solid white'},
{'selector': 'td', 'props': 'border-left: 1px solid #000066'}]
}, overwrite=False, axis=0)
[17]:
Setting Classes and Linking to External CSS
If you have designed a website then it is likely you will already have an external CSS file that controls the styling of table and cell objects within it. You may want to use these native files rather than duplicate all the CSS in Python (and duplicate any maintenance work).
Table Attributes
It is very easy to add a class to the main <table> using .set_table_attributes(). This method can also attach inline styles; read more in CSS Hierarchies.
[19]:
out = s.set_table_attributes('class="my-table-cls"').to_html()
print(out[out.find('<table'):][:109])
<table id="T_xyz01" class="my-table-cls">
<thead>
<tr>
<th class="index_name level0" >Model:</th>
Data Cell CSS Classes
New in version 1.2.0
The .set_td_classes() method accepts a DataFrame with matching indices and columns to the underlying Styler's DataFrame. That DataFrame will contain strings as CSS classes to add to individual data cells: the <td> elements of the <table>. Rather than use external CSS, we will create our classes internally and add them to the table styles. We will save adding the borders until the section on tooltips.
[20]:
s.set_table_styles([ # create internal CSS classes
{'selector': '.true', 'props': 'background-color: #e6ffe6;'},
{'selector': '.false', 'props': 'background-color: #ffe6e6;'},
], overwrite=False)
cell_color = pd.DataFrame([['true ', 'false ', 'true ', 'false '],
['false ', 'true ', 'false ', 'true ']],
index=df.index,
columns=df.columns[:4])
s.set_td_classes(cell_color)
[20]:
Styler Functions
Acting on Data
We use the following methods to pass your style functions. Both methods take a function (and some other keyword arguments) and apply it to the DataFrame in a certain way, rendering CSS styles.
- .map() (elementwise): accepts a function that takes a single value and returns a string with the CSS attribute-value pair.
- .apply() (column-/row-/table-wise): accepts a function that takes a Series or DataFrame and returns a Series, DataFrame, or numpy array with an identical shape where each element is a string with a CSS attribute-value pair. This method passes each column or row of your DataFrame one-at-a-time or the entire table at once, depending on the axis keyword argument. For columnwise use axis=0, rowwise use axis=1, and for the entire table at once use axis=None.
This method is powerful for applying multiple, complex logic to data cells. We create a new DataFrame to demonstrate this.
[22]:
np.random.seed(0)
df2 = pd.DataFrame(np.random.randn(10,4), columns=['A','B','C','D'])
df2.style
[22]:
For example, we can build a function that colors text if it is negative, and chain this with a function that partially fades cells of negligible value. Since this looks at each element in turn, we use map.
[23]:
def style_negative(v, props=''):
    return props if v < 0 else None

s2 = df2.style.map(style_negative, props='color:red;')\
              .map(lambda v: 'opacity: 20%;' if (v < 0.3) and (v > -0.3) else None)
s2
[23]:
We can also build a function that highlights the maximum value across rows, columns, and the DataFrame all at once. In this case we use apply. Below we highlight the maximum in a column.
[25]:
def highlight_max(s, props=''):
    return np.where(s == np.nanmax(s.values), props, '')

s2.apply(highlight_max, props='color:white;background-color:darkblue', axis=0)
[25]:
We can use the same function across the different axes, highlighting here the DataFrame maximum in purple, and row maximums in pink.
[27]:
s2.apply(highlight_max, props='color:white;background-color:pink;', axis=1)\
.apply(highlight_max, props='color:white;background-color:purple', axis=None)
[27]:
This last example shows how some styles have been overwritten by others. In general the most recent style applied is active, but you can read more in the section on CSS hierarchies. You can also apply these styles to more granular parts of the DataFrame; read more in the section on subset slicing.
It is possible to replicate some of this functionality using just classes, but it can be more cumbersome. See item 3) of Optimization.
Debugging Tip: If you're having trouble writing your style function, try just passing it into DataFrame.apply. Internally, Styler.apply uses DataFrame.apply so the result should be the same, and with DataFrame.apply you will be able to inspect the CSS string output of your intended function in each cell.
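The debugging tip can be tried on a small, deterministic frame. This sketch reuses the shape of the highlight_max function from above on hypothetical data and calls DataFrame.apply directly, so the CSS strings land in an ordinary DataFrame that can be printed or inspected:

```python
import numpy as np
import pandas as pd

def highlight_max(s, props=""):
    # Return the CSS string where the value equals the column maximum, else "".
    return np.where(s == np.nanmax(s.values), props, "")

# Hypothetical data: max of A is in row 1, max of B is in row 0.
df = pd.DataFrame({"A": [1, 3], "B": [4, 2]})

# Same function, same keyword arguments, but via DataFrame.apply instead of Styler.apply.
css = df.apply(highlight_max, props="color:red;", axis=0)
```

Printing css shows exactly which cells would receive which style before any HTML is involved.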
Acting on the Index and Column Headers
Similar application is achieved for headers by using:
- .map_index() (elementwise): accepts a function that takes a single value and returns a string with the CSS attribute-value pair.
- .apply_index() (level-wise): accepts a function that takes a Series and returns a Series, or numpy array with an identical shape where each element is a string with a CSS attribute-value pair. This method passes each level of your Index one-at-a-time. To style the index use axis=0 and to style the column headers use axis=1.
You can select a level of a MultiIndex, but currently no similar subset application is available for these methods.
[29]:
s2.map_index(lambda v: "color:pink;" if v>4 else "color:darkblue;", axis=0)
s2.apply_index(lambda s: np.where(s.isin(["A", "B"]), "color:pink;", "color:darkblue;"), axis=1)
[29]:
Tooltips and Captions
Table captions can be added with the .set_caption() method. You can use table styles to control the CSS relevant to the caption.
[30]:
s.set_caption("Confusion matrix for multiple cancer prediction models.")\
.set_table_styles([{
'selector': 'caption',
'props': 'caption-side: bottom; font-size:1.25em;'
}], overwrite=False)
[30]:
Adding tooltips (since version 1.3.0) can be done using the .set_tooltips() method, in the same way you can add CSS classes to data cells, by providing a string-based DataFrame with intersecting indices and columns. You don't have to specify a css_class name or any CSS props for the tooltips, since there are standard defaults, but the option is there if you want more visual control.
[32]:
tt = pd.DataFrame([['This model has a very strong true positive rate',
"This model's total number of false negatives is too high"]],
index=['Tumour (Positive)'], columns=df.columns[[0,3]])
s.set_tooltips(tt, props='visibility: hidden; position: absolute; z-index: 1; border: 1px solid #000066;'
'background-color: white; color: #000066; font-size: 0.8em;'
'transform: translate(0px, -24px); padding: 0.6em; border-radius: 0.5em;')
[32]:
The only thing left to do for our table is to add the highlighting borders to draw the audience's attention to the tooltips. We will create internal CSS classes as before using table styles. Setting classes always overwrites, so we need to make sure we add the previous classes.
[34]:
s.set_table_styles([ # create internal CSS classes
{'selector': '.border-red', 'props': 'border: 2px dashed red;'},
{'selector': '.border-green', 'props': 'border: 2px dashed green;'},
], overwrite=False)
cell_border = pd.DataFrame([['border-green ', ' ', ' ', 'border-red '],
[' ', ' ', ' ', ' ']],
index=df.index,
columns=df.columns[:4])
s.set_td_classes(cell_color + cell_border)
[34]:
Finer Control with Slicing
The examples we have shown so far for the Styler.apply and Styler.map functions have not demonstrated the use of the subset argument. This is a useful argument which permits a lot of flexibility: it allows you to apply styles to specific rows or columns, without having to code that logic into your style function.
The value passed to subset behaves similarly to slicing a DataFrame:
- A scalar is treated as a column label
- A list (or Series or NumPy array) is treated as multiple column labels
- A tuple is treated as (row_indexer, column_indexer)
Consider using pd.IndexSlice to construct the tuple for the last one. We will create a MultiIndexed DataFrame to demonstrate the functionality.
[36]:
df3 = pd.DataFrame(np.random.randn(4,4),
pd.MultiIndex.from_product([['A', 'B'], ['r1', 'r2']]),
columns=['c1','c2','c3','c4'])
df3
[36]:
We will use subset to highlight the maximum in the third and fourth columns with red text. We will highlight the subset sliced region in yellow.
[37]:
slice_ = ['c3', 'c4']
df3.style.apply(highlight_max, props='color:red;', axis=0, subset=slice_)\
.set_properties(**{'background-color': '#ffffb3'}, subset=slice_)
[37]:
If combined with IndexSlice as suggested, it can index across both dimensions with greater flexibility.
[38]:
idx = pd.IndexSlice
slice_ = idx[idx[:,'r1'], idx['c2':'c4']]
df3.style.apply(highlight_max, props='color:red;', axis=0, subset=slice_)\
.set_properties(**{'background-color': '#ffffb3'}, subset=slice_)
[38]:
This also provides the flexibility to sub-select rows when used with axis=1.
[39]:
slice_ = idx[idx[:,'r2'], :]
df3.style.apply(highlight_max, props='color:red;', axis=1, subset=slice_)\
.set_properties(**{'background-color': '#ffffb3'}, subset=slice_)
[39]:
There is also scope to provide conditional filtering.
Suppose we want to highlight the maximum across columns 2 and 4 only in the case that the sum of columns 1 and 3 is less than -2.0 (essentially excluding rows (:,'r2')).
[40]:
slice_ = idx[idx[(df3['c1'] + df3['c3']) < -2.0], ['c2', 'c4']]
df3.style.apply(highlight_max, props='color:red;', axis=1, subset=slice_)\
.set_properties(**{'background-color': '#ffffb3'}, subset=slice_)
[40]:
Only label-based slicing is supported right now, not positional, and not callables.
If your style function uses a subset or axis keyword argument, consider wrapping your function in a functools.partial, partialing out that keyword.
my_func2 = functools.partial(my_func, subset=42)
Optimization
Generally, for smaller tables and most cases, the rendered HTML does not need to be optimized, and we don't really recommend it. There are two cases where it is worth considering:
- If you are rendering and styling a very large HTML table, certain browsers have performance issues.
- If you are using Styler to dynamically create part of online user interfaces and want to improve network performance.
Here we recommend the following steps to implement:
1. Remove UUID and cell_ids
Ignore the uuid and set cell_ids to False. This will prevent unnecessary HTML.
This is sub-optimal:
[41]:
df4 = pd.DataFrame([[1,2],[3,4]])
s4 = df4.style
This is better:
[42]:
from pandas.io.formats.style import Styler
s4 = Styler(df4, uuid_len=0, cell_ids=False)
2. Use table styles
Use table styles where possible (e.g. for all cells or rows or columns at a time) since the CSS is nearly always more efficient than other formats.
This is sub-optimal:
[43]:
props = 'font-family: "Times New Roman", Times, serif; color: #e83e8c; font-size:1.3em;'
df4.style.map(lambda x: props, subset=[1])
[43]:
This is better:
[44]:
df4.style.set_table_styles([{'selector': 'td.col1', 'props': props}])
[44]:
3. Set classes instead of using Styler functions
For large DataFrames where the same style is applied to many cells, it can be more efficient to declare the styles as classes and then apply those classes to data cells, rather than directly applying styles to cells. It is, however, probably still easier to use the Styler function API when you are not concerned about optimization.
This is sub-optimal:
[45]:
df2.style.apply(highlight_max, props='color:white;background-color:darkblue;', axis=0)\
.apply(highlight_max, props='color:white;background-color:pink;', axis=1)\
.apply(highlight_max, props='color:white;background-color:purple', axis=None)
[45]:
This is better:
[46]:
build = lambda x: pd.DataFrame(x, index=df2.index, columns=df2.columns)
cls1 = build(df2.apply(highlight_max, props='cls-1 ', axis=0))
cls2 = build(df2.apply(highlight_max, props='cls-2 ', axis=1, result_type='expand').values)
cls3 = build(highlight_max(df2, props='cls-3 '))
df2.style.set_table_styles([
{'selector': '.cls-1', 'props': 'color:white;background-color:darkblue;'},
{'selector': '.cls-2', 'props': 'color:white;background-color:pink;'},
{'selector': '.cls-3', 'props': 'color:white;background-color:purple;'}
]).set_td_classes(cls1 + cls2 + cls3)
[46]:
4. Don't use tooltips
Tooltips require cell_ids to work and they generate extra HTML elements for every data cell.
5. If every byte counts use string replacement
You can remove unnecessary HTML, or shorten the default class names by replacing the default css dict. You can read a little more about CSS below.
[47]:
my_css = {
"row_heading": "",
"col_heading": "",
"index_name": "",
"col": "c",
"row": "r",
"col_trim": "",
"row_trim": "",
"level": "l",
"data": "",
"blank": "",
}
html = Styler(df4, uuid_len=0, cell_ids=False)
html.set_table_styles([{'selector': 'td', 'props': props},
{'selector': '.c1', 'props': 'color:green;'},
{'selector': '.l0', 'props': 'color:blue;'}],
css_class_names=my_css)
print(html.to_html())
<style type="text/css">
#T_ td {
font-family: "Times New Roman", Times, serif;
color: #e83e8c;
font-size: 1.3em;
}
#T_ .c1 {
color: green;
}
#T_ .l0 {
color: blue;
}
</style>
<table id="T_">
<thead>
<tr>
<th class=" l0" > </th>
<th class=" l0 c0" >0</th>
<th class=" l0 c1" >1</th>
</tr>
</thead>
<tbody>
<tr>
<th class=" l0 r0" >0</th>
<td class=" r0 c0" >1</td>
<td class=" r0 c1" >2</td>
</tr>
<tr>
<th class=" l0 r1" >1</th>
<td class=" r1 c0" >3</td>
<td class=" r1 c1" >4</td>
</tr>
</tbody>
</table>
[48]:
html
[48]:
Builtin Styles
Some styling functions are common enough that we've "built them in" to the Styler, so you don't have to write them and apply them yourself. The current list of such functions is:
- .highlight_null: for use with identifying missing data.
- .highlight_min and .highlight_max: for use with identifying extremities in data.
- .highlight_between and .highlight_quantile: for use with identifying classes within data.
- .background_gradient: a flexible method for highlighting cells based on their, or other, values on a numeric scale.
- .text_gradient: similar method for highlighting text based on their, or other, values on a numeric scale.
- .bar: to display mini-charts within cell backgrounds.
The individual documentation on each function often gives more examples of their arguments.
Highlight Null
[49]:
df2.iloc[0,2] = np.nan
df2.iloc[4,3] = np.nan
df2.loc[:4].style.highlight_null(color='yellow')
[49]:
Highlight Min or Max
[50]:
df2.loc[:4].style.highlight_max(axis=1, props='color:white; font-weight:bold; background-color:darkblue;')
[50]:
Highlight Between
This method accepts ranges as floats, or as NumPy arrays or Series provided the indexes match.
[51]:
left = pd.Series([1.0, 0.0, 1.0], index=["A", "B", "D"])
df2.loc[:4].style.highlight_between(left=left, right=1.5, axis=1, props='color:white; background-color:purple;')
[51]:
Highlight Quantile
Useful for detecting the highest or lowest percentile values.
[52]:
df2.loc[:4].style.highlight_quantile(q_left=0.85, axis=None, color='yellow')
[52]:
Background Gradient and Text Gradient
You can create "heatmaps" with the background_gradient and text_gradient methods. These require matplotlib, and we'll use Seaborn to get a nice colormap.
[53]:
import seaborn as sns
cm = sns.light_palette("green", as_cmap=True)
df2.style.background_gradient(cmap=cm)
[53]:
[54]:
df2.style.text_gradient(cmap=cm)
[54]:
.background_gradient and .text_gradient have a number of keyword arguments to customise the gradients and colors. See the documentation.
Set properties
Use Styler.set_properties when the style doesn't actually depend on the values. This is just a simple wrapper for .map where the function returns the same properties for all cells.
[55]:
df2.loc[:4].style.set_properties(**{'background-color': 'black',
'color': 'lawngreen',
'border-color': 'white'})
[55]:
Bar charts
You can include "bar charts" in your DataFrame.
[56]:
df2.style.bar(subset=['A', 'B'], color='#d65f5f')
[56]:
其他关键字参数可以更好地控制居中和定位,您可以传递 [color_negative, color_positive] 列表来突出显示较低和较高值或 matplotlib 色标。
Additional keyword arguments give more control on centering and positioning, and you can pass a list of [color_negative, color_positive] to highlight lower and higher values or a matplotlib colormap.
As an example, here’s how you can change the above with the new align option, combined with setting vmin and vmax limits, the width of the figure, and the underlying CSS props of cells, leaving space to display the text and the bars. We also use text_gradient to color the text the same as the bars using a matplotlib colormap (although in this case the visualization is probably better without this additional effect).
[57]:
df2.style.format('{:.3f}', na_rep="")\
.bar(align=0, vmin=-2.5, vmax=2.5, cmap="bwr", height=50,
width=60, props="width: 120px; border-right: 1px solid black;")\
.text_gradient(cmap="bwr", vmin=-2.5, vmax=2.5)
[57]:
The following example aims to highlight the behavior of the new align options:
[59]:
HTML(head)
[59]:
Sharing styles
Say you have a lovely style built up for a DataFrame, and now you want to apply the same style to a second DataFrame. Export the style with df1.style.export, and import it on the second DataFrame with df2.style.use.
[60]:
style1 = df2.style\
.map(style_negative, props='color:red;')\
.map(lambda v: 'opacity: 20%;' if (v < 0.3) and (v > -0.3) else None)\
.set_table_styles([{"selector": "th", "props": "color: blue;"}])\
.hide(axis="index")
style1
[60]:
[61]:
style2 = df3.style
style2.use(style1.export())
style2
[61]:
Notice that you’re able to share the styles even though they’re data aware. The styles are re-evaluated on the new DataFrame they’ve been applied to with use.
Limitations
-
DataFrame only (use Series.to_frame().style)
-
The index and columns do not need to be unique, but certain styling functions can only work with unique indexes.
-
No large repr, and construction performance isn’t great; although we have some HTML optimizations
-
You can only apply styles, you can’t insert new HTML entities, except via subclassing.
Other Fun and Useful Stuff
Here are a few interesting examples.
Widgets
Styler interacts pretty well with widgets. If you’re viewing this online instead of running the notebook yourself, you’re missing out on interactively adjusting the color palette.
[62]:
from ipywidgets import widgets
@widgets.interact
def f(h_neg=(0, 359, 1), h_pos=(0, 359), s=(0., 99.9), l=(0., 99.9)):
    return df2.style.background_gradient(
        cmap=sns.palettes.diverging_palette(h_neg=h_neg, h_pos=h_pos, s=s, l=l,
                                            as_cmap=True)
    )
Magnify
[63]:
def magnify():
    return [dict(selector="th",
                 props=[("font-size", "4pt")]),
            dict(selector="td",
                 props=[('padding', "0em 0em")]),
            dict(selector="th:hover",
                 props=[("font-size", "12pt")]),
            dict(selector="tr:hover td:hover",
                 props=[('max-width', '200px'),
                        ('font-size', '12pt')])
            ]
[64]:
np.random.seed(25)
cmap = sns.diverging_palette(5, 250, as_cmap=True)
bigdf = pd.DataFrame(np.random.randn(20, 25)).cumsum()
bigdf.style.background_gradient(cmap, axis=1)\
.set_properties(**{'max-width': '80px', 'font-size': '1pt'})\
.set_caption("Hover to magnify")\
.format(precision=2)\
.set_table_styles(magnify())
[64]:
Sticky Headers
If you display a large matrix or DataFrame in a notebook but always want to see the column and row headers, you can use the .set_sticky method, which manipulates the table styles CSS.
[65]:
bigdf = pd.DataFrame(np.random.randn(16, 100))
bigdf.style.set_sticky(axis="index")
[65]:
It is also possible to stick MultiIndexes, and even only specific levels.
[66]:
bigdf.index = pd.MultiIndex.from_product([["A","B"],[0,1],[0,1,2,3]])
bigdf.style.set_sticky(axis="index", pixel_size=18, levels=[1,2])
[66]:
HTML Escaping
Suppose you have to display HTML within HTML; that can be a bit of a pain when the renderer can’t distinguish between them. You can use the escape formatting option to handle this, and even use it within a formatter that contains HTML itself.
[67]:
df4 = pd.DataFrame([['<div></div>', '"&other"', '<span></span>']])
df4.style
[67]:
[68]:
df4.style.format(escape="html")
[68]:
[69]:
df4.style.format('<a href="https://pandas.pydata.org" target="_blank">{}</a>', escape="html")
[69]:
Export to Excel
Some support (since version 0.20.0) is available for exporting styled DataFrames to Excel worksheets using the OpenPyXL or XlsxWriter engines. CSS2.2 properties handled include:
-
background-color
-
border-style properties
-
border-width properties
-
border-color properties
-
color
-
font-family
-
font-style
-
font-weight
-
text-align
-
text-decoration
-
vertical-align
-
white-space: nowrap
-
Shorthand and side-specific border properties are supported (e.g. border-style and border-left-style) as well as the border shorthands for all sides (border: 1px solid green) or specified sides (border-left: 1px solid green). Using a border shorthand will override any border properties set before it (See CSS Working Group for more details)
-
Only CSS2 named colors and hex colors of the form #rgb or #rrggbb are currently supported.
-
The following pseudo CSS properties are also available to set Excel specific style properties:
-
number-format
-
border-style (for Excel-specific styles: “hair”, “mediumDashDot”, “dashDotDot”, “mediumDashDotDot”, “dashDot”, “slantDashDot”, or “mediumDashed”)
Table level styles and data cell CSS classes are not included in the export to Excel: individual cells must have their properties mapped by the Styler.apply and/or Styler.map methods.
[70]:
df2.style.\
map(style_negative, props='color:red;').\
highlight_max(axis=0).\
to_excel('styled.xlsx', engine='openpyxl')
A screenshot of the output:
Export to LaTeX
There is support (since version 1.3.0) to export Styler to LaTeX. The documentation for the .to_latex method gives further detail and numerous examples.
More About CSS and HTML
The Cascading Style Sheets (CSS) language, which is designed to influence how a browser renders HTML elements, has its own peculiarities. It never reports errors: it silently ignores them and doesn’t render your objects as you intend, which can be frustrating. Here is a very brief primer on how Styler creates HTML and interacts with CSS, with advice on common pitfalls to avoid.
CSS Classes and Ids
The precise structure of the CSS class attached to each cell is as follows.
-
Cells with Index and Column names include index_name and level<k> where k is its level in a MultiIndex
-
Index label cells include
-
row_heading
-
level<k> where k is the level in a MultiIndex
-
row<m> where m is the numeric position of the row
-
Column label cells include
-
col_heading
-
level<k> where k is the level in a MultiIndex
-
col<n> where n is the numeric position of the column
-
Data cells include
-
data
-
row<m>, where m is the numeric position of the cell.
-
col<n>, where n is the numeric position of the cell.
-
Blank cells include blank
-
Trimmed cells include col_trim or row_trim
The structure of the id is T_uuid_level<k>_row<m>_col<n>, where level<k> is used only on headings, and headings will only have either row<m> or col<n>, whichever is needed. By default we’ve also prepended each row/column identifier with a UUID unique to each DataFrame so that the style from one doesn’t collide with the styling from another within the same notebook or page. You can read more about the use of UUIDs in Optimization.
We can see an example of the HTML by calling the .to_html() method.
[71]:
print(pd.DataFrame([[1,2],[3,4]], index=['i1', 'i2'], columns=['c1', 'c2']).style.to_html())
<style type="text/css">
</style>
<table id="T_a1de3">
<thead>
<tr>
<th class="blank level0" > </th>
<th id="T_a1de3_level0_col0" class="col_heading level0 col0" >c1</th>
<th id="T_a1de3_level0_col1" class="col_heading level0 col1" >c2</th>
</tr>
</thead>
<tbody>
<tr>
<th id="T_a1de3_level0_row0" class="row_heading level0 row0" >i1</th>
<td id="T_a1de3_row0_col0" class="data row0 col0" >1</td>
<td id="T_a1de3_row0_col1" class="data row0 col1" >2</td>
</tr>
<tr>
<th id="T_a1de3_level0_row1" class="row_heading level0 row1" >i2</th>
<td id="T_a1de3_row1_col0" class="data row1 col0" >3</td>
<td id="T_a1de3_row1_col1" class="data row1 col1" >4</td>
</tr>
</tbody>
</table>
CSS Hierarchies
The examples have shown that when CSS styles overlap, the one that comes last in the HTML render takes precedence. So the following yield different results:
[72]:
df4 = pd.DataFrame([['text']])
df4.style.map(lambda x: 'color:green;')\
.map(lambda x: 'color:red;')
[72]:
[73]:
df4.style.map(lambda x: 'color:red;')\
.map(lambda x: 'color:green;')
[73]:
This is only true for CSS rules that are equivalent in hierarchy, or importance. You can read more about CSS specificity here, but for our purposes it suffices to summarize the key points:
A CSS importance score for each HTML element is derived by starting at zero and adding:
-
1000 for an inline style attribute
-
100 for each ID
-
10 for each attribute, class or pseudo-class
-
1 for each element name or pseudo-element
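The scoring rule above can be sketched as a small function. This is a rough illustration using regular expressions, not a full CSS selector parser, and the name `specificity_score` is hypothetical. It is applied to the selectors discussed in this section.

```python
import re


def specificity_score(selector: str, inline: bool = False) -> int:
    """Rough score per the simplified weights above (not a full CSS parser)."""
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    attributes = len(re.findall(r"\[[^\]]*\]", selector))
    pseudo_elements = len(re.findall(r"::[\w-]+", selector))
    pseudo_classes = len(re.findall(r"(?<!:):[\w-]+", selector))
    # Element names appear at the start or right after a combinator or space.
    elements = len(re.findall(r"(?:^|[\s>+~])[a-zA-Z][\w-]*", selector))
    return (
        (1000 if inline else 0)
        + 100 * ids
        + 10 * (classes + attributes + pseudo_classes)
        + (elements + pseudo_elements)
    )


for sel in ["#T_a_ td", "#T_a_row0_col0", "#T_b_ .cls-1", "#T_c_ td.data"]:
    print(sel, specificity_score(sel))
```

A higher score wins regardless of where the rule appears in the HTML; source order only breaks ties between equal scores.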
Let’s use this to describe the action of the following configurations:
[74]:
df4.style.set_uuid('a_')\
.set_table_styles([{'selector': 'td', 'props': 'color:red;'}])\
.map(lambda x: 'color:green;')
[74]:
This text is red because the generated selector #T_a_ td is worth 101 (ID plus element), whereas #T_a_row0_col0 is only worth 100 (ID), so it is considered inferior even though it comes later in the HTML.
[75]:
df4.style.set_uuid('b_')\
.set_table_styles([{'selector': 'td', 'props': 'color:red;'},
{'selector': '.cls-1', 'props': 'color:blue;'}])\
.map(lambda x: 'color:green;')\
.set_td_classes(pd.DataFrame([['cls-1']]))
[75]:
In the above case the text is blue because the selector #T_b_ .cls-1 is worth 110 (ID plus class), which takes precedence.
[76]:
df4.style.set_uuid('c_')\
.set_table_styles([{'selector': 'td', 'props': 'color:red;'},
{'selector': '.cls-1', 'props': 'color:blue;'},
{'selector': 'td.data', 'props': 'color:yellow;'}])\
.map(lambda x: 'color:green;')\
.set_td_classes(pd.DataFrame([['cls-1']]))
[76]:
Now we have created another table style. This time the selector #T_c_ td.data (ID plus element plus class) gets bumped up to 111, so the text is yellow.
If your style fails to be applied, and it’s really frustrating, try the !important trump card.
[77]:
df4.style.set_uuid('d_')\
.set_table_styles([{'selector': 'td', 'props': 'color:red;'},
{'selector': '.cls-1', 'props': 'color:blue;'},
{'selector': 'td.data', 'props': 'color:yellow;'}])\
.map(lambda x: 'color:green !important;')\
.set_td_classes(pd.DataFrame([['cls-1']]))
[77]:
Finally got that green text after all!
Extensibility
The core of pandas is, and will remain, its “high-performance, easy-to-use data structures”. With that in mind, we hope that DataFrame.style accomplishes two goals:
-
Provide an API that is pleasing to use interactively and is “good enough” for many tasks
-
Provide the foundations for dedicated libraries to build on
If you build a great library on top of this, let us know and we’ll link to it.
Subclassing
If the default template doesn’t quite suit your needs, you can subclass Styler and extend or override the template. We’ll show an example of extending the default template to insert a custom header before each table.
[78]:
from jinja2 import Environment, ChoiceLoader, FileSystemLoader
from IPython.display import HTML
from pandas.io.formats.style import Styler
We’ll use the following template:
[79]:
with open("templates/myhtml.tpl") as f:
    print(f.read())
{% extends "html_table.tpl" %}
{% block table %}
<h1>{{ table_title|default("My Table") }}</h1>
{{ super() }}
{% endblock table %}
Now that we’ve created a template, we need to set up a subclass of Styler that knows about it.
[80]:
class MyStyler(Styler):
    env = Environment(
        loader=ChoiceLoader([
            FileSystemLoader("templates"),  # contains ours
            Styler.loader,  # the default
        ])
    )
    template_html_table = env.get_template("myhtml.tpl")
Notice that we include the original loader in our environment’s loader. That’s because we extend the original template, so the Jinja environment needs to be able to find it.
Now we can use that custom styler. Its __init__ takes a DataFrame.
[81]:
MyStyler(df3)
[81]:
My Table
Our custom template accepts a table_title keyword. We can provide the value in the .to_html method.
[82]:
HTML(MyStyler(df3).to_html(table_title="Extending Example"))
[82]:
Extending Example
For convenience, we provide the Styler.from_custom_template method that does the same as the custom subclass.
[83]:
EasyStyler = Styler.from_custom_template("templates", "myhtml.tpl")
HTML(EasyStyler(df3).to_html(table_title="Another Title"))
[83]:
Another Title
Here’s the template structure for both the style generation template and the table generation template:
Style template:
[85]:
HTML(style_structure)
[85]:
<style type="text/css">
</style>
Table template:
[87]:
HTML(table_structure)
[87]:
<table ...>
</table>
See the template in the GitHub repo for more details.