PostgreSQL Chinese Operations Guide

9.22. Window Functions

Window functions provide the ability to perform calculations across sets of rows that are related to the current query row. See Section 3.5 for an introduction to this feature, and Section 4.2.8 for syntax details.

The built-in window functions are listed in Table 9.64. Note that these functions must be invoked using window function syntax, i.e., an OVER clause is required.

In addition to these functions, any built-in or user-defined ordinary aggregate (i.e., not ordered-set or hypothetical-set aggregates) can be used as a window function; see Section 9.21 for a list of the built-in aggregates. Aggregate functions act as window functions only when an OVER clause follows the call; otherwise they act as plain aggregates and return a single row for the entire set.
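
As a minimal sketch of the difference, the queries below call the same aggregate with and without OVER. The sales data and its column names are hypothetical and supplied inline through VALUES.

```sql
-- Hypothetical sample data; sum() is used as a window function here
-- because the call is followed by OVER, so every input row is kept.
WITH sales(region, amount) AS (
    VALUES ('east', 10), ('east', 20), ('west', 5)
)
SELECT region,
       amount,
       sum(amount) OVER (PARTITION BY region) AS region_total
FROM sales
ORDER BY region, amount;
-- region | amount | region_total
-- east   |     10 |           30
-- east   |     20 |           30
-- west   |      5 |            5

-- Without OVER, the same aggregate acts as a plain aggregate
-- and returns a single row for the entire set.
WITH sales(region, amount) AS (
    VALUES ('east', 10), ('east', 20), ('west', 5)
)
SELECT sum(amount) AS grand_total
FROM sales;
-- grand_total
--          35
```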

Table 9.64. General-Purpose Window Functions

Function

Description

row_number () → bigint

Returns the number of the current row within its partition, counting from 1.

rank () → bigint

Returns the rank of the current row, with gaps; that is, the row_number of the first row in its peer group.

dense_rank () → bigint

Returns the rank of the current row, without gaps; this function effectively counts peer groups.

percent_rank () → double precision

Returns the relative rank of the current row, that is (rank - 1) / (total partition rows - 1). The value thus ranges from 0 to 1 inclusive.

cume_dist () → double precision

Returns the cumulative distribution, that is (number of partition rows preceding or peers with current row) / (total partition rows). The value thus ranges from 1/N to 1.

ntile ( num_buckets integer ) → integer

Returns an integer ranging from 1 to the argument value, dividing the partition as equally as possible.

lag ( value anycompatible [, offset integer [, default anycompatible ]] ) → anycompatible

Returns value evaluated at the row that is offset rows before the current row within the partition; if there is no such row, instead returns default (which must be of a type compatible with value). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to NULL.

lead ( value anycompatible [, offset integer [, default anycompatible ]] ) → anycompatible

Returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead returns default (which must be of a type compatible with value). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to NULL.

first_value ( value anyelement ) → anyelement

Returns value evaluated at the row that is the first row of the window frame.

last_value ( value anyelement ) → anyelement

Returns value evaluated at the row that is the last row of the window frame.

nth_value ( value anyelement, n integer ) → anyelement

Returns value evaluated at the row that is the n'th row of the window frame (counting from 1); returns NULL if there is no such row.
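
To illustrate a few of the functions in the table, the following sketch applies lag, lead and ntile to a small series. The readings data and the window name w are assumptions made for the example.

```sql
-- Hypothetical sample data: one temperature reading per day.
WITH readings(day, temp) AS (
    VALUES (1, 10), (2, 12), (3, 9), (4, 15)
)
SELECT day,
       temp,
       lag(temp)        OVER w AS prev_temp,  -- default is NULL when no previous row exists
       lead(temp, 1, 0) OVER w AS next_temp,  -- explicit default 0 when no following row exists
       ntile(2)         OVER w AS half        -- splits the partition into 2 buckets
FROM readings
WINDOW w AS (ORDER BY day)
ORDER BY day;
-- day | temp | prev_temp | next_temp | half
--   1 |   10 |    (null) |        12 |    1
--   2 |   12 |        10 |         9 |    1
--   3 |    9 |        12 |        15 |    2
--   4 |   15 |         9 |         0 |    2
```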

All of the functions listed in Table 9.64 depend on the sort ordering specified by the ORDER BY clause of the associated window definition. Rows that are not distinct when considering only the ORDER BY columns are said to be peers. The four ranking functions (including cume_dist) are defined so that they give the same answer for all rows of a peer group.
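
The effect of peers on the ranking functions can be sketched as follows. The scores data is hypothetical; players a and b tie on score and therefore form one peer group.

```sql
-- Hypothetical sample data: a and b are peers under ORDER BY score DESC.
WITH scores(player, score) AS (
    VALUES ('a', 90), ('b', 90), ('c', 80), ('d', 70)
)
SELECT player,
       score,
       rank()         OVER w AS rnk,        -- with gaps: 1, 1, 3, 4
       dense_rank()   OVER w AS dense_rnk,  -- without gaps: 1, 1, 2, 3
       percent_rank() OVER w AS pct_rnk,    -- (rank - 1) / (4 - 1)
       cume_dist()    OVER w AS cume        -- rows preceding or peers / 4
FROM scores
WINDOW w AS (ORDER BY score DESC)
ORDER BY score DESC, player;
-- player | score | rnk | dense_rnk | pct_rnk       | cume
-- a      |    90 |   1 |         1 | 0             | 0.5
-- b      |    90 |   1 |         1 | 0             | 0.5
-- c      |    80 |   3 |         2 | 0.666... (2/3) | 0.75
-- d      |    70 |   4 |         3 | 1             | 1
```

row_number is left out of this sketch on purpose: it numbers peers individually, so which of a and b receives 1 is not determined by the window's ORDER BY alone.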

Note that first_value, last_value, and nth_value consider only the rows within the “window frame”, which by default contains the rows from the start of the partition through the last peer of the current row. This is likely to give unhelpful results for last_value and sometimes also nth_value. You can redefine the frame by adding a suitable frame specification (RANGE, ROWS or GROUPS) to the OVER clause. See Section 4.2.8 for more information about frame specifications.
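
A small sketch of the frame issue, using hypothetical inline data: with the default frame, last_value simply returns the current row's value (or that of its last peer), whereas an explicit frame covering the whole partition gives the partition's last row.

```sql
-- Hypothetical sample data with no ties, so each row is its own peer group.
WITH t(x) AS (
    VALUES (1), (2), (3)
)
SELECT x,
       last_value(x)   OVER (ORDER BY x) AS last_default,
       last_value(x)   OVER (ORDER BY x
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                      AND UNBOUNDED FOLLOWING) AS last_full,
       nth_value(x, 2) OVER (ORDER BY x) AS second_default
FROM t
ORDER BY x;
-- x | last_default | last_full | second_default
-- 1 |            1 |         3 |         (null)   -- frame holds only one row so far
-- 2 |            2 |         3 |              2
-- 3 |            3 |         3 |              2
```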

When an aggregate function is used as a window function, it aggregates over the rows within the current row’s window frame. An aggregate used with ORDER BY and the default window frame definition produces a “running sum” type of behavior, which may or may not be what’s wanted. To obtain aggregation over the whole partition, omit ORDER BY or use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Other frame specifications can be used to obtain other effects.
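
The following sketch contrasts these cases on hypothetical inline data that contains a tie, which also shows that the default frame includes the current row's peers while a ROWS frame does not.

```sql
-- Hypothetical sample data; the two rows with x = 2 are peers.
WITH t(x) AS (
    VALUES (1), (2), (2), (4)
)
SELECT x,
       sum(x) OVER (ORDER BY x) AS running_sum,      -- default frame, peers included
       sum(x) OVER (ORDER BY x
                    ROWS BETWEEN UNBOUNDED PRECEDING
                             AND CURRENT ROW) AS rows_sum,  -- strictly row by row
       sum(x) OVER () AS total_no_order,             -- ORDER BY omitted: whole partition
       sum(x) OVER (ORDER BY x
                    ROWS BETWEEN UNBOUNDED PRECEDING
                             AND UNBOUNDED FOLLOWING) AS total_full_frame
FROM t
ORDER BY x;
-- x | running_sum | rows_sum | total_no_order | total_full_frame
-- 1 |           1 |        1 |              9 |                9
-- 2 |           5 |        3 |              9 |                9
-- 2 |           5 |        5 |              9 |                9
-- 4 |           9 |        9 |              9 |                9
```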

Note

The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, lag, first_value, last_value, and nth_value. This is not implemented in PostgreSQL: the behavior is always the same as the standard’s default, namely RESPECT NULLS. Likewise, the standard’s FROM FIRST or FROM LAST option for nth_value is not implemented: only the default FROM FIRST behavior is supported. (You can achieve the result of FROM LAST by reversing the ORDER BY ordering.)
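
As a hedged sketch of the workaround for FROM LAST, the query below reverses the ORDER BY inside the window and widens the frame to the whole partition, so that nth_value counts from what would be the end in ascending order. The data is hypothetical.

```sql
-- Hypothetical sample data. In ascending order the second value
-- from the last is 30; ordering the window by x DESC makes it the 2nd row.
WITH t(x) AS (
    VALUES (10), (20), (30), (40)
)
SELECT x,
       nth_value(x, 2) OVER (ORDER BY x DESC
                             ROWS BETWEEN UNBOUNDED PRECEDING
                                      AND UNBOUNDED FOLLOWING) AS second_from_last
FROM t
ORDER BY x;
-- x  | second_from_last
-- 10 |               30
-- 20 |               30
-- 30 |               30
-- 40 |               30
```

The explicit frame matters here: with the default frame, rows early in the reversed ordering would not yet see a second row and would return NULL.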