Postgresql 中文操作指南

SELECT

SELECT、TABLE、WITH — 从表或视图中检索行

SELECT, TABLE, WITH — retrieve rows from a table or view

Synopsis

[ WITH [ RECURSIVE ] with_query [, ...] ]
SELECT [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
    [ * | expression [ [ AS ] output_name ] [, ...] ]
    [ FROM from_item [, ...] ]
    [ WHERE condition ]
    [ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
    [ HAVING condition ]
    [ WINDOW window_name AS ( window_definition ) [, ...] ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
    [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
    [ LIMIT { count | ALL } ]
    [ OFFSET start [ ROW | ROWS ] ]
    [ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES } ]
    [ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ]

where from_item can be one of:

    [ ONLY ] table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
                [ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ]
    [ LATERAL ] ( select ) [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    [ LATERAL ] function_name ( [ argument [, ...] ] )
                [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    [ LATERAL ] function_name ( [ argument [, ...] ] ) [ AS ] alias ( column_definition [, ...] )
    [ LATERAL ] function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )
    [ LATERAL ] ROWS FROM( function_name ( [ argument [, ...] ] ) [ AS ( column_definition [, ...] ) ] [, ...] )
                [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    from_item join_type from_item { ON join_condition | USING ( join_column [, ...] ) [ AS join_using_alias ] }
    from_item NATURAL join_type from_item
    from_item CROSS JOIN from_item

and grouping_element can be one of:

    ( )
    expression
    ( expression [, ...] )
    ROLLUP ( { expression | ( expression [, ...] ) } [, ...] )
    CUBE ( { expression | ( expression [, ...] ) } [, ...] )
    GROUPING SETS ( grouping_element [, ...] )

and with_query is:

    with_query_name [ ( column_name [, ...] ) ] AS [ [ NOT ] MATERIALIZED ] ( select | values | insert | update | delete )
        [ SEARCH { BREADTH | DEPTH } FIRST BY column_name [, ...] SET search_seq_col_name ]
        [ CYCLE column_name [, ...] SET cycle_mark_col_name [ TO cycle_mark_value DEFAULT cycle_mark_default ] USING cycle_path_col_name ]

TABLE [ ONLY ] table_name [ * ]

Description

SELECT 检索来自零个或多个表的行。 SELECT 的常规处理如下:

SELECT retrieves rows from zero or more tables. The general processing of SELECT is as follows:

SELECT 命令里使用过的每列都必须有 SELECT 权限。若要使用 FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE ,还需要 UPDATE 权限(至少需要所选的每个表的某个列)。

You must have SELECT privilege on each column used in a SELECT command. The use of FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE or FOR KEY SHARE requires UPDATE privilege as well (for at least one column of each table so selected).

Parameters

WITH Clause

WITH 子句可用于指定一个或多个子查询,这些子查询可以由主查询中的名称引用。这些子查询实际上在主查询执行期间作为临时表或视图。每个子查询可以是 SELECTTABLEVALUESINSERTUPDATEDELETE 语句。在 WITH 中编写数据修改语句( INSERTUPDATEDELETE )时,通常会包含 RETURNING 子句。此语句的输出是 RETURNINGnot ,该语句修改的基础表,形成主查询读取的临时表。如果 RETURNING 被省略,则语句仍然执行,但它不会产生输出,因此它不能被主查询引用为表。

The WITH clause allows you to specify one or more subqueries that can be referenced by name in the primary query. The subqueries effectively act as temporary tables or views for the duration of the primary query. Each subquery can be a SELECT, TABLE, VALUES, INSERT, UPDATE or DELETE statement. When writing a data-modifying statement (INSERT, UPDATE or DELETE) in WITH, it is usual to include a RETURNING clause. It is the output of RETURNING, not the underlying table that the statement modifies, that forms the temporary table that is read by the primary query. If RETURNING is omitted, the statement is still executed, but it produces no output so it cannot be referenced as a table by the primary query.

必须为每个 WITH 查询指定一个名称(无模式限定)。可以选择地,可以指定一个列名列表;如果省略,则列名从子查询中进行推断。

A name (without schema qualification) must be specified for each WITH query. Optionally, a list of column names can be specified; if this is omitted, the column names are inferred from the subquery.

如果 RECURSIVE 被指定,则允许 SELECT 子查询以名称引用自身。此类子查询必须具有以下格式

If RECURSIVE is specified, it allows a SELECT subquery to reference itself by name. Such a subquery must have the form

non_recursive_term UNION [ ALL | DISTINCT ] recursive_term

递归自引用必须出现在 UNION 的右侧。每个查询仅允许一个递归自引用。递归数据修改语句不受支持,但可以在数据修改语句中使用递归 SELECT 查询的结果。有关示例,请参阅 Section 7.8

where the recursive self-reference must appear on the right-hand side of the UNION. Only one recursive self-reference is permitted per query. Recursive data-modifying statements are not supported, but you can use the results of a recursive SELECT query in a data-modifying statement. See Section 7.8 for an example.

RECURSIVE 的另一个效果是 WITH 查询不必有序:查询可以引用列表中后面的另一个查询。(但是,没有实现循环引用或互递归。)没有 RECURSIVEWITH 查询只能引用 WITH 列表前面的是兄弟 WITH 查询。

Another effect of RECURSIVE is that WITH queries need not be ordered: a query can reference another one that is later in the list. (However, circular references, or mutual recursion, are not implemented.) Without RECURSIVE, WITH queries can only reference sibling WITH queries that are earlier in the WITH list.

WITH 子句中有多个查询时, RECURSIVE 应该只在一处编写,紧跟在 WITH 之后。尽管它对不使用递归或前向引用查询没有影响,但它适用于 WITH 子句中的所有查询。

When there are multiple queries in the WITH clause, RECURSIVE should be written only once, immediately after WITH. It applies to all queries in the WITH clause, though it has no effect on queries that do not use recursion or forward references.

可选 SEARCH 子句计算 search sequence column ,可用于按广度优先顺序或深度优先顺序对递归查询结果进行排序。所提供的列名列表指定用于跟踪已访问行的行键。名为 search_seq_col_name 的列将被添加到 WITH 查询的结果列列表中。此列可以在外部查询中按顺序排列以实现各个顺序。有关示例,请参阅 Section 7.8.2.1

The optional SEARCH clause computes a search sequence column that can be used for ordering the results of a recursive query in either breadth-first or depth-first order. The supplied column name list specifies the row key that is to be used for keeping track of visited rows. A column named search_seq_col_name will be added to the result column list of the WITH query. This column can be ordered by in the outer query to achieve the respective ordering. See Section 7.8.2.1 for examples.

可选 CYCLE 子句用于检测递归查询中的循环。所提供的列名列表指定用于跟踪已访问行的行键。名为 cycle_mark_col_name 的列将被添加到 WITH 查询的结果列列表中。当检测到循环时此列将设为 cycle_mark_value ,否则设为 cycle_mark_default 。此外,当检测到循环时递归并集的处理将停止。 cycle_mark_valuecycle_mark_default 必须是常量,必须可以强制转换为公共数据类型,并且数据类型必须具有不等式运算符。(SQL 标准要求它们是布尔常量或字符字符串,但 PostgreSQL 不这么要求。)默认情况下,使用 TRUEFALSE (类型 boolean )。此外,名为 cycle_path_col_name 的列将被添加到 WITH 查询的结果列列表中。此内部用于跟踪已访问的行。有关示例,请参阅 Section 7.8.2.2

The optional CYCLE clause is used to detect cycles in recursive queries. The supplied column name list specifies the row key that is to be used for keeping track of visited rows. A column named cycle_mark_col_name will be added to the result column list of the WITH query. This column will be set to cycle_mark_value when a cycle has been detected, else to cycle_mark_default. Furthermore, processing of the recursive union will stop when a cycle has been detected. cycle_mark_value and cycle_mark_default must be constants and they must be coercible to a common data type, and the data type must have an inequality operator. (The SQL standard requires that they be Boolean constants or character strings, but PostgreSQL does not require that.) By default, TRUE and FALSE (of type boolean) are used. Furthermore, a column named cycle_path_col_name will be added to the result column list of the WITH query. This column is used internally for tracking visited rows. See Section 7.8.2.2 for examples.

SEARCHCYCLE 子句都仅对递归 WITH 查询有效。 with_query 必须是两个 SELECT (或同等)命令的 UNION (或 UNION ALL )(在 CYCLE 子句添加的列之前,没有出现嵌套 UNION_s). If both clauses are used, the column added by the _SEARCH 子句)。

Both the SEARCH and the CYCLE clause are only valid for recursive WITH queries. The with_query must be a UNION (or UNION ALL) of two SELECT (or equivalent) commands (no nested UNION_s). If both clauses are used, the column added by the _SEARCH clause appears before the columns added by the CYCLE clause.

主查询和 WITH 查询都是在(概念上)在主查询执行时同时执行。这意味着 WITH 中的数据修改语句的影响无法从查询的其他部分看到,除了读取其 RETURNING 输出。如果两个此类数据修改语句尝试修改同一行,则结果不确定。

The primary query and the WITH queries are all (notionally) executed at the same time. This implies that the effects of a data-modifying statement in WITH cannot be seen from other parts of the query, other than by reading its RETURNING output. If two such data-modifying statements attempt to modify the same row, the results are unspecified.

WITH 查询的一个关键属性是,它们通常每个主查询执行只评估一次,即使主查询引用它们不止一次。特别是,数据修改语句保证只执行一次,无论主查询是否读取其全部或任何输出。

A key property of WITH queries is that they are normally evaluated only once per execution of the primary query, even if the primary query refers to them more than once. In particular, data-modifying statements are guaranteed to be executed once and only once, regardless of whether the primary query reads all or any of their output.

但是,可以将 WITH 查询标记为 NOT MATERIALIZED 以删除此保证。在这种情况下, WITH 查询可以折叠到主查询中,就像它是主查询的 FROM 子句中的简单子 SELECT 一样。如果主查询引用 WITH 查询不止一次,则会导致重复计算;但是,如果每个此类用法只需要 WITH 查询的总输出的几行, NOT MATERIALIZED 可以通过允许对查询进行联合优化来提供净收益。如果 NOT MATERIALIZED 附加到递归 WITH 查询或有副作用(即不是不包含任何不稳定函数的纯 SELECT ),则会忽略它。

However, a WITH query can be marked NOT MATERIALIZED to remove this guarantee. In that case, the WITH query can be folded into the primary query much as though it were a simple sub-SELECT in the primary query’s FROM clause. This results in duplicate computations if the primary query refers to that WITH query more than once; but if each such use requires only a few rows of the WITH query’s total output, NOT MATERIALIZED can provide a net savings by allowing the queries to be optimized jointly. NOT MATERIALIZED is ignored if it is attached to a WITH query that is recursive or is not side-effect-free (i.e., is not a plain SELECT containing no volatile functions).

默认情况下,如果在主查询的 FROM 子句中只使用一次无副作用的 WITH 查询,则将其折叠到主查询中。这允许在语义上应该不可见的情况下对两个查询级别进行联合优化。但是,可以通过将 WITH 查询标记为 MATERIALIZED 来防止此类折叠。例如,如果 WITH 查询用作优化限定符来防止计划程序选择错误的计划,则这可能很有用。v12 之前的 PostgreSQL 版本从未进行过此类折叠,因此为旧版本编写的查询可能依靠 WITH 作为优化限定符。

By default, a side-effect-free WITH query is folded into the primary query if it is used exactly once in the primary query’s FROM clause. This allows joint optimization of the two query levels in situations where that should be semantically invisible. However, such folding can be prevented by marking the WITH query as MATERIALIZED. That might be useful, for example, if the WITH query is being used as an optimization fence to prevent the planner from choosing a bad plan. PostgreSQL versions before v12 never did such folding, so queries written for older versions might rely on WITH to act as an optimization fence.

请参阅 Section 7.8 以获取更多信息。

See Section 7.8 for additional information.

FROM Clause

FROM 子句为 SELECT 指定一个或多个源表。如果指定了多个源,则结果是所有源的笛卡尔积(交叉连接)。但是,通常会添加限定条件(通过 WHERE )将返回行限制为笛卡尔积的一个小子集。

The FROM clause specifies one or more source tables for the SELECT. If multiple sources are specified, the result is the Cartesian product (cross join) of all the sources. But usually qualification conditions are added (via WHERE) to restrict the returned rows to a small subset of the Cartesian product.

FROM 子句可以包含以下元素:

The FROM clause can contain the following elements:

  • table_name

    • The name (optionally schema-qualified) of an existing table or view. If ONLY is specified before the table name, only that table is scanned. If ONLY is not specified, the table and all its descendant tables (if any) are scanned. Optionally, * can be specified after the table name to explicitly indicate that descendant tables are included.

  • alias

    • A substitute name for the FROM item containing the alias. An alias is used for brevity or to eliminate ambiguity for self-joins (where the same table is scanned multiple times). When an alias is provided, it completely hides the actual name of the table or function; for example given FROM foo AS f, the remainder of the SELECT must refer to this FROM item as f not foo. If an alias is written, a column alias list can also be written to provide substitute names for one or more columns of the table.

  • TABLESAMPLE _sampling_method ( argument [, …​] ) [ REPEATABLE ( seed ) ]_

    • A TABLESAMPLE clause after a table_name indicates that the specified sampling_method should be used to retrieve a subset of the rows in that table. This sampling precedes the application of any other filters such as WHERE clauses. The standard PostgreSQL distribution includes two sampling methods, BERNOULLI and SYSTEM, and other sampling methods can be installed in the database via extensions.

    • The BERNOULLI and SYSTEM sampling methods each accept a single argument which is the fraction of the table to sample, expressed as a percentage between 0 and 100. This argument can be any real-valued expression. (Other sampling methods might accept more or different arguments.) These two methods each return a randomly-chosen sample of the table that will contain approximately the specified percentage of the table’s rows. The BERNOULLI method scans the whole table and selects or ignores individual rows independently with the specified probability. The SYSTEM method does block-level sampling with each block having the specified chance of being selected; all rows in each selected block are returned. The SYSTEM method is significantly faster than the BERNOULLI method when small sampling percentages are specified, but it may return a less-random sample of the table as a result of clustering effects.

    • The optional REPEATABLE clause specifies a seed number or expression to use for generating random numbers within the sampling method. The seed value can be any non-null floating-point value. Two queries that specify the same seed and argument values will select the same sample of the table, if the table has not been changed meanwhile. But different seed values will usually produce different samples. If REPEATABLE is not given then a new random sample is selected for each query, based upon a system-generated seed. Note that some add-on sampling methods do not accept REPEATABLE, and will always produce new samples on each use.

  • select

    • A sub-SELECT can appear in the FROM clause. This acts as though its output were created as a temporary table for the duration of this single SELECT command. Note that the sub-SELECT must be surrounded by parentheses, and an alias can be provided in the same way as for a table. A VALUES command can also be used here.

  • with_query_name

    • A WITH query is referenced by writing its name, just as though the query’s name were a table name. (In fact, the WITH query hides any real table of the same name for the purposes of the primary query. If necessary, you can refer to a real table of the same name by schema-qualifying the table’s name.) An alias can be provided in the same way as for a table.

  • function_name

    • Function calls can appear in the FROM clause. (This is especially useful for functions that return result sets, but any function can be used.) This acts as though the function’s output were created as a temporary table for the duration of this single SELECT command. If the function’s result type is composite (including the case of a function with multiple OUT parameters), each attribute becomes a separate column in the implicit table.

    • When the optional WITH ORDINALITY clause is added to the function call, an additional column of type bigint will be appended to the function’s result column(s). This column numbers the rows of the function’s result set, starting from 1. By default, this column is named ordinality.

    • An alias can be provided in the same way as for a table. If an alias is written, a column alias list can also be written to provide substitute names for one or more attributes of the function’s composite return type, including the ordinality column if present.

    • Multiple function calls can be combined into a single FROM-clause item by surrounding them with ROWS FROM( …​ ). The output of such an item is the concatenation of the first row from each function, then the second row from each function, etc. If some of the functions produce fewer rows than others, null values are substituted for the missing data, so that the total number of rows returned is always the same as for the function that produced the most rows.

    • If the function has been defined as returning the record data type, then an alias or the key word AS must be present, followed by a column definition list in the form ( _column_name data_type [, …​ ])_. The column definition list must match the actual number and types of columns returned by the function.

    • When using the ROWS FROM( …​ ) syntax, if one of the functions requires a column definition list, it’s preferred to put the column definition list after the function call inside ROWS FROM( …​ ). A column definition list can be placed after the ROWS FROM( …​ ) construct only if there’s just a single function and no WITH ORDINALITY clause.

    • To use ORDINALITY together with a column definition list, you must use the ROWS FROM( …​ ) syntax and put the column definition list inside ROWS FROM( …​ ).

  • join_type

    • One of

    • For the INNER and OUTER join types, a join condition must be specified, namely exactly one of ON _join_condition_, USING (_join_column [, …​]), or _NATURAL. See below for the meaning.

    • A JOIN clause combines two FROM items, which for convenience we will refer to as “tables”, though in reality they can be any type of FROM item. Use parentheses if necessary to determine the order of nesting. In the absence of parentheses, JOIN_s nest left-to-right. In any case _JOIN binds more tightly than the commas separating FROM-list items. All the JOIN options are just a notational convenience, since they do nothing you couldn’t do with plain FROM and WHERE.

    • LEFT OUTER JOIN returns all rows in the qualified Cartesian product (i.e., all combined rows that pass its join condition), plus one copy of each row in the left-hand table for which there was no right-hand row that passed the join condition. This left-hand row is extended to the full width of the joined table by inserting null values for the right-hand columns. Note that only the JOIN clause’s own condition is considered while deciding which rows have matches. Outer conditions are applied afterwards.

    • Conversely, RIGHT OUTER JOIN returns all the joined rows, plus one row for each unmatched right-hand row (extended with nulls on the left). This is just a notational convenience, since you could convert it to a LEFT OUTER JOIN by switching the left and right tables.

    • FULL OUTER JOIN returns all the joined rows, plus one row for each unmatched left-hand row (extended with nulls on the right), plus one row for each unmatched right-hand row (extended with nulls on the left).

  • ON _join_condition_

    • join_condition is an expression resulting in a value of type boolean (similar to a WHERE clause) that specifies which rows in a join are considered to match.

  • USING ( _join_column [, …​] ) [ AS join_using_alias ]_

    • A clause of the form USING ( a, b, …​ ) is shorthand for ON left_table.a = right_table.a AND left_table.b = right_table.b …​. Also, USING implies that only one of each pair of equivalent columns will be included in the join output, not both.

    • If a join_using_alias name is specified, it provides a table alias for the join columns. Only the join columns listed in the USING clause are addressable by this name. Unlike a regular alias, this does not hide the names of the joined tables from the rest of the query. Also unlike a regular alias, you cannot write a column alias list — the output names of the join columns are the same as they appear in the USING list.

  • NATURAL

    • NATURAL is shorthand for a USING list that mentions all columns in the two tables that have matching names. If there are no common column names, NATURAL is equivalent to ON TRUE.

  • CROSS JOIN

    • CROSS JOIN is equivalent to INNER JOIN ON (TRUE), that is, no rows are removed by qualification. They produce a simple Cartesian product, the same result as you get from listing the two tables at the top level of FROM, but restricted by the join condition (if any).

  • LATERAL

    • The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)

    • LATERAL can also precede a function-call FROM item, but in this case it is a noise word, because the function expression can refer to earlier FROM items in any case.

    • A LATERAL item can appear at top level in the FROM list, or within a JOIN tree. In the latter case it can also refer to any items that are on the left-hand side of a JOIN that it is on the right-hand side of.

    • When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the LATERAL item is evaluated using that row or row set’s values of the columns. The resulting row(s) are joined as usual with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s).

    • The column source table(s) must be INNER or LEFT joined to the LATERAL item, else there would not be a well-defined set of rows from which to compute each set of rows for the LATERAL item. Thus, although a construct such as X_ RIGHT JOIN LATERAL Y is syntactically valid, it is not actually allowed for _Y to reference X.

WHERE Clause

可选的 WHERE 子句具有以下通用形式:

The optional WHERE clause has the general form

WHERE condition

其中 condition 是任何评估为 boolean 类型结果的表达式。任何不满足此条件的行都将从输出中消除。当用实际行值替换所有变量引用时,如果行返回 true,则表示该行满足条件。

where condition is any expression that evaluates to a result of type boolean. Any row that does not satisfy this condition will be eliminated from the output. A row satisfies the condition if it returns true when the actual row values are substituted for any variable references.

GROUP BY Clause

可选 GROUP BY 子句的常规格式如下:

The optional GROUP BY clause has the general form

GROUP BY [ ALL | DISTINCT ] grouping_element [, ...]

GROUP BY 将按组表达式共用相同值的全部选定行浓缩到单行中。 expression 中使用的 grouping_element 可以是输入列名,或者输出列的名称或序号( SELECT 列表项),也可以是根据输入列值生成任意表达式。在歧义的情况下, GROUP BY 名称会被解释为输入列名,而不是输出列名。

GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. An expression used inside a grouping_element can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an input-column name rather than an output column name.

如果 GROUPING SETSROLLUPCUBE 中的任意一个以组元素形式出现,则 GROUP BY 子句作为整体定义了部分独立 grouping sets 。其效果相当于在以各组集为 GROUP BY 子句的子查询之间构建一个 UNION ALL 。可选 DISTINCT 子句会在处理之前删除重复集;不会将其 not 转换 UNION ALLUNION DISTINCT 。有关组集处理的进一步详细信息,请参阅 Section 7.2.4

If any of GROUPING SETS, ROLLUP or CUBE are present as grouping elements, then the GROUP BY clause as a whole defines some number of independent grouping sets. The effect of this is equivalent to constructing a UNION ALL between subqueries with the individual grouping sets as their GROUP BY clauses. The optional DISTINCT clause removes duplicate sets before processing; it does not transform the UNION ALL into a UNION DISTINCT. For further details on the handling of grouping sets see Section 7.2.4.

如果有任何聚合函数,则会针对组成每个组的全部行计算该函数,为每个组生成单独值。(如果聚合函数可用但没有 GROUP BY 子句,则该查询被视为具有包含全部选定行的单个组。)可以附加 FILTER 子句到聚合函数调用处来进一步过滤提供给每个聚合函数的行集;有关详细信息,请参阅 Section 4.2.7 。当 FILTER 子句出现时,只有与之匹配的行才会包含在那个聚合函数的输入中。

Aggregate functions, if any are used, are computed across all rows making up each group, producing a separate value for each group. (If there are aggregate functions but no GROUP BY clause, the query is treated as having a single group comprising all the selected rows.) The set of rows fed to each aggregate function can be further filtered by attaching a FILTER clause to the aggregate function call; see Section 4.2.7 for more information. When a FILTER clause is present, only those rows matching it are included in the input to that aggregate function.

GROUP BY 出现时,或者任何聚合函数出现时, SELECT 列表表达式引用未分组的列都是无效的(聚合函数中或未分组的列在功能上依赖于分组的列时除外),因为这样未分组的列将会返回更多可能的单个值。如果分组的列(或其中的一部分)是包含未分组的列的表的表主键,则存在功能依赖性。

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

请记住,在评估 HAVING 子句或 SELECT 列表中的任何“标量”表达式之前评估所有聚合函数。这意味着 CASE 表达式不可用于跳过聚合函数的评估,请参阅 Section 4.2.14

Keep in mind that all aggregate functions are evaluated before evaluating any “scalar” expressions in the HAVING clause or SELECT list. This means that, for example, a CASE expression cannot be used to skip evaluation of an aggregate function; see Section 4.2.14.

目前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 不能用 GROUP BY 指定。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified with GROUP BY.

HAVING Clause

可选 HAVING 子句的常规格式如下:

The optional HAVING clause has the general form

HAVING condition

其中 condition 与为 WHERE 从句指定的内容相同。

where condition is the same as specified for the WHERE clause.

HAVING 消除不满足条件的分组行。 HAVING 不同于 WHERE : WHERE 在应用 GROUP BY 前过滤各个行,而 HAVING 过滤由 GROUP BY 创建的分组行。 condition 中引用的每列都必须明确地引用分组列,除非引用出现在聚集函数内或未分组的列在功能上依赖于分组列。

HAVING eliminates group rows that do not satisfy the condition. HAVING is different from WHERE: WHERE filters individual rows before the application of GROUP BY, while HAVING filters group rows created by GROUP BY. Each column referenced in condition must unambiguously reference a grouping column, unless the reference appears within an aggregate function or the ungrouped column is functionally dependent on the grouping columns.

即使不存在 GROUP BY 从句, HAVING 的存在也会将查询转换为分组查询。这与查询包含聚合函数而没有 GROUP BY 从句时所发生的相同。所有选定的行都被视为形成单个组, SELECT 列表和 HAVING 从句只能引用聚合函数内的表列。如果 HAVING 条件为真,此类查询将发出单行;如果为假,将发出零行。

The presence of HAVING turns a query into a grouped query even if there is no GROUP BY clause. This is the same as what happens when the query contains aggregate functions but no GROUP BY clause. All the selected rows are considered to form a single group, and the SELECT list and HAVING clause can only reference table columns from within aggregate functions. Such a query will emit a single row if the HAVING condition is true, zero rows if it is not true.

当前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 无法与 HAVING 一起指定。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified with HAVING.

WINDOW Clause

可选 WINDOW 从句具有以下通用形式:

The optional WINDOW clause has the general form

WINDOW window_name AS ( window_definition ) [, ...]

其中 window_name 是可从 OVER 从句或后续窗口定义引用的名称, window_definition

where window_name is a name that can be referenced from OVER clauses or subsequent window definitions, and window_definition is

[ existing_window_name ]
[ PARTITION BY expression [, ...] ]
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ frame_clause ]

如果指定了 existing_window_name ,它必须引用 WINDOW 列表中的较早条目;新窗口从该条目复制其分区从句,如果存在,则也复制其排序从句。在这种情况下,新窗口无法指定自己的 PARTITION BY 从句,并且只能在复制的窗口没有 ORDER BY 从句时指定 ORDER BY 。新窗口始终使用自己的框架从句;复制的窗口不得指定框架从句。

If an existing_window_name is specified it must refer to an earlier entry in the WINDOW list; the new window copies its partitioning clause from that entry, as well as its ordering clause if any. In this case the new window cannot specify its own PARTITION BY clause, and it can specify ORDER BY only if the copied window does not have one. The new window always uses its own frame clause; the copied window must not specify a frame clause.

PARTITION BY 列表的元素的解释方式与 GROUP BY 从句的元素的解释方式基本相同,但它们总是简单的表达式,从不作为输出列的名称或编号。另一个不同之处在于,这些表达式可以包含聚合函数调用,而 GROUP BY 从句中不允许聚合函数调用。它们在此处允许,因为窗口化发生在分组和聚合之后。

The elements of the PARTITION BY list are interpreted in much the same fashion as elements of a GROUP BY clause, except that they are always simple expressions and never the name or number of an output column. Another difference is that these expressions can contain aggregate function calls, which are not allowed in a regular GROUP BY clause. They are allowed here because windowing occurs after grouping and aggregation.

类似地, ORDER BY 列表的元素的解释方式与语句级 ORDER BY 从句的元素的解释方式基本相同,但表达式始终作为简单表达式,从不作为输出列的名称或编号来取用。

Similarly, the elements of the ORDER BY list are interpreted in much the same fashion as elements of a statement-level ORDER BY clause, except that the expressions are always taken as simple expressions and never the name or number of an output column.

可选 frame_clause 定义了依赖于框架(并非所有都依赖)的窗口函数的 window frame 。窗口框架是查询的每一行的相关行集(称为 current row )。 frame_clause 可以是以下之一:

The optional frame_clause defines the window frame for window functions that depend on the frame (not all do). The window frame is a set of related rows for each row of the query (called the current row). The frame_clause can be one of

{ RANGE | ROWS | GROUPS } frame_start [ frame_exclusion ]
{ RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end [ frame_exclusion ]

其中 frame_startframe_end 可以是以下之一

where frame_start and frame_end can be one of

UNBOUNDED PRECEDING
offset PRECEDING
CURRENT ROW
offset FOLLOWING
UNBOUNDED FOLLOWING

frame_exclusion 可以是以下之一

and frame_exclusion can be one of

EXCLUDE CURRENT ROW
EXCLUDE GROUP
EXCLUDE TIES
EXCLUDE NO OTHERS

如果省略 frame_end ,则默认为 CURRENT ROW 。限制条件是 frame_start 不能为 UNBOUNDED FOLLOWINGframe_end 不能为 UNBOUNDED PRECEDING ,并且 frame_end 选项不得出现在上述 frame_startframe_end 选项列表中 frame_start 选项之前——例如, RANGE BETWEEN CURRENT ROW AND _offset PRECEDING_ 不允许。

If frame_end is omitted it defaults to CURRENT ROW. Restrictions are that frame_start cannot be UNBOUNDED FOLLOWING, frame_end cannot be UNBOUNDED PRECEDING, and the frame_end choice cannot appear earlier in the above list of frame_start and frame_end options than the frame_start choice does — for example RANGE BETWEEN CURRENT ROW AND _offset PRECEDING_ is not allowed.

默认框架选项是 RANGE UNBOUNDED PRECEDING ,与 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW 相同;它将框架设置为从分区开始到当前行的最后一个 peer (窗口的 ORDER BY 从句认为与当前行等效的行;如果不存在 ORDER BY ,则所有行都是对等行)。一般来说, UNBOUNDED PRECEDING 表示框架从分区的第一个行开始,类似地, UNBOUNDED FOLLOWING 表示框架以分区的最后一个行结束,无论 RANGEROWS 还是 GROUPS 模式如何。在 ROWS 模式中, CURRENT ROW 表示框架以当前行开始或结束;但在 RANGEGROUPS 模式中,它表示框架以 ORDER BY 排序中当前行的第一个或最后一个对等行开始或结束。 offset PRECEDINGoffset FOLLOWING 选项的含义取决于框架模式。在 ROWS 模式中, offset 是一个整数,表示框架在当前行之前或之后开始或结束这么多个行。在 GROUPS 模式中, offset 是一个整数,表示框架在当前行的对等组之前或之后开始或结束这么多个对等组,其中 peer group 是一组根据窗口的 ORDER BY 从句等价的行。在 RANGE 模式中,使用 offset 选项要求窗口定义中只存在一个 ORDER BY 列。然后,框架包含那些排序列值不比当前行的排序列值少 offset (对于 PRECEDING )或多 FOLLOWING (对于 FOLLOWING )的行。在这些情况下, offset 表达式的数据类型取决于排序列的数据类型。对于数字排序列,它通常与排序列的类型相同,但对于 datetime 排序列,它是一个 interval 。在所有这些情况下, offset 的值都必须是非 null 值和非负值。此外,虽然 offset 不必是一个简单常量,但它不能包含变量、聚合函数或窗口函数。

The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; it sets the frame to be all rows from the partition start up through the current row’s last peer (a row that the window’s ORDER BY clause considers equivalent to the current row; all rows are peers if there is no ORDER BY). In general, UNBOUNDED PRECEDING means that the frame starts with the first row of the partition, and similarly UNBOUNDED FOLLOWING means that the frame ends with the last row of the partition, regardless of RANGE, ROWS or GROUPS mode. In ROWS mode, CURRENT ROW means that the frame starts or ends with the current row; but in RANGE or GROUPS mode it means that the frame starts or ends with the current row’s first or last peer in the ORDER BY ordering. The offset PRECEDING and offset FOLLOWING options vary in meaning depending on the frame mode. In ROWS mode, the offset is an integer indicating that the frame starts or ends that many rows before or after the current row. In GROUPS mode, the offset is an integer indicating that the frame starts or ends that many peer groups before or after the current row’s peer group, where a peer group is a group of rows that are equivalent according to the window’s ORDER BY clause. In RANGE mode, use of an offset option requires that there be exactly one ORDER BY column in the window definition. Then the frame contains those rows whose ordering column value is no more than offset less than (for PRECEDING) or more than (for FOLLOWING) the current row’s ordering column value. In these cases the data type of the offset expression depends on the data type of the ordering column. For numeric ordering columns it is typically of the same type as the ordering column, but for datetime ordering columns it is an interval. In all these cases, the value of the offset must be non-null and non-negative. Also, while the offset does not have to be a simple constant, it cannot contain variables, aggregate functions, or window functions.

frame_exclusion 选项允许排除帧周围的当前行,即使它们会根据帧开始和帧结束选项包括在内。EXCLUDE CURRENT ROW 从帧中排除当前行。EXCLUDE GROUP 从帧中排除当前行及其排序对等行。EXCLUDE TIES 从帧中排除当前行的任何对等行,但不是当前行本身。EXCLUDE NO OTHERS 只是明确指定默认行为,即不排除当前行及其对等行。

The frame_exclusion option allows rows around the current row to be excluded from the frame, even if they would be included according to the frame start and frame end options. EXCLUDE CURRENT ROW excludes the current row from the frame. EXCLUDE GROUP excludes the current row and its ordering peers from the frame. EXCLUDE TIES excludes any peers of the current row from the frame, but not the current row itself. EXCLUDE NO OTHERS simply specifies explicitly the default behavior of not excluding the current row or its peers.

请注意,如果 ORDER BY 排序没有唯一地对行排序, ROWS 模式可能会产生不可预测的结果。 RANGEGROUPS 模式旨在确保 ORDER BY 排序中对等的行被同等对待:给定对等组的所有行都将处于框架中或从中被排除。

Beware that the ROWS mode can produce unpredictable results if the ORDER BY ordering does not order the rows uniquely. The RANGE and GROUPS modes are designed to ensure that rows that are peers in the ORDER BY ordering are treated alike: all rows of a given peer group will be in the frame or excluded from it.

WINDOW 从句的目的是指定 window functions 在查询的 SELECT listORDER BY 从句中出现的行为。这些函数可以在其 OVER 从句中按名称引用 WINDOW 从句条目。但是,不必在任何地方引用 WINDOW 从句条目;如果它不在查询中使用,则会被忽略。可以不使用任何 WINDOW 从句使用窗口函数,因为窗口函数调用可以在其 OVER 从句中直接指定其窗口定义。但是, WINDOW 从句在多个窗口函数需要相同的窗口定义时可以节省输入。

The purpose of a WINDOW clause is to specify the behavior of window functions appearing in the query’s SELECT list or ORDER BY clause. These functions can reference the WINDOW clause entries by name in their OVER clauses. A WINDOW clause entry does not have to be referenced anywhere, however; if it is not used in the query it is simply ignored. It is possible to use window functions without any WINDOW clause at all, since a window function call can specify its window definition directly in its OVER clause. However, the WINDOW clause saves typing when the same window definition is needed for more than one window function.

当前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 无法与 WINDOW 一起指定。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified with WINDOW.

窗口函数在 Section 3.5Section 4.2.8Section 7.2.5 中进行了详细描述。

Window functions are described in detail in Section 3.5, Section 4.2.8, and Section 7.2.5.

SELECT List

SELECT 列表(介于关键字 SELECTFROM 之间)指定形成 SELECT 语句的输出行的表达式。表达式可以引用(且通常引用)在 FROM 从句中计算的列。

The SELECT list (between the key words SELECT and FROM) specifies expressions that form the output rows of the SELECT statement. The expressions can (and usually do) refer to columns computed in the FROM clause.

就像在一个表中一样, SELECT 的每个输出列都有一个名称。在简单的 SELECT 中,该名称仅用于标记显示的列,但是当 SELECT 是更大查询的子查询时,更大查询会将该名称视为由子查询产生的虚拟表的列名称。要指定用于输出列的名称,请填写列表达式后编写 AS output_name 。(您可省略 AS ,但仅当所需的输出名称与任何 PostgreSQL 关键字不匹配时才可省略(请参阅 Appendix C )。为了防止将来可能添加关键字,建议您始终编写 AS 或双引号引住输出名称。)如果您没有指定列名称,则 PostgreSQL 将自动选择一个名称。如果列的表达式是简单的列引用,则所选名称与该列的名称相同。在更复杂的情况下,可以使用函数或类型名称,或者系统可能退回到 ?column? 这样的生成名称。

Just as in a table, every output column of a SELECT has a name. In a simple SELECT this name is just used to label the column for display, but when the SELECT is a sub-query of a larger query, the name is seen by the larger query as the column name of the virtual table produced by the sub-query. To specify the name to use for an output column, write AS output_name after the column’s expression. (You can omit AS, but only if the desired output name does not match any PostgreSQL keyword (see Appendix C). For protection against possible future keyword additions, it is recommended that you always either write AS or double-quote the output name.) If you do not specify a column name, a name is chosen automatically by PostgreSQL. If the column’s expression is a simple column reference then the chosen name is the same as that column’s name. In more complex cases a function or type name may be used, or the system may fall back on a generated name such as ?column?.

输出列的名称可用于引用 ORDER BYGROUP BY 从句中列的值,但不能用于 WHEREHAVING 从句,在 WHEREHAVING 从句中您必须编写表达式。

An output column’s name can be used to refer to the column’s value in ORDER BY and GROUP BY clauses, but not in the WHERE or HAVING clauses; there you must write out the expression instead.

替换 * 作为 table_name_ 中所有行的列的简写,而不使用表达式。此外,可以将 table_name_ .*_ 作为来至于该表的列的简写。在这些情况下,不可能用 AS 指定新名称;输出列名称将和表列名称相同。

Instead of an expression, * can be written in the output list as a shorthand for all the columns of the selected rows. Also, you can write _table_name.*_ as a shorthand for the columns coming from just that table. In these cases it is not possible to specify new names with AS; the output column names will be the same as the table columns' names.

根据 SQL 标准,在应用 DISTINCTORDER BYLIMIT 之前应计算输出列表中的表达式。当使用 DISTINCT 时,这一点非常有必要,否则无法明确区别哪些值被视为不同。然而,在很多情况下, ORDER BYLIMIT 之后计算输出表达式会很方便;尤其输出列表包含任何不稳定的或代价高昂的函数时。通过这种行为,函数评估的顺序将更加直观,并且不会有与未在输出中出现的行相对应的评估。只要这些表达式未在 DISTINCTORDER BYGROUP BY 中引用,PostgreSQL 就会在排序和限制后有效地评估输出表达式。(作为反例, SELECT f(x) FROM tab ORDER BY 1 显然必须在排序之前评估 f(x) 。)包含集返回函数的输出表达式会在排序之后和限制之前进行有效评估,从而 LIMIT 将有效地切断来自集返回函数的输出。

According to the SQL standard, the expressions in the output list should be computed before applying DISTINCT, ORDER BY, or LIMIT. This is obviously necessary when using DISTINCT, since otherwise it’s not clear what values are being made distinct. However, in many cases it is convenient if output expressions are computed after ORDER BY and LIMIT; particularly if the output list contains any volatile or expensive functions. With that behavior, the order of function evaluations is more intuitive and there will not be evaluations corresponding to rows that never appear in the output. PostgreSQL will effectively evaluate output expressions after sorting and limiting, so long as those expressions are not referenced in DISTINCT, ORDER BY or GROUP BY. (As a counterexample, SELECT f(x) FROM tab ORDER BY 1 clearly must evaluate f(x) before sorting.) Output expressions that contain set-returning functions are effectively evaluated after sorting and before limiting, so that LIMIT will act to cut off the output from a set-returning function.

Note

在 9.6 之前的 PostgreSQL 版本中,不保证输出表达式评估的时间相对排序和限制的时机;它取决于所选查询计划的形式。

PostgreSQL versions before 9.6 did not provide any guarantees about the timing of evaluation of output expressions versus sorting and limiting; it depended on the form of the chosen query plan.

DISTINCT Clause

如果指定 SELECT DISTINCT ,则将从结果集中删除所有重复行(从每组重复的行中保留一行)。 SELECT ALL 指定相反的操作:保留全部行;这是默认设置。

If SELECT DISTINCT is specified, all duplicate rows are removed from the result set (one row is kept from each group of duplicates). SELECT ALL specifies the opposite: all rows are kept; that is the default.

SELECT DISTINCT ON ( _expression [, …​] )_ 仅保留给定表达式评估结果相等的一组行中的第一行。 DISTINCT ON 表达式的解释遵循与 ORDER BY 相同的规则(请参阅上文)。请注意,除非使用 ORDER BY 确保所需的行最先出现,否则无法预测每组中的“第一行”。例如:

SELECT DISTINCT ON ( _expression [, …​] )_ keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example:

SELECT DISTINCT ON (location) location, time, report
    FROM weather_reports
    ORDER BY location, time DESC;

获取每个位置的最新天气报告。但如果没有使用 ORDER BY 来强制按时间值降序排列每个位置的时间,我们便会获取每个位置的不可预测时间报告。

retrieves the most recent weather report for each location. But if we had not used ORDER BY to force descending order of time values for each location, we’d have gotten a report from an unpredictable time for each location.

DISTINCT ON 表达式必须匹配最左边的 ORDER BY 表达式。 ORDER BY 子句通常包含其他表达式,这些表达式决定每个 DISTINCT ON 组中行的所需优先级。

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.

当前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 无法与 DISTINCT 一起指定。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified with DISTINCT.

UNION Clause

UNION 子句具有以下一般格式:

The UNION clause has this general form:

select_statement UNION [ ALL | DISTINCT ] select_statement

select_statement 是没有 ORDER BYLIMITFOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 子句的任何 SELECT 语句。(如果用圆括号括起来,则 ORDER BYLIMIT 能够附加到子表达式上。如果没有圆括号,则这些子句被认为适用于 UNION 的结果,而不是对其右侧的输入表达式。)

select_statement is any SELECT statement without an ORDER BY, LIMIT, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, or FOR KEY SHARE clause. (ORDER BY and LIMIT can be attached to a subexpression if it is enclosed in parentheses. Without parentheses, these clauses will be taken to apply to the result of the UNION, not to its right-hand input expression.)

UNION 运算符计算涉及 SELECT 语句返回的行集并集。如果一行出现在至少一个结果集中,则它出现在两组结果的并集中。表示 UNION 直接操作数的两个 SELECT 必须得生成相同数量的列,且相应列必须是兼容的数据类型。

The UNION operator computes the set union of the rows returned by the involved SELECT statements. A row is in the set union of two result sets if it appears in at least one of the result sets. The two SELECT statements that represent the direct operands of the UNION must produce the same number of columns, and corresponding columns must be of compatible data types.

除非指定 ALL 选项,否则 UNION 的结果中不包含任何重复行。 ALL 阻止消除重复项。(因此, UNION ALL 通常比 UNION 明显更快;若可以,请使用 ALL 。)可以编写 DISTINCT 来明确指定消除重复行的默认行为。

The result of UNION does not contain any duplicate rows unless the ALL option is specified. ALL prevents elimination of duplicates. (Therefore, UNION ALL is usually significantly quicker than UNION; use ALL when you can.) DISTINCT can be written to explicitly specify the default behavior of eliminating duplicate rows.

同一 SELECT 语句中的多个 UNION 运算符将从左向右评估,除非用圆括号另行指定。

Multiple UNION operators in the same SELECT statement are evaluated left to right, unless otherwise indicated by parentheses.

当前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 既无法为 UNION 结果指定,也无法为 UNION 的任何输入指定。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified either for a UNION result or for any input of a UNION.

INTERSECT Clause

INTERSECT 子句具有以下一般格式:

The INTERSECT clause has this general form:

select_statement INTERSECT [ ALL | DISTINCT ] select_statement

select_statement 是没有 ORDER BYLIMITFOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 子句的任何 SELECT 语句。

select_statement is any SELECT statement without an ORDER BY, LIMIT, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, or FOR KEY SHARE clause.

INTERSECT 运算符计算涉及 SELECT 语句返回的行集交集。如果一行出现在两个结果集中,则它出现在两个结果集的交集中。

The INTERSECT operator computes the set intersection of the rows returned by the involved SELECT statements. A row is in the intersection of two result sets if it appears in both result sets.

除非指定 ALL 选项,否则 INTERSECT 的结果中不包含任何重复行。使用 ALL 时,如果一行在左表中有 m 个重复项,在右表中含有 n 个重复项,则它将出现在结果集中的 min( mn ) 次。可以编写 DISTINCT 来明确指定消除重复行的默认行为。

The result of INTERSECT does not contain any duplicate rows unless the ALL option is specified. With ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear min(m,n) times in the result set. DISTINCT can be written to explicitly specify the default behavior of eliminating duplicate rows.

同一 SELECT 语句中的多个 INTERSECT 运算符从左到右进行计算,除非括号另有指示。 INTERSECT 绑定得比 UNION 更紧密。也就是说, A UNION B INTERSECT C 将读作 A UNION (B INTERSECT C)

Multiple INTERSECT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise. INTERSECT binds more tightly than UNION. That is, A UNION B INTERSECT C will be read as A UNION (B INTERSECT C).

目前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 均不能指定为 INTERSECT 结果或任何 INTERSECT 输入的结果。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified either for an INTERSECT result or for any input of an INTERSECT.

EXCEPT Clause

EXCEPT 子句具有此通用格式:

The EXCEPT clause has this general form:

select_statement EXCEPT [ ALL | DISTINCT ] select_statement

select_statement 是没有 ORDER BYLIMITFOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 子句的任何 SELECT 语句。

select_statement is any SELECT statement without an ORDER BY, LIMIT, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, or FOR KEY SHARE clause.

EXCEPT 运算符计算一组行,这些行在左 SELECT 语句的结果中,但在右语句的结果中不存在。

The EXCEPT operator computes the set of rows that are in the result of the left SELECT statement but not in the result of the right one.

除非指定了 ALL 选项,否则 EXCEPT 的结果中不会包含任何重复行。使用 ALL 时,如果某一行在左表中有 m 个重复项,在右表中有 n 个重复项,则该行将在结果集中出现 max( m - n ,0) 次。可以编写 DISTINCT 以明确指定消除重复行的默认行为。

The result of EXCEPT does not contain any duplicate rows unless the ALL option is specified. With ALL, a row that has m duplicates in the left table and n duplicates in the right table will appear max(m-n,0) times in the result set. DISTINCT can be written to explicitly specify the default behavior of eliminating duplicate rows.

同一 SELECT 语句中的多个 EXCEPT 运算符从左向右求值,除非括号另有规定。 EXCEPTUNION 在同一级别结合。

Multiple EXCEPT operators in the same SELECT statement are evaluated left to right, unless parentheses dictate otherwise. EXCEPT binds at the same level as UNION.

目前, FOR NO KEY UPDATEFOR UPDATEFOR SHAREFOR KEY SHARE 均不能指定为 EXCEPT 结果或任何 EXCEPT 输入的结果。

Currently, FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE and FOR KEY SHARE cannot be specified either for an EXCEPT result or for any input of an EXCEPT.

ORDER BY Clause

可选的 ORDER BY 子句具有此通用格式:

The optional ORDER BY clause has this general form:

ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...]

ORDER BY 子句会按照指定的表达式对结果行进行排序。如果根据最左边的表达式两行相等,那么它们将根据下一表达式进行比较,依此类推。如果它们根据所有指定的表达式相等,那么它们将按照实现依赖顺序返回。

The ORDER BY clause causes the result rows to be sorted according to the specified expression(s). If two rows are equal according to the leftmost expression, they are compared according to the next expression and so on. If they are equal according to all specified expressions, they are returned in an implementation-dependent order.

每个 expression 可能是输出列的名称或序数( SELECT 列表项),或者是根据输入列值形成的任意表达式。

Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values.

序数是指输出列的序数(从左到右)位置。此功能使得可以根据没有唯一名称的列定义排序规则。这绝对不是必需的,因为始终可以使用 AS 子句为输出列分配一个名称。

The ordinal number refers to the ordinal (left-to-right) position of the output column. This feature makes it possible to define an ordering on the basis of a column that does not have a unique name. This is never absolutely necessary because it is always possible to assign a name to an output column using the AS clause.

还可以在 ORDER BY 子句中使用任意表达式,包括 SELECT 输出列表中未出现的列。因此,以下语句有效:

It is also possible to use arbitrary expressions in the ORDER BY clause, including columns that do not appear in the SELECT output list. Thus the following statement is valid:

SELECT name FROM distributors ORDER BY code;

此功能的一个限制是,应用于 UNIONINTERSECTEXCEPT 子句结果的 ORDER BY 子句只能指定输出列名称或序号,而不能指定表达式。

A limitation of this feature is that an ORDER BY clause applying to the result of a UNION, INTERSECT, or EXCEPT clause can only specify an output column name or number, not an expression.

如果 ORDER BY 表达式是与输出列名称和输入列名称都匹配的简单名称, ORDER BY 将其解释为输出列名称。这与 GROUP BY 在相同情况下做出的选择相反。这种不一致性是为了兼容 SQL 标准。

If an ORDER BY expression is a simple name that matches both an output column name and an input column name, ORDER BY will interpret it as the output column name. This is the opposite of the choice that GROUP BY will make in the same situation. This inconsistency is made to be compatible with the SQL standard.

ORDER BY 子句中的任何表达式后面,可以选择添加关键字 ASC (升序)或 DESC (降序)。如果未指定,则默认假定为 ASC 。或者,可以在 USING 子句中指定特定的排序运算符名称。排序运算符必须是某些 B 树运算符族中的小于或大于成员。 ASC 通常相当于 USING <DESC 通常相当于 USING > 。(但是用户定义数据类型的创建者可以确切地定义默认排序顺序是什么,并且它可能对应于带其他名称的运算符。)

Optionally one can add the key word ASC (ascending) or DESC (descending) after any expression in the ORDER BY clause. If not specified, ASC is assumed by default. Alternatively, a specific ordering operator name can be specified in the USING clause. An ordering operator must be a less-than or greater-than member of some B-tree operator family. ASC is usually equivalent to USING < and DESC is usually equivalent to USING >. (But the creator of a user-defined data type can define exactly what the default sort ordering is, and it might correspond to operators with other names.)

如果指定了 NULLS LAST ,则 null 值将排在所有非 null 值之后;如果指定了 NULLS FIRST ,则 null 值将排在所有非 null 值之前。如果均未指定,则默认行为将在指定或暗示 ASC 时为 NULLS LAST ,在指定 DESC 时为 NULLS FIRST (因此,默认行为为将 null 值作为大于非 null 值)。当指定了 USING 时,默认 null 值排序取决于运算符是小于运算符还是大于运算符。

If NULLS LAST is specified, null values sort after all non-null values; if NULLS FIRST is specified, null values sort before all non-null values. If neither is specified, the default behavior is NULLS LAST when ASC is specified or implied, and NULLS FIRST when DESC is specified (thus, the default is to act as though nulls are larger than non-nulls). When USING is specified, the default nulls ordering depends on whether the operator is a less-than or greater-than operator.

请注意,排序选项仅适用于它们后面的表达式;例如, ORDER BY x, y DESCORDER BY x DESC, y DESC 的含义不同。

Note that ordering options apply only to the expression they follow; for example ORDER BY x, y DESC does not mean the same thing as ORDER BY x DESC, y DESC.

字符字符串数据将按照应用于要排序的列的排序规则进行排序。如有需要,可以通过在 expression 中包含 COLLATE 子句来覆盖该排序规则,例如 ORDER BY mycolumn COLLATE "en_US" 。有关详细信息,请参见 Section 4.2.10Section 24.2

Character-string data is sorted according to the collation that applies to the column being sorted. That can be overridden at need by including a COLLATE clause in the expression, for example ORDER BY mycolumn COLLATE "en_US". For more information see Section 4.2.10 and Section 24.2.

LIMIT Clause

LIMIT 子句包含两个独立的子句:

The LIMIT clause consists of two independent sub-clauses:

LIMIT { count | ALL }
OFFSET start

参数 count 指定要返回的最大行数,而 start 指定在开始返回行之前要跳过的行数。在指定这两个参数时,将在开始计算要返回的 count 行之前跳过 start 行。

The parameter count specifies the maximum number of rows to return, while start specifies the number of rows to skip before starting to return rows. When both are specified, start rows are skipped before starting to count the count rows to be returned.

如果 count 表达式计算为 NULL,则将其视为 LIMIT ALL ,即没有限制。如果 start 计算为 NULL,则将其视为 OFFSET 0

If the count expression evaluates to NULL, it is treated as LIMIT ALL, i.e., no limit. If start evaluates to NULL, it is treated the same as OFFSET 0.

SQL:2008 引入了不同的语法来实现相同的结果,PostgreSQL 也支持该语法。语法为:

SQL:2008 introduced a different syntax to achieve the same result, which PostgreSQL also supports. It is:

OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }

在此语法中,标准要求 startcount 值为字面量常量、参数或变量名;作为 PostgreSQL 扩展,允许使用其他表达式,但通常需要用括号括起来以避免歧义。如果 countFETCH 子句中被忽略,则默认为 1。 WITH TIES 选项用于返回根据 ORDER BY 子句与结果集中最后一行相匹配的任何其他行;在这种情况下, ORDER BY 是强制性的,而 SKIP LOCKED 是不允许的。 ROWROWS 以及 FIRSTNEXT 是干扰词,不会影响这些子句的效果。根据标准,如果 OFFSET 子句和 FETCH 子句同时存在,则 OFFSET 子句必须在 FETCH 子句之前;但 PostgreSQL 比较宽松,允许任何顺序。

In this syntax, the start or count value is required by the standard to be a literal constant, a parameter, or a variable name; as a PostgreSQL extension, other expressions are allowed, but will generally need to be enclosed in parentheses to avoid ambiguity. If count is omitted in a FETCH clause, it defaults to 1. The WITH TIES option is used to return any additional rows that tie for the last place in the result set according to the ORDER BY clause; ORDER BY is mandatory in this case, and SKIP LOCKED is not allowed. ROW and ROWS as well as FIRST and NEXT are noise words that don’t influence the effects of these clauses. According to the standard, the OFFSET clause must come before the FETCH clause if both are present; but PostgreSQL is laxer and allows either order.

使用 LIMIT 时,最好使用 ORDER BY 子句将结果行限制为唯一的顺序。否则,你将得到查询行的一个不可预测的子集——你可能要求从第 10 行到第 20 行,但从第 10 行到第 20 行的顺序是什么?如果不指定 ORDER BY ,你不知道顺序是什么。

When using LIMIT, it is a good idea to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query’s rows — you might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? You don’t know what ordering unless you specify ORDER BY.

查询计划器在生成查询计划时会考虑 LIMIT ,因此根据你对 LIMITOFFSET 的使用,你很可能会获得不同的计划(产生不同的行顺序)。因此,使用不同的 LIMIT / OFFSET 值来选择查询结果的不同子集 will give inconsistent results ,除非你使用 ORDER BY 强制执行可预测的结果排序。这不是一个错误。这是 SQL 不承诺以任何特定顺序提供查询结果这一事实所固有的后果,除非使用 ORDER BY 来限制顺序。

The query planner takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you use for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.

如果不存在 ORDER BY 来强制选择确定性的子集,则甚至可能对同一 LIMIT 查询的重复执行返回表的不同行子集。同样,这不是一个错误;在这种情况下,结果的确定性根本无法得到保证。

It is even possible for repeated executions of the same LIMIT query to return different subsets of the rows of a table, if there is not an ORDER BY to enforce selection of a deterministic subset. Again, this is not a bug; determinism of the results is simply not guaranteed in such a case.

The Locking Clause

FOR UPDATEFOR NO KEY UPDATEFOR SHAREFOR KEY SHARElocking clauses ;它们影响 SELECT 在从表中获取行时如何锁定行。

FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE and FOR KEY SHARE are locking clauses; they affect how SELECT locks rows as they are obtained from the table.

锁定子句的通用形式为:

The locking clause has the general form

FOR lock_strength [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ]

其中 lock_strength 可以是以下选项之一:

where lock_strength can be one of

UPDATE
NO KEY UPDATE
SHARE
KEY SHARE

有关每个行级锁定模式的详细信息,请参阅 Section 13.3.2

For more information on each row-level lock mode, refer to Section 13.3.2.

若要防止操作等待其他事务提交,请使用 NOWAITSKIP LOCKED 选项。使用 NOWAIT 时,如果无法立即锁定所选行,则语句会报告错误,而不是等待。使用 SKIP LOCKED 时,将跳过无法立即锁定的任何所选行。跳过已锁定的行会提供不一致的数据视图,因此这不适用于一般用途的工作,但可以用来避免多个使用者访问类似队列的表时发生锁定争用。请注意, NOWAITSKIP LOCKED 仅适用于行级锁定——仍然会以普通方式取得所需的 ROW SHARE 表级锁定(参见 Chapter 13 )。如果需要在不等待的情况下获取表级锁定,则可以将 LOCKNOWAIT 选项一起使用。

To prevent the operation from waiting for other transactions to commit, use either the NOWAIT or SKIP LOCKED option. With NOWAIT, the statement reports an error, rather than waiting, if a selected row cannot be locked immediately. With SKIP LOCKED, any selected rows that cannot be immediately locked are skipped. Skipping locked rows provides an inconsistent view of the data, so this is not suitable for general purpose work, but can be used to avoid lock contention with multiple consumers accessing a queue-like table. Note that NOWAIT and SKIP LOCKED apply only to the row-level lock(s) — the required ROW SHARE table-level lock is still taken in the ordinary way (see Chapter 13). You can use LOCK with the NOWAIT option first, if you need to acquire the table-level lock without waiting.

如果在锁定子句中命名了特定表,则只有来自这些表的行被锁定;在 SELECT 中使用的任何其他表都只是像往常一样被读取。不含表列表的锁定子句会影响语句中使用的所有表。如果锁定子句应用于视图或子查询,则它会影响视图或子查询中使用的所有表。但是,这些子句不适用于主查询引用的 WITH 查询。如果你希望在 WITH 查询中发生行锁定,请在 WITH 查询中指定一个锁定子句。

If specific tables are named in a locking clause, then only rows coming from those tables are locked; any other tables used in the SELECT are simply read as usual. A locking clause without a table list affects all tables used in the statement. If a locking clause is applied to a view or sub-query, it affects all tables used in the view or sub-query. However, these clauses do not apply to WITH queries referenced by the primary query. If you want row locking to occur within a WITH query, specify a locking clause within the WITH query.

如果需要为不同的表指定不同的锁定行为,可以编写多个锁定子句。如果同一张表被多个锁定子句提及(或隐式影响),则会将其视为仅由最强的一个锁定子句指定来处理。类似地,如果在影响某张表的任何子句中指定为 NOWAIT ,则该表会被视为 NOWAIT 。否则,如果将其视为在影响它的任何子句中指定,则将其视为 SKIP LOCKED

Multiple locking clauses can be written if it is necessary to specify different locking behavior for different tables. If the same table is mentioned (or implicitly affected) by more than one locking clause, then it is processed as if it was only specified by the strongest one. Similarly, a table is processed as NOWAIT if that is specified in any of the clauses affecting it. Otherwise, it is processed as SKIP LOCKED if that is specified in any of the clauses affecting it.

锁定子句不能在无法将返回的行清楚地标识为单个表行的上下文中使用;例如,它们不能与聚合一起使用。

The locking clauses cannot be used in contexts where returned rows cannot be clearly identified with individual table rows; for example they cannot be used with aggregation.

当锁定子句出现在 SELECT 查询的顶层时,被锁定的行正是由该查询返回的行;若是联接查询,则被锁定的行是形成返回联接行的行。此外,满足查询条件(按查询快照)的行将被锁定,尽管如果这些行在快照之后被更新且不再满足查询条件,则不会返回这些行。如果使用了 LIMIT ,则在返回足够的行来满足限制条件时锁定会停止(但请注意,被 OFFSET 忽略的行将被锁定)。类似地,如果在游标的查询中使用了锁定子句,只有游标实际获取或逾越的行才会被锁定。

When a locking clause appears at the top level of a SELECT query, the rows that are locked are exactly those that are returned by the query; in the case of a join query, the rows locked are those that contribute to returned join rows. In addition, rows that satisfied the query conditions as of the query snapshot will be locked, although they will not be returned if they were updated after the snapshot and no longer satisfy the query conditions. If a LIMIT is used, locking stops once enough rows have been returned to satisfy the limit (but note that rows skipped over by OFFSET will get locked). Similarly, if a locking clause is used in a cursor’s query, only rows actually fetched or stepped past by the cursor will be locked.

当锁定子句出现在子 SELECT 中时,被锁定的行是由子查询返回给外部查询的行。这可能包含比仅检查子查询本身所显示的更少行,因为外部查询中的条件可能用于优化子查询的执行。例如:

When a locking clause appears in a sub-SELECT, the rows locked are those returned to the outer query by the sub-query. This might involve fewer rows than inspection of the sub-query alone would suggest, since conditions from the outer query might be used to optimize execution of the sub-query. For example,

SELECT * FROM (SELECT * FROM mytable FOR UPDATE) ss WHERE col1 = 5;

将只锁定带有 col1 = 5 的行,即使子查询中没有这个条件。

will lock only rows having col1 = 5, even though that condition is not textually within the sub-query.

以前的版本无法保留由以后的保存点升级的锁定。例如,以下代码:

Previous releases failed to preserve a lock which is upgraded by a later savepoint. For example, this code:

BEGIN;
SELECT * FROM mytable WHERE key = 1 FOR UPDATE;
SAVEPOINT s;
UPDATE mytable SET ... WHERE key = 1;
ROLLBACK TO s;

将无法在 ROLLBACK TO 之后保留 FOR UPDATE 锁定。此问题已在 9.3 版本中修复。

would fail to preserve the FOR UPDATE lock after the ROLLBACK TO. This has been fixed in release 9.3.

Caution

READ COMMITTED 事务隔离级别下运行的 SELECT 命令使用 ORDER BY 和锁定子句返回无序行。这是因为 ORDER BY 先应用。此命令对结果排序,但随后可能阻塞试图获取一行或多行上的锁定。 SELECT 取消阻止后,某些排序列值可能已修改,从而导致这些行看似无序(尽管它们根据原始列值是有序的)。如有需要,可以通过将 FOR UPDATE/SHARE 子句放入子查询中来对此进行解决,例如

It is possible for a SELECT command running at the READ COMMITTED transaction isolation level and using ORDER BY and a locking clause to return rows out of order. This is because ORDER BY is applied first. The command sorts the result, but might then block trying to obtain a lock on one or more of the rows. Once the SELECT unblocks, some of the ordering column values might have been modified, leading to those rows appearing to be out of order (though they are in order in terms of the original column values). This can be worked around at need by placing the FOR UPDATE/SHARE clause in a sub-query, for example

SELECT * FROM (SELECT * FROM mytable FOR UPDATE) ss ORDER BY column1;

请注意,这将导致锁定 mytable 的所有行,而顶层的 FOR UPDATE 仅锁定实际返回的行。这可能会带来显着的性能差异,尤其是当 ORDER BYLIMIT 或其他限制结合使用时。因此,只有在预期对排序列进行并发更新并且需要严格排序结果时,才建议使用此技术。

Note that this will result in locking all rows of mytable, whereas FOR UPDATE at the top level would lock only the actually returned rows. This can make for a significant performance difference, particularly if the ORDER BY is combined with LIMIT or other restrictions. So this technique is recommended only if concurrent updates of the ordering columns are expected and a strictly sorted result is required.

REPEATABLE READSERIALIZABLE 事务隔离级别,这将导致序列化失败( SQLSTATE'40001' ),因此在这些隔离级别下不可能收到无序行。

At the REPEATABLE READ or SERIALIZABLE transaction isolation level this would cause a serialization failure (with an SQLSTATE of '40001'), so there is no possibility of receiving rows out of order under these isolation levels.

TABLE Command

命令

The command

TABLE name

等效于

is equivalent to

SELECT * FROM name

它可以用作顶层命令,也可以用作复杂查询部分中的节省空间的语法变体。只有 WITHUNIONINTERSECTEXCEPTORDER BYLIMITOFFSETFETCHFOR 锁定子句可以与 TABLE 一起使用。 WHERE 子句和任何形式的聚合都不能用。

It can be used as a top-level command or as a space-saving syntax variant in parts of complex queries. Only the WITH, UNION, INTERSECT, EXCEPT, ORDER BY, LIMIT, OFFSET, FETCH and FOR locking clauses can be used with TABLE; the WHERE clause and any form of aggregation cannot be used.

Examples

要连接表 films 和表 distributors

To join the table films with the table distributors:

SELECT f.title, f.did, d.name, f.date_prod, f.kind
    FROM distributors d JOIN films f USING (did);

       title       | did |     name     | date_prod  |   kind
-------------------+-----+--------------+------------+----------
 The Third Man     | 101 | British Lion | 1949-12-23 | Drama
 The African Queen | 101 | British Lion | 1951-08-11 | Romantic
 ...

要对所有电影的列 len 求和,并按 kind 分组结果:

To sum the column len of all films and group the results by kind:

SELECT kind, sum(len) AS total FROM films GROUP BY kind;

   kind   | total
----------+-------
 Action   | 07:34
 Comedy   | 02:58
 Drama    | 14:28
 Musical  | 06:42
 Romantic | 04:38

要对所有电影的列 len 求和,按 kind 分组结果,并显示小于 5 小时的组总数:

To sum the column len of all films, group the results by kind and show those group totals that are less than 5 hours:

SELECT kind, sum(len) AS total
    FROM films
    GROUP BY kind
    HAVING sum(len) < interval '5 hours';

   kind   | total
----------+-------
 Comedy   | 02:58
 Romantic | 04:38

以下两个示例是根据第二列( name )的内容对各个结果进行排序的相同方式:

The following two examples are identical ways of sorting the individual results according to the contents of the second column (name):

SELECT * FROM distributors ORDER BY name;
SELECT * FROM distributors ORDER BY 2;

 did |       name
-----+------------------
 109 | 20th Century Fox
 110 | Bavaria Atelier
 101 | British Lion
 107 | Columbia
 102 | Jean Luc Godard
 113 | Luso films
 104 | Mosfilm
 103 | Paramount
 106 | Toho
 105 | United Artists
 111 | Walt Disney
 112 | Warner Bros.
 108 | Westward

下一个示例展示了如何获取表 distributorsactors 的并集,并将其限制为每个表中以字母 W 开头的结果。只想要不同的行,因此省略了关键字 ALL

The next example shows how to obtain the union of the tables distributors and actors, restricting the results to those that begin with the letter W in each table. Only distinct rows are wanted, so the key word ALL is omitted.

distributors:               actors:
 did |     name              id |     name
-----+--------------        ----+----------------
 108 | Westward               1 | Woody Allen
 111 | Walt Disney            2 | Warren Beatty
 112 | Warner Bros.           3 | Walter Matthau
 ...                         ...

SELECT distributors.name
    FROM distributors
    WHERE distributors.name LIKE 'W%'
UNION
SELECT actors.name
    FROM actors
    WHERE actors.name LIKE 'W%';

      name
----------------
 Walt Disney
 Walter Matthau
 Warner Bros.
 Warren Beatty
 Westward
 Woody Allen

本示例展示了如何在 FROM 子句中使用函数,既可以使用列定义列表,也可以不使用列定义列表:

This example shows how to use a function in the FROM clause, both with and without a column definition list:

CREATE FUNCTION distributors(int) RETURNS SETOF distributors AS $$
    SELECT * FROM distributors WHERE did = $1;
$$ LANGUAGE SQL;

SELECT * FROM distributors(111);
 did |    name
-----+-------------
 111 | Walt Disney

CREATE FUNCTION distributors_2(int) RETURNS SETOF record AS $$
    SELECT * FROM distributors WHERE did = $1;
$$ LANGUAGE SQL;

SELECT * FROM distributors_2(111) AS (f1 int, f2 text);
 f1  |     f2
-----+-------------
 111 | Walt Disney

以下是一个带有序数列的函数示例:

Here is an example of a function with an ordinality column added:

SELECT * FROM unnest(ARRAY['a','b','c','d','e','f']) WITH ORDINALITY;
 unnest | ordinality
--------+----------
 a      |        1
 b      |        2
 c      |        3
 d      |        4
 e      |        5
 f      |        6
(6 rows)

此示例展示了如何使用简单的 WITH 子句:

This example shows how to use a simple WITH clause:

WITH t AS (
    SELECT random() as x FROM generate_series(1, 3)
  )
SELECT * FROM t
UNION ALL
SELECT * FROM t;
         x
--------------------
  0.534150459803641
  0.520092216785997
 0.0735620250925422
  0.534150459803641
  0.520092216785997
 0.0735620250925422

请注意, WITH 查询仅计算了一次,因此我们得到了两组相同的三组随机值。

Notice that the WITH query was evaluated only once, so that we got two sets of the same three random values.

此示例使用 WITH RECURSIVE 从仅显示直接下属的表中查找员工 Mary 的所有下属(直接或间接),以及他们的间接级别:

This example uses WITH RECURSIVE to find all subordinates (direct or indirect) of the employee Mary, and their level of indirectness, from a table that shows only direct subordinates:

WITH RECURSIVE employee_recursive(distance, employee_name, manager_name) AS (
    SELECT 1, employee_name, manager_name
    FROM employee
    WHERE manager_name = 'Mary'
  UNION ALL
    SELECT er.distance + 1, e.employee_name, e.manager_name
    FROM employee_recursive er, employee e
    WHERE er.employee_name = e.manager_name
  )
SELECT distance, employee_name FROM employee_recursive;

请注意递归查询的典型形式:初始条件,然后是 UNION ,然后是查询的递归部分。务必确保查询的递归部分最终不返回任何元组,否则查询将无限循环。(请参阅 Section 7.8 了解更多示例。)

Notice the typical form of recursive queries: an initial condition, followed by UNION, followed by the recursive part of the query. Be sure that the recursive part of the query will eventually return no tuples, or else the query will loop indefinitely. (See Section 7.8 for more examples.)

此示例使用 LATERALmanufacturers 表的每一行应用返回集的函数 get_product_names()

This example uses LATERAL to apply a set-returning function get_product_names() for each row of the manufacturers table:

SELECT m.name AS mname, pname
FROM manufacturers m, LATERAL get_product_names(m.id) pname;

当前没有任何产品的制造商将不会出现在结果中,因为它是一个内连接。如果我们希望在结果中包含这些制造商的名称,我们可以执行:

Manufacturers not currently having any products would not appear in the result, since it is an inner join. If we wished to include the names of such manufacturers in the result, we could do:

SELECT m.name AS mname, pname
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true;

Compatibility

当然, SELECT 语句与 SQL 标准兼容。但有一些扩展和一些缺失的功能。

Of course, the SELECT statement is compatible with the SQL standard. But there are some extensions and some missing features.

Omitted FROM Clauses

PostgreSQL 允许省略 FROM 子句。它可以用一个简单的用途来计算简单表达式的结果:

PostgreSQL allows one to omit the FROM clause. It has a straightforward use to compute the results of simple expressions:

SELECT 2+2;

 ?column?
----------
        4

除了引入一个用于执行 SELECT 的虚拟单行表,其他一些 SQL 数据库不能执行此操作。

Some other SQL databases cannot do this except by introducing a dummy one-row table from which to do the SELECT.

Empty SELECT Lists

SELECT 后的输出表达式列表可以为空,生成零列结果表。这并非 SQL 标准所允许的语法。PostgreSQL 允许它与允许零列表保持一致。但是,当使用 DISTINCT 时,不允许使用空列表。

The list of output expressions after SELECT can be empty, producing a zero-column result table. This is not valid syntax according to the SQL standard. PostgreSQL allows it to be consistent with allowing zero-column tables. However, an empty list is not allowed when DISTINCT is used.

Omitting the AS Key Word

在 SQL 标准中,当新列名为有效列名(即不与任何保留关键字相同)时,新列名前可以省略可选关键字 AS 。PostgreSQL 略有严格限制:如果新列名与任何关键字匹配(保留或不保留),则 AS 是必需的。建议的做法是使用 AS 或双引号输出列名,以防止与以后添加的关键字发生冲突。

In the SQL standard, the optional key word AS can be omitted before an output column name whenever the new column name is a valid column name (that is, not the same as any reserved keyword). PostgreSQL is slightly more restrictive: AS is required if the new column name matches any keyword at all, reserved or not. Recommended practice is to use AS or double-quote output column names, to prevent any possible conflict against future keyword additions.

FROM 项中,标准和 PostgreSQL 都允许在别名为未保留关键字时在别名前省略 AS 。但这对于输出列名来说是不切实际的,因为存在语法歧义。

In FROM items, both the standard and PostgreSQL allow AS to be omitted before an alias that is an unreserved keyword. But this is impractical for output column names, because of syntactic ambiguities.

Omitting Sub-SELECT Aliases in FROM

根据 SQL 标准, FROM 列表中的子 SELECT 必须具有别名。在 PostgreSQL 中,可以省略此别名。

According to the SQL standard, a sub-SELECT in the FROM list must have an alias. In PostgreSQL, this alias may be omitted.

ONLY and Inheritance

SQL 标准要求在写 ONLY 时在表名前加上括号,例如 SELECT * FROM ONLY (tab1), ONLY (tab2) WHERE …​ 。PostgreSQL 认为这些括号是可选的。

The SQL standard requires parentheses around the table name when writing ONLY, for example SELECT * FROM ONLY (tab1), ONLY (tab2) WHERE …​. PostgreSQL considers these parentheses to be optional.

PostgreSQL 允许写一个 * 的尾部,以便显式指定包括子表的非 ONLY 行为。该标准不允许此操作。

PostgreSQL allows a trailing * to be written to explicitly specify the non-ONLY behavior of including child tables. The standard does not allow this.

(这些要点同样适用于所有支持 ONLY 选项的 SQL 命令。)

(These points apply equally to all SQL commands supporting the ONLY option.)

TABLESAMPLE Clause Restrictions

当前只接受对普通表和物化视图的 TABLESAMPLE 子句。根据 SQL 标准,应该可以将它应用于任何 FROM 项。

The TABLESAMPLE clause is currently accepted only on regular tables and materialized views. According to the SQL standard it should be possible to apply it to any FROM item.

Function Calls in FROM

PostgreSQL 允许将函数调用直接写成 FROM 列表的成员。在 SQL 标准中,需要将此类函数调用包装在子 SELECT 中;也就是说,语法 FROM _func (…​){@s22} 被视为隐式的;这是因为该标准要求 FROM 中的 UNNEST() 项具有 LATERAL 语义。PostgreSQL 将 UNNEST() 与其他返回集的函数以同样的方式对待。

PostgreSQL allows a function call to be written directly as a member of the FROM list. In the SQL standard it would be necessary to wrap such a function call in a sub-SELECT; that is, the syntax FROM _func(…​) alias is approximately equivalent to _FROM LATERAL (SELECT _func(…​)) alias. Note that _LATERAL is considered to be implicit; this is because the standard requires LATERAL semantics for an UNNEST() item in FROM. PostgreSQL treats UNNEST() the same as other set-returning functions.

Namespace Available to GROUP BY and ORDER BY

在 SQL-92 标准中, ORDER BY 子句只能使用输出列名或数字,而 GROUP BY 子句只能使用基于输入列名的表达式。PostgreSQL 将这每一个子句扩展为还允许另一种选择(但如果存在歧义,则使用标准的解释)。PostgreSQL 还允许这两个子句指定任意表达式。请注意,表达式中出现的名称始终将被视为输入列名,而不是输出列名。

In the SQL-92 standard, an ORDER BY clause can only use output column names or numbers, while a GROUP BY clause can only use expressions based on input column names. PostgreSQL extends each of these clauses to allow the other choice as well (but it uses the standard’s interpretation if there is ambiguity). PostgreSQL also allows both clauses to specify arbitrary expressions. Note that names appearing in an expression will always be taken as input-column names, not as output-column names.

SQL:1999 及更高版本使用略有不同的定义,该定义与 SQL-92 并不完全向上兼容。但是在大多数情况下,PostgreSQL 会以与 SQL:1999 相同的方式解释 ORDER BYGROUP BY 表达式。

SQL:1999 and later use a slightly different definition which is not entirely upward compatible with SQL-92. In most cases, however, PostgreSQL will interpret an ORDER BY or GROUP BY expression the same way SQL:1999 does.

Functional Dependencies

当表的主键包含在 GROUP BY 列表中时,PostgreSQL 仅认可实用相关性(允许从 GROUP BY 中省略列)。SQL 标准指定应认可的其他条件。

PostgreSQL recognizes functional dependency (allowing columns to be omitted from GROUP BY) only when a table’s primary key is included in the GROUP BY list. The SQL standard specifies additional conditions that should be recognized.

LIMIT and OFFSET

子句 LIMITOFFSET 是 PostgreSQL 特有的语法,MySQL 也使用该语法。SQL:2008 标准已引入子句 OFFSET …​ FETCH {FIRST|NEXT} …​ 以实现相同的功能,如上例 LIMIT Clause 所示。该语法也由 IBM DB2 使用。(为 Oracle 编写的应用程序经常使用涉及自动生成的 rownum 列的解决方法,PostgreSQL 中不提供此列,以实现这些子句的效果。)

The clauses LIMIT and OFFSET are PostgreSQL-specific syntax, also used by MySQL. The SQL:2008 standard has introduced the clauses OFFSET …​ FETCH {FIRST|NEXT} …​ for the same functionality, as shown above in LIMIT Clause. This syntax is also used by IBM DB2. (Applications written for Oracle frequently use a workaround involving the automatically generated rownum column, which is not available in PostgreSQL, to implement the effects of these clauses.)

FOR NO KEY UPDATE, FOR UPDATE, FOR SHARE, FOR KEY SHARE

尽管 FOR UPDATE 出现于 SQL 标准中,但该标准仅允许将其用作 DECLARE CURSOR 的选项。不管在任何 SELECT 查询中,PostgreSQL 都允许它,以及在 FOR SHAREFOR KEY SHARE 变体中,以及 NOWAITSKIP LOCKED 选项中,它们未在该标准中显示。

Although FOR UPDATE appears in the SQL standard, the standard allows it only as an option of DECLARE CURSOR. PostgreSQL allows it in any SELECT query as well as in sub-SELECT_s, but this is an extension. The _FOR NO KEY UPDATE, FOR SHARE and FOR KEY SHARE variants, as well as the NOWAIT and SKIP LOCKED options, do not appear in the standard.

Data-Modifying Statements in WITH

PostgreSQL 允许 INSERTUPDATEDELETE 用作 WITH 查询。这不在 SQL 标准中。

PostgreSQL allows INSERT, UPDATE, and DELETE to be used as WITH queries. This is not found in the SQL standard.

Nonstandard Clauses

DISTINCT ON ( …​ ) 是 SQL 标准的扩展。

DISTINCT ON ( …​ ) is an extension of the SQL standard.

ROWS FROM( …​ ) 是 SQL 标准的扩展。

ROWS FROM( …​ ) is an extension of the SQL standard.

MATERIALIZEDNOT MATERIALIZED 选项是 WITH 的 SQL 标准扩展。

The MATERIALIZED and NOT MATERIALIZED options of WITH are extensions of the SQL standard.