Postgresql 中文操作指南
38.5. Query Language (SQL) Functions #
SQL 函数执行任意的 SQL 语句列表,返回列表中最后查询的结果。在简单(非集合)的情况下,将返回最后查询结果的第一行。(请记住,除非你使用 ORDER BY,否则多行结果的“第一行”并不能很好地定义。)如果最后一个查询恰好没有返回任何行,将返回 null 值。
SQL functions execute an arbitrary list of SQL statements, returning the result of the last query in the list. In the simple (non-set) case, the first row of the last query’s result will be returned. (Bear in mind that “the first row” of a multirow result is not well-defined unless you use ORDER BY.) If the last query happens to return no rows at all, the null value will be returned.
或者,可以通过将函数的返回类型指定为 SETOF _sometype_ 指定 SQL 函数以返回集合(即多行),也可以通过将其声明为 RETURNS TABLE(_columns )_ 指定 SQL 函数以返回集合。在这种情况下,将返回最后一次查询的所有行结果。以下提供详细信息。
Alternatively, an SQL function can be declared to return a set (that is, multiple rows) by specifying the function’s return type as SETOF _sometype_, or equivalently by declaring it as RETURNS TABLE(_columns)_. In this case all rows of the last query’s result are returned. Further details appear below.
SQL 函数的主体必须是一系列以分号分隔的 SQL 语句。最后一个语句后面的分号是可选的。除非声明函数返回 void,否则最后一个语句必须是 SELECT,或有 RETURNING 子句的 INSERT、UPDATE 或 DELETE。
The body of an SQL function must be a list of SQL statements separated by semicolons. A semicolon after the last statement is optional. Unless the function is declared to return void, the last statement must be a SELECT, or an INSERT, UPDATE, or DELETE that has a RETURNING clause.
SQL 语言中的任何命令集合都可以打包在一起并定义为函数。除了 SELECT 查询之外,命令还可以包括数据修改查询(INSERT、UPDATE、DELETE 和 MERGE),以及其他 SQL 命令。(你无法在 SQL 函数中使用事务控制命令,例如 COMMIT、SAVEPOINT 和一些实用程序命令,例如 VACUUM。)但是,最后一个命令必须是 SELECT,或者有一个 RETURNING 子句返回函数返回类型中指定的所有内容。或者,如果你想要定义一个执行动作但没有返回值的 SQL 函数,你可以定义它为返回 void。例如,此函数从 emp 表中删除具有负薪水的行:
Any collection of commands in the SQL language can be packaged together and defined as a function. Besides SELECT queries, the commands can include data modification queries (INSERT, UPDATE, DELETE, and MERGE), as well as other SQL commands. (You cannot use transaction control commands, e.g., COMMIT, SAVEPOINT, and some utility commands, e.g., VACUUM, in SQL functions.) However, the final command must be a SELECT or have a RETURNING clause that returns whatever is specified as the function’s return type. Alternatively, if you want to define an SQL function that performs actions but has no useful value to return, you can define it as returning void. For example, this function removes rows with negative salaries from the emp table:
CREATE FUNCTION clean_emp() RETURNS void AS '
DELETE FROM emp
WHERE salary < 0;
' LANGUAGE SQL;
SELECT clean_emp();
clean_emp
-----------
(1 row)
你还可以将其编写为过程,从而避免返回类型的问题。例如:
You can also write this as a procedure, thus avoiding the issue of the return type. For example:
CREATE PROCEDURE clean_emp() AS '
DELETE FROM emp
WHERE salary < 0;
' LANGUAGE SQL;
CALL clean_emp();
在像这种简单的情况下,返回 void 的函数与过程之间的区别主要是风格的。但是,过程提供额外的功能,例如函数中不可用的事务控制。此外,过程是 SQL 标准,而返回 void 是 PostgreSQL 扩展。
In simple cases like this, the difference between a function returning void and a procedure is mostly stylistic. However, procedures offer additional functionality such as transaction control that is not available in functions. Also, procedures are SQL standard whereas returning void is a PostgreSQL extension.
Note
SQL 函数的整个正文会在任何部分被执行之前进行解析。虽然 SQL 函数可以包含修改系统目录的命令(例如 CREATE TABLE),但在函数中对后续命令进行解析分析期间,不会显示此类命令的效果。因此,例如,如果将 CREATE TABLE foo (…); INSERT INTO foo VALUES(…); 打包到单个 SQL 函数中,它将不会按预期工作,因为解析 INSERT 命令时 foo 还不存在。建议在这种情况下使用 PL/pgSQL 而不是 SQL 函数。
The entire body of an SQL function is parsed before any of it is executed. While an SQL function can contain commands that alter the system catalogs (e.g., CREATE TABLE), the effects of such commands will not be visible during parse analysis of later commands in the function. Thus, for example, CREATE TABLE foo (…); INSERT INTO foo VALUES(…); will not work as desired if packaged up into a single SQL function, since foo won’t exist yet when the INSERT command is parsed. It’s recommended to use PL/pgSQL instead of an SQL function in this type of situation.
CREATE FUNCTION 命令的语法要求函数体写为字符串常量。通常,对字符串常量使用美元引用(参见 Section 4.1.2.4)最为方便。如果您选择使用常规的单引号字符串常量语法,则必须在函数体中对单引号 (') 和反斜杠 (\)(假设使用转义字符串语法)加倍(参见 Section 4.1.2.1)。
The syntax of the CREATE FUNCTION command requires the function body to be written as a string constant. It is usually most convenient to use dollar quoting (see Section 4.1.2.4) for the string constant. If you choose to use regular single-quoted string constant syntax, you must double single quote marks (') and backslashes (\) (assuming escape string syntax) in the body of the function (see Section 4.1.2.1).
38.5.1. Arguments for SQL Functions #
可以使用名称或数字在函数体中引用 SQL 函数的参数。以下是两种方法的示例。
Arguments of an SQL function can be referenced in the function body using either names or numbers. Examples of both methods appear below.
要使用名称,请将函数参数声明为具有名称,然后在函数正文中直接编写该名称。如果参数名称与函数中当前 SQL 命令中的任何列名称相同,则列名称将优先。要覆盖此设置,请使用函数本身的名称限定参数名称,即 function_name.argument_name 。(如果这样会与限定列名称发生冲突,则列名称仍会获胜。您可以通过在 SQL 命令中为表选择其他别名来避免歧义。)
To use a name, declare the function argument as having a name, and then just write that name in the function body. If the argument name is the same as any column name in the current SQL command within the function, the column name will take precedence. To override this, qualify the argument name with the name of the function itself, that is function_name_._argument_name. (If this would conflict with a qualified column name, again the column name wins. You can avoid the ambiguity by choosing a different alias for the table within the SQL command.)
在较早的数字方法中,可以使用 $_n_ 语法引用参数: $1 引用第一个输入参数, $2 引用第二个输入参数,依此类推。这将有效,无论特定参数是否有名称声明。
In the older numeric approach, arguments are referenced using the syntax $_n_: $1 refers to the first input argument, $2 to the second, and so on. This will work whether or not the particular argument was declared with a name.
如果参数为复合类型,则可以使用点表示法(例如, argname.fieldname 或 $1._fieldname_ )来访问参数的属性。同样,您可能需要使用函数名称限定参数名称,以明确使用参数名称的形式。
If an argument is of a composite type, then the dot notation, e.g., argname_.fieldname or _$1._fieldname_, can be used to access attributes of the argument. Again, you might need to qualify the argument’s name with the function name to make the form with an argument name unambiguous.
SQL 函数参数只能用作数据值,不能用作标识符。因此,例如,这是合理的:
SQL function arguments can only be used as data values, not as identifiers. Thus for example this is reasonable:
INSERT INTO mytable VALUES ($1);
但是,这是不起作用的:
but this will not work:
INSERT INTO $1 VALUES (42);
Note
在 PostgreSQL 9.2 中添加了使用名称引用 SQL 函数参数的功能。要在较低版本服务器中使用函数,必须使用 $_n_ 表示法。
The ability to use names to reference SQL function arguments was added in PostgreSQL 9.2. Functions to be used in older servers must use the $_n_ notation.
38.5.2. SQL Functions on Base Types #
最简单的可能 SQL 函数没有任何参数,只是返回基本类型,如 integer:
The simplest possible SQL function has no arguments and simply returns a base type, such as integer:
CREATE FUNCTION one() RETURNS integer AS $$
SELECT 1 AS result;
$$ LANGUAGE SQL;
-- Alternative syntax for string literal:
CREATE FUNCTION one() RETURNS integer AS '
SELECT 1 AS result;
' LANGUAGE SQL;
SELECT one();
one
-----
1
请注意,我们在函数体内为函数的结果定义了一个列别名(名称为 result),但是此列别名在函数外部不可见。因此,结果被标记为 one 而不是 result。
Notice that we defined a column alias within the function body for the result of the function (with the name result), but this column alias is not visible outside the function. Hence, the result is labeled one instead of result.
定义将基本类型作为参数的 SQL 函数几乎一样简单:
It is almost as easy to define SQL functions that take base types as arguments:
CREATE FUNCTION add_em(x integer, y integer) RETURNS integer AS $$
SELECT x + y;
$$ LANGUAGE SQL;
SELECT add_em(1, 2) AS answer;
answer
--------
3
或者,我们可以省去参数的名称并使用数字:
Alternatively, we could dispense with names for the arguments and use numbers:
CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
SELECT $1 + $2;
$$ LANGUAGE SQL;
SELECT add_em(1, 2) AS answer;
answer
--------
3
下面是一个更有用的函数,可用于借记银行账户:
Here is a more useful function, which might be used to debit a bank account:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
UPDATE bank
SET balance = balance - debit
WHERE accountno = tf1.accountno;
SELECT 1;
$$ LANGUAGE SQL;
用户可以通过以下方式执行此函数,将 17 号账户借记 100.00 美元:
A user could execute this function to debit account 17 by $100.00 as follows:
SELECT tf1(17, 100.0);
在此示例中,我们为第一个参数选择了 accountno 名称,但此名称与 bank 表中的列名称相同。在 UPDATE 命令中,accountno 表示列 bank.accountno,因此必须使用 tf1.accountno 来表示该参数。当然,我们可以通过为参数使用不同的名称来避免这一点。
In this example, we chose the name accountno for the first argument, but this is the same as the name of a column in the bank table. Within the UPDATE command, accountno refers to the column bank.accountno, so tf1.accountno must be used to refer to the argument. We could of course avoid this by using a different name for the argument.
实际上,人们可能希望从函数中获得比常量 1 更有用的结果,因此更可能的定义是:
In practice one would probably like a more useful result from the function than a constant 1, so a more likely definition is:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
UPDATE bank
SET balance = balance - debit
WHERE accountno = tf1.accountno;
SELECT balance FROM bank WHERE accountno = tf1.accountno;
$$ LANGUAGE SQL;
它调整余额并返回新余额。可以使用 RETURNING 在一个命令中完成相同的事情:
which adjusts the balance and returns the new balance. The same thing could be done in one command using RETURNING:
CREATE FUNCTION tf1 (accountno integer, debit numeric) RETURNS numeric AS $$
UPDATE bank
SET balance = balance - debit
WHERE accountno = tf1.accountno
RETURNING balance;
$$ LANGUAGE SQL;
如果 SQL 函数中的最终 SELECT 或 RETURNING 子句没有确切返回函数声明的结果类型,PostgreSQL 将自动将该值强制转换为所需类型(如果可以通过隐式或赋值强制转换实现这一点)。否则,你必须编写显式 cast。例如,假设我们想要前一个 add_em 函数返回类型 float8。编写
If the final SELECT or RETURNING clause in an SQL function does not return exactly the function’s declared result type, PostgreSQL will automatically cast the value to the required type, if that is possible with an implicit or assignment cast. Otherwise, you must write an explicit cast. For example, suppose we wanted the previous add_em function to return type float8 instead. It’s sufficient to write
CREATE FUNCTION add_em(integer, integer) RETURNS float8 AS $$
SELECT $1 + $2;
$$ LANGUAGE SQL;
因为 integer sum 可以隐式转换为 float8 。(有关转换的更多信息,请参见 Chapter 10 或 CREATE CAST 。)
since the integer sum can be implicitly cast to float8. (See Chapter 10 or CREATE CAST for more about casts.)
38.5.3. SQL Functions on Composite Types #
在使用复合类型作为参数编写函数时,我们不仅必须指定所需的参数,还必须指定该参数的所需属性(字段)。例如,假设 emp 是包含员工数据的表,因此也是表中每行的复合类型的名称。以下函数 double_salary 计算某人的工资如果加倍后是多少:
When writing functions with arguments of composite types, we must not only specify which argument we want but also the desired attribute (field) of that argument. For example, suppose that emp is a table containing employee data, and therefore also the name of the composite type of each row of the table. Here is a function double_salary that computes what someone’s salary would be if it were doubled:
CREATE TABLE emp (
name text,
salary numeric,
age integer,
cubicle point
);
INSERT INTO emp VALUES ('Bill', 4200, 45, '(2,1)');
CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
SELECT $1.salary * 2 AS salary;
$$ LANGUAGE SQL;
SELECT name, double_salary(emp.*) AS dream
FROM emp
WHERE emp.cubicle ~= point '(2,1)';
name | dream
------+-------
Bill | 8400
请注意使用 $1.salary 语法选择参数行值的一个字段。另请注意调用 SELECT 命令如何使用 table_name_.*_ 将表的一行作为复合值进行选择。表行同样可以使用表名进行引用,如下所示:
Notice the use of the syntax $1.salary to select one field of the argument row value. Also notice how the calling SELECT command uses table_name_.*_ to select the entire current row of a table as a composite value. The table row can alternatively be referenced using just the table name, like this:
SELECT name, double_salary(emp) AS dream
FROM emp
WHERE emp.cubicle ~= point '(2,1)';
但是这种用法已弃用,因为它容易混淆。(有关表格行的复合值的这两种表示法的详细信息,请参见 Section 8.16.5。)
but this usage is deprecated since it’s easy to get confused. (See Section 8.16.5 for details about these two notations for the composite value of a table row.)
有时,临时构造复合参数值非常方便。可以使用 ROW 构造执行此操作。例如,我们可以调整传递给函数的数据:
Sometimes it is handy to construct a composite argument value on-the-fly. This can be done with the ROW construct. For example, we could adjust the data being passed to the function:
SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
FROM emp;
还可以构建返回复合类型的函数。以下是一个返回单个 emp 行的函数示例:
It is also possible to build a function that returns a composite type. This is an example of a function that returns a single emp row:
CREATE FUNCTION new_emp() RETURNS emp AS $$
SELECT text 'None' AS name,
1000.0 AS salary,
25 AS age,
point '(2,2)' AS cubicle;
$$ LANGUAGE SQL;
在此示例中,我们使用常量值指定了每个属性,但任何计算都可以替代这些常量。
In this example we have specified each of the attributes with a constant value, but any computation could have been substituted for these constants.
定义函数时需要注意两个重要事项:
Note two important things about defining the function:
定义同一函数的另一种方法是:
A different way to define the same function is:
CREATE FUNCTION new_emp() RETURNS emp AS $$
SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
$$ LANGUAGE SQL;
在这里,我们编写的 SELECT 只返回正确复合类型的一列。这在这种情况中并不会更好,但在某些情况下,它是一个方便的替代方法,例如,如果我们需要通过调用返回所需的复合值的另一个函数来计算结果。另一个示例是,如果我们尝试编写一个将域返回为复合而非纯复合类型的函数,则必须始终编写它以返回单列,因为无法强制转换整个行的结果。
Here we wrote a SELECT that returns just a single column of the correct composite type. This isn’t really better in this situation, but it is a handy alternative in some cases — for example, if we need to compute the result by calling another function that returns the desired composite value. Another example is that if we are trying to write a function that returns a domain over composite, rather than a plain composite type, it is always necessary to write it as returning a single column, since there is no way to cause a coercion of the whole row result.
我们可以使用它在值表达式中直接调用此函数:
We could call this function directly either by using it in a value expression:
SELECT new_emp();
new_emp
--------------------------
(None,1000.0,25,"(2,2)")
或以表函数的形式调用它:
or by calling it as a table function:
SELECT * FROM new_emp();
name | salary | age | cubicle
------+--------+-----+---------
None | 1000.0 | 25 | (2,2)
第二种方式在 Section 38.5.8 中有更充分的描述。
The second way is described more fully in Section 38.5.8.
在使用返回复合类型的函数时,你可能只想从其结果中获取一个字段(属性)。你可以使用这样的语法来执行此操作:
When you use a function that returns a composite type, you might want only one field (attribute) from its result. You can do that with syntax like this:
SELECT (new_emp()).name;
name
------
None
需要额外的括号来防止解析器混淆。如果你尝试不使用括号,则会得到类似这样的内容:
The extra parentheses are needed to keep the parser from getting confused. If you try to do it without them, you get something like this:
SELECT new_emp().name;
ERROR: syntax error at or near "."
LINE 1: SELECT new_emp().name;
^
另一种选择是使用函数符号进行提取属性:
Another option is to use functional notation for extracting an attribute:
SELECT name(new_emp());
name
------
None
正如 Section 8.16.5 中所解释的,字段表示法和函数表示法是等效的。
As explained in Section 8.16.5, the field notation and functional notation are equivalent.
使用返回复合类型的函数的另一种方法是将结果传递给另一个接受正确的行类型作为输入的函数:
Another way to use a function returning a composite type is to pass the result to another function that accepts the correct row type as input:
CREATE FUNCTION getname(emp) RETURNS text AS $$
SELECT $1.name;
$$ LANGUAGE SQL;
SELECT getname(new_emp());
getname
---------
None
(1 row)
38.5.4. SQL Functions with Output Parameters #
描述函数结果的另一种方法是使用 output parameters 对其进行定义,如下例所示:
An alternative way of describing a function’s results is to define it with output parameters, as in this example:
CREATE FUNCTION add_em (IN x int, IN y int, OUT sum int)
AS 'SELECT x + y'
LANGUAGE SQL;
SELECT add_em(3,7);
add_em
--------
10
(1 row)
这与 Section 38.5.2 中所示的 add_em 版本没有本质上的不同。输出参数的真正价值在于它们提供了一种方便的方式来定义返回多列的函数。例如:
This is not essentially different from the version of add_em shown in Section 38.5.2. The real value of output parameters is that they provide a convenient way of defining functions that return several columns. For example,
CREATE FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int)
AS 'SELECT x + y, x * y'
LANGUAGE SQL;
SELECT * FROM sum_n_product(11,42);
sum | product
-----+---------
53 | 462
(1 row)
这里实际发生的事情是,我们为函数的结果创建了一个匿名复合类型。上述示例的最终结果与
What has essentially happened here is that we have created an anonymous composite type for the result of the function. The above example has the same end result as
CREATE TYPE sum_prod AS (sum int, product int);
CREATE FUNCTION sum_n_product (int, int) RETURNS sum_prod
AS 'SELECT $1 + $2, $1 * $2'
LANGUAGE SQL;
但通常很方便,就不用为单独的复合类型定义而烦恼。注意,附加到输出参数上的名称不仅仅是装饰,而是确定匿名复合类型的列名称。(如果你为输出参数省略名称,系统将自行选择一个名称。)
but not having to bother with the separate composite type definition is often handy. Notice that the names attached to the output parameters are not just decoration, but determine the column names of the anonymous composite type. (If you omit a name for an output parameter, the system will choose a name on its own.)
请注意,从 SQL 调用此类函数时,输出参数不包含在调用参数列表中。这是因为 PostgreSQL 仅考虑输入参数来定义函数的调用签名。这意味着只有在引用函数(例如删除函数)时,输入参数才重要。我们可以使用以下任一方法删除上述函数:
Notice that output parameters are not included in the calling argument list when invoking such a function from SQL. This is because PostgreSQL considers only the input parameters to define the function’s calling signature. That means also that only the input parameters matter when referencing the function for purposes such as dropping it. We could drop the above function with either of
DROP FUNCTION sum_n_product (x int, y int, OUT sum int, OUT product int);
DROP FUNCTION sum_n_product (int, int);
参数可以标记为 IN(默认)、OUT、INOUT 或 VARIADIC。INOUT 参数既可以作为输入参数(调用参数列表的一部分),也可以作为输出参数(结果记录类型的一部分)。VARIADIC 参数是输入参数,但会按照以下描述特别处理。
Parameters can be marked as IN (the default), OUT, INOUT, or VARIADIC. An INOUT parameter serves as both an input parameter (part of the calling argument list) and an output parameter (part of the result record type). VARIADIC parameters are input parameters, but are treated specially as described below.
38.5.5. SQL Procedures with Output Parameters #
输出参数也在过程中得到支持,但它们的工作方式与函数稍有不同。在 CALL 命令中,输出参数必须包含在参数列表中。例如,可以这样编写前面提到的银行帐户借记例程:
Output parameters are also supported in procedures, but they work a bit differently from functions. In CALL commands, output parameters must be included in the argument list. For example, the bank account debiting routine from earlier could be written like this:
CREATE PROCEDURE tp1 (accountno integer, debit numeric, OUT new_balance numeric) AS $$
UPDATE bank
SET balance = balance - debit
WHERE accountno = tp1.accountno
RETURNING balance;
$$ LANGUAGE SQL;
要调用此过程,必须包括与 OUT 参数匹配的参数。通常会编写 NULL:
To call this procedure, an argument matching the OUT parameter must be included. It’s customary to write NULL:
CALL tp1(17, 100.0, NULL);
如果你编写其他内容,则它必须是一个表达式,该表达式可以隐式强制转换为参数声明的类型,就像输入参数一样。但请注意,该表达式将不会被求值。
If you write something else, it must be an expression that is implicitly coercible to the declared type of the parameter, just as for input parameters. Note however that such an expression will not be evaluated.
在从 PL/pgSQL 调用过程时,您必须编写一个将接收过程输出的变量,而不是编写 NULL。有关详细信息,请参见 Section 43.6.3。
When calling a procedure from PL/pgSQL, instead of writing NULL you must write a variable that will receive the procedure’s output. See Section 43.6.3 for details.
38.5.6. SQL Functions with Variable Numbers of Arguments #
SQL 函数可以声明为接受可变数量的参数,只要所有“可选”参数都是相同的数据类型即可。可选参数将作为数组传递给函数。通过将最后一个参数标记为 VARIADIC 来声明函数;此参数必须声明为数组类型。例如:
SQL functions can be declared to accept variable numbers of arguments, so long as all the “optional” arguments are of the same data type. The optional arguments will be passed to the function as an array. The function is declared by marking the last parameter as VARIADIC; this parameter must be declared as being of an array type. For example:
CREATE FUNCTION mleast(VARIADIC arr numeric[]) RETURNS numeric AS $$
SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;
SELECT mleast(10, -1, 5, 4.4);
mleast
--------
-1
(1 row)
实际上,VARIADIC 位置或此位置之后的实际参数全部收集到一维数组中,就像你编写以下内容一样:
Effectively, all the actual arguments at or beyond the VARIADIC position are gathered up into a one-dimensional array, as if you had written
SELECT mleast(ARRAY[10, -1, 5, 4.4]); -- doesn't work
不过,你实际上无法编写该内容,或者至少它不会与此函数定义相匹配。标记为 VARIADIC 的参数匹配其元素类型的出现一次或多次,而不是自身的类型。
You can’t actually write that, though — or at least, it will not match this function definition. A parameter marked VARIADIC matches one or more occurrences of its element type, not of its own type.
有时,能够将已构建的数组传递给可变参数函数非常有用;当一个可变参数函数想将其数组参数传递给另一个可变参数函数时,这尤其方便。而且,这是调用受信任用户可以在其中创建对象的架构中发现的可变参数函数的唯一安全方式;请参阅 Section 10.3。您可以在调用中指定 VARIADIC 来执行此操作:
Sometimes it is useful to be able to pass an already-constructed array to a variadic function; this is particularly handy when one variadic function wants to pass on its array parameter to another one. Also, this is the only secure way to call a variadic function found in a schema that permits untrusted users to create objects; see Section 10.3. You can do this by specifying VARIADIC in the call:
SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
这可防止将函数的变参参数扩展到其元素类型中,从而允许数组参数值正常匹配。VARIADIC 只能附加到函数调用的最后一个实际参数。
This prevents expansion of the function’s variadic parameter into its element type, thereby allowing the array argument value to match normally. VARIADIC can only be attached to the last actual argument of a function call.
在调用中指定 VARIADIC 也是将空数组传递给变参函数的唯一方法,例如:
Specifying VARIADIC in the call is also the only way to pass an empty array to a variadic function, for example:
SELECT mleast(VARIADIC ARRAY[]::numeric[]);
仅仅编写 SELECT mleast() 是不行的,因为变参参数必须至少匹配一个实际参数。(如果你想要允许此类调用,可以定义另一个函数,该函数也称为 mleast,并且没有参数。)
Simply writing SELECT mleast() does not work because a variadic parameter must match at least one actual argument. (You could define a second function also named mleast, with no parameters, if you wanted to allow such calls.)
从可变参数产生的数组元素参数被视为没有自己的名称。这意味着不可能使用命名参数 ( Section 4.3) 调用可变参数函数,除非您指定 VARIADIC。例如,以下代码将有效:
The array element parameters generated from a variadic parameter are treated as not having any names of their own. This means it is not possible to call a variadic function using named arguments (Section 4.3), except when you specify VARIADIC. For example, this will work:
SELECT mleast(VARIADIC arr => ARRAY[10, -1, 5, 4.4]);
但这些不起作用:
but not these:
SELECT mleast(arr => 10);
SELECT mleast(arr => ARRAY[10, -1, 5, 4.4]);
38.5.7. SQL Functions with Default Values for Arguments #
可以使用某些或所有输入参数的默认值来声明函数。当函数调用时实际参数不足时,将插入默认值。因为参数只能从实际参数列表的末尾忽略,所以带有默认值的参数之后的所有参数也必须具有默认值。(虽然命名参数表示法本来可以用作放松此限制,但为了让位置参数表示法合理地工作,仍然执行了此限制。)无论您是否使用此功能,在用户之间互不信任的数据库中调用函数时,此功能都会造成预防措施的需要;请参阅 Section 10.3。
Functions can be declared with default values for some or all input arguments. The default values are inserted whenever the function is called with insufficiently many actual arguments. Since arguments can only be omitted from the end of the actual argument list, all parameters after a parameter with a default value have to have default values as well. (Although the use of named argument notation could allow this restriction to be relaxed, it’s still enforced so that positional argument notation works sensibly.) Whether or not you use it, this capability creates a need for precautions when calling functions in databases where some users mistrust other users; see Section 10.3.
例如:
For example:
CREATE FUNCTION foo(a int, b int DEFAULT 2, c int DEFAULT 3)
RETURNS int
LANGUAGE SQL
AS $$
SELECT $1 + $2 + $3;
$$;
SELECT foo(10, 20, 30);
foo
-----
60
(1 row)
SELECT foo(10, 20);
foo
-----
33
(1 row)
SELECT foo(10);
foo
-----
15
(1 row)
SELECT foo(); -- fails since there is no default for the first argument
ERROR: function foo() does not exist
也可以使用 = 符号代替关键字 DEFAULT。
The = sign can also be used in place of the key word DEFAULT.
38.5.8. SQL Functions as Table Sources #
所有 SQL 函数都可以在查询的 FROM 子句中使用,但对于返回复合类型的函数,它特别有用。如果函数被定义为返回基础类型,那么表函数将生成单列表。如果函数被定义为返回复合类型,那么表函数将为复合类型的每个属性生成一列。
All SQL functions can be used in the FROM clause of a query, but it is particularly useful for functions returning composite types. If the function is defined to return a base type, the table function produces a one-column table. If the function is defined to return a composite type, the table function produces a column for each attribute of the composite type.
这是一个示例:
Here is an example:
CREATE TABLE foo (fooid int, foosubid int, fooname text);
INSERT INTO foo VALUES (1, 1, 'Joe');
INSERT INTO foo VALUES (1, 2, 'Ed');
INSERT INTO foo VALUES (2, 1, 'Mary');
CREATE FUNCTION getfoo(int) RETURNS foo AS $$
SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT *, upper(fooname) FROM getfoo(1) AS t1;
fooid | foosubid | fooname | upper
-------+----------+---------+-------
1 | 1 | Joe | JOE
(1 row)
正如示例所示,我们可以像处理普通表列一样处理函数结果的列。
As the example shows, we can work with the columns of the function’s result just the same as if they were columns of a regular table.
请注意,我们只从函数中获取了一行。这是因为我们没有使用 SETOF。这将在下一节中进行描述。
Note that we only got one row out of the function. This is because we did not use SETOF. That is described in the next section.
38.5.9. SQL Functions Returning Sets #
当 SQL 函数声明为返回 SETOF _sometype_ 时,将执行函数的最终查询,并将其输出的每行作为结果集的元素进行返回。
When an SQL function is declared as returning SETOF _sometype_, the function’s final query is executed to completion, and each row it outputs is returned as an element of the result set.
通常在 FROM 子句中调用函数时使用此功能。在这种情况下,函数返回的每一行都将成为查询看到的表的一行。例如,假设表 foo 的内容与以上相同,我们说:
This feature is normally used when calling the function in the FROM clause. In this case each row returned by the function becomes a row of the table seen by the query. For example, assume that table foo has the same contents as above, and we say:
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
SELECT * FROM foo WHERE fooid = $1;
$$ LANGUAGE SQL;
SELECT * FROM getfoo(1) AS t1;
那么我们会得到:
Then we would get:
fooid | foosubid | fooname
-------+----------+---------
1 | 1 | Joe
1 | 2 | Ed
(2 rows)
也可以返回多行,其中列由输出参数定义,如下所示:
It is also possible to return multiple rows with the columns defined by output parameters, like this:
CREATE TABLE tab (y int, z int);
INSERT INTO tab VALUES (1, 2), (3, 4), (5, 6), (7, 8);
CREATE FUNCTION sum_n_product_with_tab (x int, OUT sum int, OUT product int)
RETURNS SETOF record
AS $$
SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
SELECT * FROM sum_n_product_with_tab(10);
sum | product
-----+---------
11 | 10
13 | 30
15 | 50
17 | 70
(4 rows)
此处的关键是要编写 RETURNS SETOF record,以表示函数返回多行,而不是一行。如果只有一个输出参数,请编写该参数的类型,而不是 record。
The key point here is that you must write RETURNS SETOF record to indicate that the function returns multiple rows instead of just one. If there is only one output parameter, write that parameter’s type instead of record.
通过多次调用一个集合返回函数并使用该表或子查询的连续行的参数作为每次调用的参数,来构建查询结果通常很有用。执行此操作的首选方法是使用 LATERAL 关键字,如 Section 7.2.1.5 中所述。这里有一个使用返回集合的函数来枚举树结构元素的示例:
It is frequently useful to construct a query’s result by invoking a set-returning function multiple times, with the parameters for each invocation coming from successive rows of a table or subquery. The preferred way to do this is to use the LATERAL key word, which is described in Section 7.2.1.5. Here is an example using a set-returning function to enumerate elements of a tree structure:
SELECT * FROM nodes;
name | parent
-----------+--------
Top |
Child1 | Top
Child2 | Top
Child3 | Top
SubChild1 | Child1
SubChild2 | Child1
(6 rows)
CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
SELECT name FROM nodes WHERE parent = $1
$$ LANGUAGE SQL STABLE;
SELECT * FROM listchildren('Top');
listchildren
--------------
Child1
Child2
Child3
(3 rows)
SELECT name, child FROM nodes, LATERAL listchildren(name) AS child;
name | child
--------+-----------
Top | Child1
Top | Child2
Top | Child3
Child1 | SubChild1
Child1 | SubChild2
(5 rows)
此示例并未执行我们使用简单连接无法执行的任何操作,但在更复杂的计算中,将部分工作放入函数可能会非常方便。
This example does not do anything that we couldn’t have done with a simple join, but in more complex calculations the option to put some of the work into a function can be quite convenient.
返回集合的函数也可以在查询的选择列表中调用。对于查询自行生成的每一行,都会调用集合返回函数,并为函数结果集的每个元素生成一个输出行。之前的示例也可以使用以下查询完成:
Functions returning sets can also be called in the select list of a query. For each row that the query generates by itself, the set-returning function is invoked, and an output row is generated for each element of the function’s result set. The previous example could also be done with queries like these:
SELECT listchildren('Top');
listchildren
--------------
Child1
Child2
Child3
(3 rows)
SELECT name, listchildren(name) FROM nodes;
name | listchildren
--------+--------------
Top | Child1
Top | Child2
Top | Child3
Child1 | SubChild1
Child1 | SubChild2
(5 rows)
在最后一个 SELECT 中,请注意 Child2、Child3 等没有输出行出现。这是因为 listchildren 为这些参数返回了一个空集,因此未生成结果行。这与使用 LATERAL 语法从内部连接到函数结果时获得的行为相同。
In the last SELECT, notice that no output row appears for Child2, Child3, etc. This happens because listchildren returns an empty set for those arguments, so no result rows are generated. This is the same behavior as we got from an inner join to the function result when using the LATERAL syntax.
PostgreSQL 对查询选择列表中的返回集合函数的行为几乎与将返回集合函数写在 LATERAL FROM 条款项中完全相同。例如,
PostgreSQL’s behavior for a set-returning function in a query’s select list is almost exactly the same as if the set-returning function had been written in a LATERAL FROM-clause item instead. For example,
SELECT x, generate_series(1,5) AS g FROM tab;
几乎等同于
is almost equivalent to
SELECT x, g FROM tab, LATERAL generate_series(1,5) AS g;
它将完全相同,除了在这个特定示例中,规划程序可以选择将 g 放在嵌套循环连接的外侧,因为 g 对 tab 没有实际横向依赖性。那将导致不同的输出行顺序。选择列表中的返回集合函数始终被当作在嵌套循环连接的内部以及 FROM 条款的其余部分进行评估,以便在考虑 FROM 条款的下一行之前,运行函数直至完成。
It would be exactly the same, except that in this specific example, the planner could choose to put g on the outside of the nested-loop join, since g has no actual lateral dependency on tab. That would result in a different output row order. Set-returning functions in the select list are always evaluated as though they are on the inside of a nested-loop join with the rest of the FROM clause, so that the function(s) are run to completion before the next row from the FROM clause is considered.
如果查询的选择列表中具有多个集合返回函数,则其行为类似于将这些函数放入单个 LATERAL ROWS FROM( … ) FROM 子句项中的行为。对于基础查询中的每一行,都会使用每个函数的第一个结果输出一行,然后使用第二个结果输出一行,依此类推。如果某些集合返回函数产生的输出少于其他函数的输出,那么空值将替换为缺失的数据,使得为一行基础行发出的总行数与产生最多输出的集合返回函数的输出相同。因此,这些集合返回函数会“同步”运行,直至其全部用完,然后将使用下一行基础行继续执行。
If there is more than one set-returning function in the query’s select list, the behavior is similar to what you get from putting the functions into a single LATERAL ROWS FROM( … ) FROM-clause item. For each row from the underlying query, there is an output row using the first result from each function, then an output row using the second result, and so on. If some of the set-returning functions produce fewer outputs than others, null values are substituted for the missing data, so that the total number of rows emitted for one underlying row is the same as for the set-returning function that produced the most outputs. Thus the set-returning functions run “in lockstep” until they are all exhausted, and then execution continues with the next underlying row.
集合返回函数可以嵌套在选择列表中,虽然在 FROM 子句项中不允许这样做。在这种情况下,每一级嵌套将被单独处理,如同其为单独的 LATERAL ROWS FROM( … ) 项一样。例如,在
Set-returning functions can be nested in a select list, although that is not allowed in FROM-clause items. In such cases, each level of nesting is treated separately, as though it were a separate LATERAL ROWS FROM( … ) item. For example, in
SELECT srf1(srf2(x), srf3(y)), srf4(srf5(z)) FROM tab;
中,集合返回函数 srf2、srf3 和 srf5 将对 tab 的每一行同步运行,然后 srf1 和 srf4 将对较低级函数产生的每一行同步应用。
the set-returning functions srf2, srf3, and srf5 would be run in lockstep for each row of tab, and then srf1 and srf4 would be applied in lockstep to each row produced by the lower functions.
集合返回函数不能在条件评估结构中使用,例如 CASE 或 COALESCE。例如,考虑
Set-returning functions cannot be used within conditional-evaluation constructs, such as CASE or COALESCE. For example, consider
SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
它可能看起来应该是这样生成的:带有 x > 0 的输入行重复五次,不带有 x > 0 的行重复一次;但实际上,这是因为 generate_series(1, 5) 将在 CASE 表达式评估之前在隐式 LATERAL FROM 项中运行,它将生成每一行输入行的五个重复内容。为了减少混淆,这些情况将生成一个解析时间错误。
It might seem that this should produce five repetitions of input rows that have x > 0, and a single repetition of those that do not; but actually, because generate_series(1, 5) would be run in an implicit LATERAL FROM item before the CASE expression is ever evaluated, it would produce five repetitions of every input row. To reduce confusion, such cases produce a parse-time error instead.
Note
如果函数的最后一个命令是 INSERT、UPDATE 或 DELETE 且有 RETURNING,那么即使函数未声明为 SETOF 或调用查询未获取所有结果行,也会将该命令始终执行到完成。RETURNING 子句产生的任何额外行都会被静默删除,但命令的表修改仍然会发生(而且在从函数返回之前会全部完成)。
If a function’s last command is INSERT, UPDATE, or DELETE with RETURNING, that command will always be executed to completion, even if the function is not declared with SETOF or the calling query does not fetch all the result rows. Any extra rows produced by the RETURNING clause are silently dropped, but the commanded table modifications still happen (and are all completed before returning from the function).
Note
在 PostgreSQL 10 之前,除非始终产生相等行数,否则在同一选择列表中放置多个集返回函数的行为不太合理。否则,获得的输出行数将等于集返回函数产生的行数的最小公倍数。此外,嵌套集返回函数不能按上述方式工作;相反,集返回函数最多只能有一个集返回参数,并且每个集返回函数的巢独立运行。此外,以前允许条件执行(CASE 等中的集返回函数),这进一步让事情变得复杂。在编写需要在较旧的 PostgreSQL 版本中工作的查询时,建议使用 LATERAL 语法,因为这将在不同版本间产生一致的结果。如果你依赖于集返回函数的条件执行来实现查询,则可以通过将条件测试移动到自定义的集返回函数中来修复它。例如,
Before PostgreSQL 10, putting more than one set-returning function in the same select list did not behave very sensibly unless they always produced equal numbers of rows. Otherwise, what you got was a number of output rows equal to the least common multiple of the numbers of rows produced by the set-returning functions. Also, nested set-returning functions did not work as described above; instead, a set-returning function could have at most one set-returning argument, and each nest of set-returning functions was run independently. Also, conditional execution (set-returning functions inside CASE etc.) was previously allowed, complicating things even more. Use of the LATERAL syntax is recommended when writing queries that need to work in older PostgreSQL versions, because that will give consistent results across different versions. If you have a query that is relying on conditional execution of a set-returning function, you may be able to fix it by moving the conditional test into a custom set-returning function. For example,
SELECT x, CASE WHEN y > 0 THEN generate_series(1, z) ELSE 5 END FROM tab;
可能成为
could become
CREATE FUNCTION case_generate_series(cond bool, start int, fin int, els int)
RETURNS SETOF int AS $$
BEGIN
IF cond THEN
RETURN QUERY SELECT generate_series(start, fin);
ELSE
RETURN QUERY SELECT els;
END IF;
END$$ LANGUAGE plpgsql;
SELECT x, case_generate_series(y > 0, 1, z, 5) FROM tab;
此公式在所有版本的 PostgreSQL 中都可以使用。
This formulation will work the same in all versions of PostgreSQL.
38.5.10. SQL Functions Returning TABLE #
还有另一种声明函数返回集合的方式,即使用语法 RETURNS TABLE(_columns). This is equivalent to using one or more _OUT 参数以及将函数标记为返回 SETOF record (或 SETOF 根据需要,单个输出参数的类型)。这种表示法在最近版本的 SQL 标准中得到了规定,因此比使用 SETOF 具有更好的可移植性。
There is another way to declare a function as returning a set, which is to use the syntax RETURNS TABLE(_columns). This is equivalent to using one or more _OUT parameters plus marking the function as returning SETOF record (or SETOF a single output parameter’s type, as appropriate). This notation is specified in recent versions of the SQL standard, and thus may be more portable than using SETOF.
例如,前面的求和与乘积示例也可以这样完成:
For example, the preceding sum-and-product example could also be done this way:
CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;
不允许使用显式的 OUT 或 INOUT 参数与 RETURNS TABLE 表示法,您必须将所有输出列放在 TABLE 列表中。
It is not allowed to use explicit OUT or INOUT parameters with the RETURNS TABLE notation — you must put all the output columns in the TABLE list.
38.5.11. Polymorphic SQL Functions #
可以声明 SQL 函数以接受和返回 Section 38.2.5 中描述的多态类型。以下为一种多态函数 make_array,它使用两个任意数据类型元素构建数组:
SQL functions can be declared to accept and return the polymorphic types described in Section 38.2.5. Here is a polymorphic function make_array that builds up an array from two arbitrary data type elements:
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;
SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
intarray | textarray
----------+-----------
{1,2} | {a,b}
(1 row)
请注意使用类型强制转换 'a'::text 来指定参数的类型为 text。如果参数只是一个字符串文本,则需要进行此步骤,否则它将被视为类型 unknown,而 unknown 的数组不是有效类型。如果不进行类型强制转换,您将收到类似这样的错误:
Notice the use of the typecast 'a'::text to specify that the argument is of type text. This is required if the argument is just a string literal, since otherwise it would be treated as type unknown, and array of unknown is not a valid type. Without the typecast, you will get errors like this:
ERROR: could not determine polymorphic type because input has type unknown
通过上面声明 make_array,您必须提供两个完全相同数据类型参数;系统不会尝试解决任何类型差异。因此,例如,以下内容不起作用:
With make_array declared as above, you must provide two arguments that are of exactly the same data type; the system will not attempt to resolve any type differences. Thus for example this does not work:
SELECT make_array(1, 2.5) AS numericarray;
ERROR: function make_array(integer, numeric) does not exist
另一种方法是使用“通用”多态类型族,它允许系统尝试识别合适的通用类型:
An alternative approach is to use the “common” family of polymorphic types, which allows the system to try to identify a suitable common type:
CREATE FUNCTION make_array2(anycompatible, anycompatible)
RETURNS anycompatiblearray AS $$
SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;
SELECT make_array2(1, 2.5) AS numericarray;
numericarray
--------------
{1,2.5}
(1 row)
由于通用类型解析的规则默认为在所有输入类型未知时选择类型 text,所以此方法也行之有效:
Because the rules for common type resolution default to choosing type text when all inputs are of unknown types, this also works:
SELECT make_array2('a', 'b') AS textarray;
textarray
-----------
{a,b}
(1 row)
允许具有固定返回类型的多态参数,但反之则不然。例如:
It is permitted to have polymorphic arguments with a fixed return type, but the converse is not. For example:
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
SELECT $1 > $2;
$$ LANGUAGE SQL;
SELECT is_greater(1, 2);
is_greater
------------
f
(1 row)
CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
SELECT 1;
$$ LANGUAGE SQL;
ERROR: cannot determine result data type
DETAIL: A result of type anyelement requires at least one input of type anyelement, anyarray, anynonarray, anyenum, or anyrange.
多态性可以使用具有输出参数的函数。例如:
Polymorphism can be used with functions that have output arguments. For example:
CREATE FUNCTION dup (f1 anyelement, OUT f2 anyelement, OUT f3 anyarray)
AS 'select $1, array[$1,$1]' LANGUAGE SQL;
SELECT * FROM dup(22);
f2 | f3
----+---------
22 | {22,22}
(1 row)
多态性还可以使用可变参数函数。例如:
Polymorphism can also be used with variadic functions. For example:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;
SELECT anyleast(10, -1, 5, 4);
anyleast
----------
-1
(1 row)
SELECT anyleast('abc'::text, 'def');
anyleast
----------
abc
(1 row)
CREATE FUNCTION concat_values(text, VARIADIC anyarray) RETURNS text AS $$
SELECT array_to_string($2, $1);
$$ LANGUAGE SQL;
SELECT concat_values('|', 1, 4, 2);
concat_values
---------------
1|4|2
(1 row)
38.5.12. SQL Functions with Collations #
如果 SQL 函数具有一个或多个具有可整理数据类型的参数,则将根据分配给实际参数的整理规则为每个函数调用识别整理规则,如 Section 24.2 中所述。如果已成功识别整理规则(即,参数之间不存在隐式整理规则的冲突),那么所有可整理参数都将被视为隐式具有该整理规则。这将影响函数中与整理规则相关操作的行为。例如,使用上面描述的 anyleast 函数,结果为
When an SQL function has one or more parameters of collatable data types, a collation is identified for each function call depending on the collations assigned to the actual arguments, as described in Section 24.2. If a collation is successfully identified (i.e., there are no conflicts of implicit collations among the arguments) then all the collatable parameters are treated as having that collation implicitly. This will affect the behavior of collation-sensitive operations within the function. For example, using the anyleast function described above, the result of
SELECT anyleast('abc'::text, 'ABC');
将取决于数据库的默认排序规则。在 C 区域设置中,结果将为 ABC,但在许多其他区域设置中,结果将为 abc。可以通过向任何参数添加 COLLATE 子句来强制使用排序规则,例如
will depend on the database’s default collation. In C locale the result will be ABC, but in many other locales it will be abc. The collation to use can be forced by adding a COLLATE clause to any of the arguments, for example
SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
或者,如果您希望函数无论在何种情况下调用都能使用特定的排序规则,请根据需要在函数定义中插入 COLLATE 子句。此 anyleast 版本将始终使用 en_US 区域设置来比较字符串:
Alternatively, if you wish a function to operate with a particular collation regardless of what it is called with, insert COLLATE clauses as needed in the function definition. This version of anyleast would always use en_US locale to compare strings:
CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;
但请注意,如果应用于不可整理的数据类型,则会引发错误。
But note that this will throw an error if applied to a non-collatable data type.
如果实际参数之间无法识别出通用排序规则,则 SQL 函数将其参数视为具有其数据类型的默认排序规则(通常是数据库的默认排序规则,但对于域类型的参数可能不同)。
If no common collation can be identified among the actual arguments, then an SQL function treats its parameters as having their data types' default collation (which is usually the database’s default collation, but could be different for parameters of domain types).
可整理参数的行为可以被认为是多态性的一种有限形式,仅适用于文本数据类型。
The behavior of collatable parameters can be thought of as a limited form of polymorphism, applicable only to textual data types.