Postgresql 中文操作指南
8.16. Composite Types #
composite type 表示行或记录的结构;它本质上只是一份字段名称及其数据类型列表。PostgreSQL 允许复合类型以与简单类型相同的方式使用。例如,可以将表的列声明为复合类型。
A composite type represents the structure of a row or record; it is essentially just a list of field names and their data types. PostgreSQL allows composite types to be used in many of the same ways that simple types can be used. For example, a column of a table can be declared to be of a composite type.
8.16.1. Declaration of Composite Types #
以下是定义复合类型两个简单的示例:
Here are two simple examples of defining composite types:
CREATE TYPE complex AS (
r double precision,
i double precision
);
CREATE TYPE inventory_item AS (
name text,
supplier_id integer,
price numeric
);
语法与 CREATE TABLE 类似,只是只能指定字段名称和类型;目前不能包括约束(例如 NOT NULL)。请注意,AS 关键词必不可少;如果没有它,系统会认为这意味着不同类型的 CREATE TYPE 命令,并且您会遇到奇怪的语法错误。
The syntax is comparable to CREATE TABLE, except that only field names and types can be specified; no constraints (such as NOT NULL) can presently be included. Note that the AS keyword is essential; without it, the system will think a different kind of CREATE TYPE command is meant, and you will get odd syntax errors.
在定义类型后,我们可以使用它们来创建表:
Having defined the types, we can use them to create tables:
CREATE TABLE on_hand (
item inventory_item,
count integer
);
INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);
或函数:
or functions:
CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
AS 'SELECT $1.price * $2' LANGUAGE SQL;
SELECT price_extension(item, 10) FROM on_hand;
每当你创建表时,也将自动创建一个复合类型,与表的同名,用于表示表的行类型。例如,我们曾说过:
Whenever you create a table, a composite type is also automatically created, with the same name as the table, to represent the table’s row type. For example, had we said:
CREATE TABLE inventory_item (
name text,
supplier_id integer REFERENCES suppliers,
price numeric CHECK (price > 0)
);
那么,上面显示的 inventory_item 复合类型将作为副产品而生成,并且可以像上面一样使用。但是,请注意当前实现的一个重要限制:由于没有约束与复合类型相关联,因此表定义 do not apply 中显示的约束对表外部复合类型的取值无效。(要解决此问题,请创建复合类型上的 [role="bare"]glossary.html#GLOSSARY-DOMAIN domain ,并将所需的约束应用为该域的 CHECK 约束。)
then the same inventory_item composite type shown above would come into being as a byproduct, and could be used just as above. Note however an important restriction of the current implementation: since no constraints are associated with a composite type, the constraints shown in the table definition do not apply to values of the composite type outside the table. (To work around this, create a [role="bare"]glossary.html#GLOSSARY-DOMAINdomain over the composite type, and apply the desired constraints as CHECK constraints of the domain.)
8.16.2. Constructing Composite Values #
要将复合值写入为文字常量,请将字段值括在括号中并用逗号分隔。您可以在任何字段值周围加上双引号,如果它包含逗号或括号,则必须这样做。(更多详细信息请参见 below)。因此,复合常数的通用格式如下:
To write a composite value as a literal constant, enclose the field values within parentheses and separate them by commas. You can put double quotes around any field value, and must do so if it contains commas or parentheses. (More details appear below.) Thus, the general format of a composite constant is the following:
'( val1 , val2 , ... )'
一个示例是:
An example is:
'("fuzzy dice",42,1.99)'
这将是上面定义的 inventory_item 类型的有效值。要使字段为 NULL,请在其在列表中的位置不写任何字符。例如,此常量指定了一个 NULL 的第三字段:
which would be a valid value of the inventory_item type defined above. To make a field be NULL, write no characters at all in its position in the list. For example, this constant specifies a NULL third field:
'("fuzzy dice",42,)'
如果你想要一个空字符串而不是 NULL,请写入双引号:
If you want an empty string rather than NULL, write double quotes:
'("",42,)'
此处第一个字段是非 NULL 的空字符串,第三个为 NULL。
Here the first field is a non-NULL empty string, the third is NULL.
(这些常量实际上只是 Section 4.1.2.7 中讨论的泛型类型常量的一个特例。最初将该常量视为一个字符串,并将其传递给复合类型输入转换例程。明确的类型规范可能が必要,以告诉将常量转换为哪种类型。)
(These constants are actually only a special case of the generic type constants discussed in Section 4.1.2.7. The constant is initially treated as a string and passed to the composite-type input conversion routine. An explicit type specification might be necessary to tell which type to convert the constant to.)
也可以使用 ROW 表达式语法来构造复合值。在大多数情况下,这比字符串字面量语法简单得多,因为你无需担心多层引号。我们已经在上面使用了此方法:
The ROW expression syntax can also be used to construct composite values. In most cases this is considerably simpler to use than the string-literal syntax since you don’t have to worry about multiple layers of quoting. We already used this method above:
ROW('fuzzy dice', 42, 1.99)
ROW('', 42, NULL)
只要在表达式中有多个字段,ROW 关键字实际上是可选的,因此可以将其简化为:
The ROW keyword is actually optional as long as you have more than one field in the expression, so these can be simplified to:
('fuzzy dice', 42, 1.99)
('', 42, NULL)
ROW 表达式语法在 Section 4.2.13 中进行了更详细的讨论。
The ROW expression syntax is discussed in more detail in Section 4.2.13.
8.16.3. Accessing Composite Types #
要访问复合列的字段,请写一个点和字段名称,就像从表名中选择一个字段一样。事实上,它很像从表名中选择,所以你通常必须使用括号来避免混淆解析器。例如,你可以尝试通过如下方式从我们的 on_hand 示例表中选择一些子字段:
To access a field of a composite column, one writes a dot and the field name, much like selecting a field from a table name. In fact, it’s so much like selecting from a table name that you often have to use parentheses to keep from confusing the parser. For example, you might try to select some subfields from our on_hand example table with something like:
SELECT item.name FROM on_hand WHERE item.price > 9.99;
这将不起作用,因为根据 SQL 语法规则,名称 item 被视为表名,而不是 on_hand 的列名。你必须像这样写它:
This will not work since the name item is taken to be a table name, not a column name of on_hand, per SQL syntax rules. You must write it like this:
SELECT (item).name FROM on_hand WHERE (item).price > 9.99;
或者,如果你需要使用表名(例如,在多表查询中),请像这样写:
or if you need to use the table name as well (for instance in a multitable query), like this:
SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;
现在,将括号中的对象正确解释为对 item 列的引用,然后可以从中选择子字段。
Now the parenthesized object is correctly interpreted as a reference to the item column, and then the subfield can be selected from it.
从复合值中选择字段时,会应用类似的语法问题。例如,要仅从返回复合值的函数的结果中选择一个字段,你需要编写类似这样的内容:
Similar syntactic issues apply whenever you select a field from a composite value. For instance, to select just one field from the result of a function that returns a composite value, you’d need to write something like:
SELECT (my_func(...)).field FROM ...
如果没有额外的括号,这将产生语法错误。
Without the extra parentheses, this will generate a syntax error.
特殊字段名称 * 表示“所有字段”,如 Section 8.16.5 中所述。
The special field name * means “all fields”, as further explained in Section 8.16.5.
8.16.4. Modifying Composite Types #
以下是插入和更新复合列的正确语法的一些示例。首先,插入或更新整个列:
Here are some examples of the proper syntax for inserting and updating composite columns. First, inserting or updating a whole column:
INSERT INTO mytab (complex_col) VALUES((1.1,2.2));
UPDATE mytab SET complex_col = ROW(1.1,2.2) WHERE ...;
第一个示例省略了 ROW,第二个示例使用了它;我们两种方式都可以那样做。
The first example omits ROW, the second uses it; we could have done it either way.
我们可以更新复合列的各个子字段:
We can update an individual subfield of a composite column:
UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...;
请注意,我们不需要(实际上也不可以)将括号放在 SET 后面出现的列名周围,但在等号右侧的表达式中引用同一列时,我们需要括号。
Notice here that we don’t need to (and indeed cannot) put parentheses around the column name appearing just after SET, but we do need parentheses when referencing the same column in the expression to the right of the equal sign.
我们也可以将子字段指定为 INSERT 的目标:
And we can specify subfields as targets for INSERT, too:
INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
如果我们没有为列的所有子字段提供值,那么剩余的子字段将被填充为空值。
Had we not supplied values for all the subfields of the column, the remaining subfields would have been filled with null values.
8.16.5. Using Composite Types in Queries #
查询中与复合类型关联的特殊语法规则和行为。这些规则提供了有用的捷径,但如果你不知道它们背后的逻辑,可能会感到困惑。
There are various special syntax rules and behaviors associated with composite types in queries. These rules provide useful shortcuts, but can be confusing if you don’t know the logic behind them.
在 PostgreSQL 中,查询中对表名(或别名)的引用实际上是对表的当前行的复合值的引用。例如,如果我们有一个表 inventory_item,如 above 所示,我们可以写:
In PostgreSQL, a reference to a table name (or alias) in a query is effectively a reference to the composite value of the table’s current row. For example, if we had a table inventory_item as shown above, we could write:
SELECT c FROM inventory_item c;
此查询生成单个复合值列,因此我们可能会得到如下输出:
This query produces a single composite-valued column, so we might get output like:
c
------------------------
("fuzzy dice",42,1.99)
(1 row)
但请注意,简单名称在表名前与列名匹配,因此此示例仅适用于查询的表中没有名为 c 的列。
Note however that simple names are matched to column names before table names, so this example works only because there is no column named c in the query’s tables.
普通的限定列名语法 table_name.column_name 可以理解为将 field selection 应用于表当前行的复合值。(出于效率的原因,实际上并未以这种方式实施。)
The ordinary qualified-column-name syntax table_name.column_name can be understood as applying field selection to the composite value of the table’s current row. (For efficiency reasons, it’s not actually implemented that way.)
当我们编写
When we write
SELECT c.* FROM inventory_item c;
然后,根据 SQL 标准,我们应该将表的内容展开为单独的列:
then, according to the SQL standard, we should get the contents of the table expanded into separate columns:
name | supplier_id | price
------------+-------------+-------
fuzzy dice | 42 | 1.99
(1 row)
就像查询是
as if the query were
SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
PostgreSQL 将这种扩展行为应用于任何复合值表达式,尽管如 above 所示,你需要在该表达式周围加上括号,该值应用于 .*,无论何时它不是一个简单的表名。例如,如果 myfunc() 是一个返回复合类型的函数,其中包含列 a、b、和 c,那么这些两个查询具有相同的结果:
PostgreSQL will apply this expansion behavior to any composite-valued expression, although as shown above, you need to write parentheses around the value that .* is applied to whenever it’s not a simple table name. For example, if myfunc() is a function returning a composite type with columns a, b, and c, then these two queries have the same result:
SELECT (myfunc(x)).* FROM some_table;
SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
Tip
PostgreSQL 通过实际将第一种形式转换为第二种形式来处理列展开。因此,在此示例中,myfunc() 使用任何一种语法都会被每行调用三次。如果这是一个昂贵的函数,你可能希望避免这种情况,你可以使用像这样的查询:
PostgreSQL handles column expansion by actually transforming the first form into the second. So, in this example, myfunc() would get invoked three times per row with either syntax. If it’s an expensive function you may wish to avoid that, which you can do with a query like:
SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;
将函数放在 LATERAL FROM 项中可以防止它每行被调用多次。m.* 仍然展开为 m.a, m.b, m.c,但现在这些变量只是 FROM 项的输出的引用。(这里 LATERAL 关键字是可选的,但我们显示它是为了澄清该函数从 some_table 获取了 x。)
Placing the function in a LATERAL FROM item keeps it from being invoked more than once per row. m.* is still expanded into m.a, m.b, m.c, but now those variables are just references to the output of the FROM item. (The LATERAL keyword is optional here, but we show it to clarify that the function is getting x from some_table.)
当 composite_value_ .*_ 语法出现在 SELECT output list 、 INSERT / UPDATE / DELETE 中的 RETURNING list 、 VALUES clause 或 row constructor 的顶层时,会导致这种类型的列展开。在所有其他情况下(包括嵌套在其中一种结构的内部时),将 .* 附加到复合值不会更改该值,因为这意味着“所有列”,因此再次生成相同的复合值。例如,如果 somefunc() 接受复合值参数,则以下查询相同:
The composite_value_.*_ syntax results in column expansion of this kind when it appears at the top level of a SELECT output list, a RETURNING list in INSERT/UPDATE/DELETE, a VALUES clause, or a row constructor. In all other contexts (including when nested inside one of those constructs), attaching .* to a composite value does not change the value, since it means “all columns” and so the same composite value is produced again. For example, if somefunc() accepts a composite-valued argument, these queries are the same:
SELECT somefunc(c.*) FROM inventory_item c;
SELECT somefunc(c) FROM inventory_item c;
在两种情况下,当前的 inventory_item 行都会被作为单一复合型参数传递至该函数。即使 .* 在此类情况下无用,但使用它是一种良好的风格,因为它明确表示了一个复合型值。特别是,解析器会认为 c.* 中的 c 指的是一个表名或别名,而不是一个列名,因此没有歧义;如果没有 .*,则不清楚 c 是指表名还是列名,事实上,如果有一个名为 c 的列,则会优先选择列名解析。
In both cases, the current row of inventory_item is passed to the function as a single composite-valued argument. Even though .* does nothing in such cases, using it is good style, since it makes clear that a composite value is intended. In particular, the parser will consider c in c.* to refer to a table name or alias, not to a column name, so that there is no ambiguity; whereas without .*, it is not clear whether c means a table name or a column name, and in fact the column-name interpretation will be preferred if there is a column named c.
另一个展示这些概念的示例是,所有这些查询都具有相同含义:
Another example demonstrating these concepts is that all these queries mean the same thing:
SELECT * FROM inventory_item c ORDER BY c;
SELECT * FROM inventory_item c ORDER BY c.*;
SELECT * FROM inventory_item c ORDER BY ROW(c.*);
所有这些 ORDER BY 子句指定行复合值,最终根据 Section 9.24.6 中所述的规则对行进行排序。但是,如果 inventory_item 包含一个名为 c 的列,则第一种情况会与其他情况不同,因为它表示仅按该列进行排序。给定先前显示的列名,这些查询也等效于上述查询:
All of these ORDER BY clauses specify the row’s composite value, resulting in sorting the rows according to the rules described in Section 9.24.6. However, if inventory_item contained a column named c, the first case would be different from the others, as it would mean to sort by that column only. Given the column names previously shown, these queries are also equivalent to those above:
SELECT * FROM inventory_item c ORDER BY ROW(c.name, c.supplier_id, c.price);
SELECT * FROM inventory_item c ORDER BY (c.name, c.supplier_id, c.price);
(最后一个情况使用了一个省略了关键字 ROW 的行构造器。)
(The last case uses a row constructor with the key word ROW omitted.)
与复合值关联的另一个特殊语法行为是我们可以使用 functional notation 提取复合值的一个字段。解释此方法的简单方法是 field(table)_ and _ table_. field_ 的表示法是可互换的。例如,以下查询等效:
Another special syntactical behavior associated with composite values is that we can use functional notation for extracting a field of a composite value. The simple way to explain this is that the notations field_(table)_ and table_.field_ are interchangeable. For example, these queries are equivalent:
SELECT c.name FROM inventory_item c WHERE c.price > 1000;
SELECT name(c) FROM inventory_item c WHERE price(c) > 1000;
此外,如果我们有一个接受一个复合类型单一参数的函数,则可以使用这两种符号中的任何一种调用它。这些查询都是等效的:
Moreover, if we have a function that accepts a single argument of a composite type, we can call it with either notation. These queries are all equivalent:
SELECT somefunc(c) FROM inventory_item c;
SELECT somefunc(c.*) FROM inventory_item c;
SELECT c.somefunc FROM inventory_item c;
函数符号和字段符号之间的这种等效性使得可以在复合类型上使用函数来实现“计算字段”。使用以上最后一个查询的应用程序无需直接意识到 somefunc 不是该表的真实列。
This equivalence between functional notation and field notation makes it possible to use functions on composite types to implement “computed fields”. An application using the last query above wouldn’t need to be directly aware that somefunc isn’t a real column of the table.
Tip
由于此行为,不建议向采用单个复合类型参数的函数赋予与该复合类型的任何字段相同的名称。如果存在歧义,则在使用字段名语法时将选择字段名解释,而在使用函数调用语法时将选择函数。但是,在 PostgreSQL 11 之前的版本中,总是选择字段名解释,除非调用语法要求它成为函数调用。在旧版本中强制执行函数解释的一种方法是对函数名称进行模式限定,即写 _schema . func ( compositevalue )_。
Because of this behavior, it’s unwise to give a function that takes a single composite-type argument the same name as any of the fields of that composite type. If there is ambiguity, the field-name interpretation will be chosen if field-name syntax is used, while the function will be chosen if function-call syntax is used. However, PostgreSQL versions before 11 always chose the field-name interpretation, unless the syntax of the call required it to be a function call. One way to force the function interpretation in older versions is to schema-qualify the function name, that is, write _schema.func(compositevalue)_.
8.16.6. Composite Type Input and Output Syntax #
复合值的外部文本表示形式由根据各个字段类型 I/O 转换规则解释的项目组成,加上指示复合结构的装饰。装饰包括圆括号 (( 和 )),以及相邻项目之间的逗号 (,)。圆括号外的空格会被忽略,但在括号内它会被视为字段值的一部分,并且根据字段数据类型的输入转换规则,它可能会有或没有意义。例如,在:
The external text representation of a composite value consists of items that are interpreted according to the I/O conversion rules for the individual field types, plus decoration that indicates the composite structure. The decoration consists of parentheses (( and )) around the whole value, plus commas (,) between adjacent items. Whitespace outside the parentheses is ignored, but within the parentheses it is considered part of the field value, and might or might not be significant depending on the input conversion rules for the field data type. For example, in:
'( 42)'
如果字段类型为整数,则空格将被忽略,但如果字段类型为文本,则不会被忽略。
the whitespace will be ignored if the field type is integer, but not if it is text.
如前所述,在编写复合值时,可以在任何单个字段值周围加上双引号。如果你 must 这样做,字段值可能会混淆复合值解析器。特别是,包含圆括号、逗号、双引号或反斜杠的字段必须用双引号引起来。要在引用的复合字段值中放置双引号或反斜杠,请在其前面加上一个反斜杠。(此外,双引号字段值中的双引号对被视为表示一个双引号字符,类似于 SQL 文本字符串中单引号的规则。)或者,你可以避免使用引号并使用反斜杠转义来保护所有否则会被视为复合语法的字符。
As shown previously, when writing a composite value you can write double quotes around any individual field value. You must do so if the field value would otherwise confuse the composite-value parser. In particular, fields containing parentheses, commas, double quotes, or backslashes must be double-quoted. To put a double quote or backslash in a quoted composite field value, precede it with a backslash. (Also, a pair of double quotes within a double-quoted field value is taken to represent a double quote character, analogously to the rules for single quotes in SQL literal strings.) Alternatively, you can avoid quoting and use backslash-escaping to protect all data characters that would otherwise be taken as composite syntax.
完全空值(逗号或圆括号之间没有任何字符)表示 NULL。要编写一个值(它是一个空字符串而不是 NULL),请编写 ""。
A completely empty field value (no characters at all between the commas or parentheses) represents a NULL. To write a value that is an empty string rather than NULL, write "".
如果复合输出例程是空字符串或包含圆括号、逗号、双引号、反斜杠或空格,它将用双引号将字段值引起来。(对空格进行此类处理不是必要的,但有助于提高可读性。)嵌入在字段值中的双引号和反斜杠将被加倍。
The composite output routine will put double quotes around field values if they are empty strings or contain parentheses, commas, double quotes, backslashes, or white space. (Doing so for white space is not essential, but aids legibility.) Double quotes and backslashes embedded in field values will be doubled.
Note
请记住,SQL 命令中编写的内容将首先解释为字符串文字,然后解释为复合体。这会使所需的反斜杠数量翻倍(假定使用了转义字符串语法)。例如,要在复合值中插入一个包含双引号和反斜杠的 text 字段,则需要编写:
Remember that what you write in an SQL command will first be interpreted as a string literal, and then as a composite. This doubles the number of backslashes you need (assuming escape string syntax is used). For example, to insert a text field containing a double quote and a backslash in a composite value, you’d need to write:
INSERT ... VALUES ('("\"\\")');
字符串字面量处理器移除一级反斜杠,以便到达复合值解析器的内容看起来像 ("\"\\")。反过来,馈送至 text 数据类型输入例程的字符串变为 "\。(如果我们正在处理一个输入例程也专门处理反斜杠的数据类型,例如 bytea,则我们可能需要八个反斜杠在命令中输入一个反斜杠进入存储的复合字段。)可以使用 Dollar 引用(请参阅 Section 4.1.2.4)来避免需要加倍反斜杠。
The string-literal processor removes one level of backslashes, so that what arrives at the composite-value parser looks like ("\"\\"). In turn, the string fed to the text data type’s input routine becomes "\. (If we were working with a data type whose input routine also treated backslashes specially, bytea for example, we might need as many as eight backslashes in the command to get one backslash into the stored composite field.) Dollar quoting (see Section 4.1.2.4) can be used to avoid the need to double backslashes.
Tip
在 SQL 命令中编写复合值时,ROW 构造器语法通常比复合文字语法更容易处理。在 ROW 中,单个字段值以非复合成员方式编写时的相同方式编写。
The ROW constructor syntax is usually easier to work with than the composite-literal syntax when writing composite values in SQL commands. In ROW, individual field values are written the same way they would be written when not members of a composite.