PostgreSQL Chinese Operation Guide

F.38. postgres_fdw — access data stored in external PostgreSQL servers

The postgres_fdw module provides the foreign-data wrapper postgres_fdw, which can be used to access data stored in external PostgreSQL servers.

The functionality provided by this module overlaps substantially with the functionality of the older dblink module. But postgres_fdw provides more transparent and standards-compliant syntax for accessing remote tables, and can give better performance in many cases.

To prepare for remote access using postgres_fdw:
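In outline, preparation means installing the extension, creating a foreign server object, creating a user mapping, and creating a foreign table for each remote table to be accessed. The following is a minimal sketch of those steps; the server name, host address, database, role names, password, and table definition are all hypothetical placeholders, not values taken from this section.

```sql
-- 1. Install the extension in the local database.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- 2. Describe the remote server (host, port, and dbname are placeholders).
CREATE SERVER foreign_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '192.0.2.10', port '5432', dbname 'foreign_db');

-- 3. Map a local role to the remote role used for the connection.
CREATE USER MAPPING FOR local_user
    SERVER foreign_server
    OPTIONS (user 'foreign_user', password 'secret');

-- 4. Declare a foreign table whose columns match the remote table by name.
CREATE FOREIGN TABLE foreign_table (
    id   integer NOT NULL,
    data text
)
    SERVER foreign_server
    OPTIONS (schema_name 'some_schema', table_name 'some_table');
```

IMPORT FOREIGN SCHEMA can be used instead of step 4 when many remote tables need to be declared at once.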

Now you need only SELECT from a foreign table to access the data stored in its underlying remote table. You can also modify the remote table using INSERT, UPDATE, DELETE, COPY, or TRUNCATE. (Of course, the remote user you have specified in your user mapping must have privileges to do these things.)

Note that the ONLY option specified in SELECT, UPDATE, DELETE or TRUNCATE has no effect when accessing or modifying the remote table.

Note that postgres_fdw currently lacks support for INSERT statements with an ON CONFLICT DO UPDATE clause. However, the ON CONFLICT DO NOTHING clause is supported, provided a unique index inference specification is omitted. Note also that postgres_fdw supports row movement invoked by UPDATE statements executed on partitioned tables, but it currently does not handle the case where a remote partition chosen to insert a moved row into is also an UPDATE target partition that will be updated elsewhere in the same command.
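As an illustration of the supported and unsupported forms, the sketch below assumes a hypothetical foreign table named foreign_table with columns id and data.

```sql
-- Reading through the foreign table works like reading a local table.
SELECT id, data FROM foreign_table WHERE id < 100;

-- Supported: DO NOTHING without a unique index inference specification.
INSERT INTO foreign_table (id, data)
VALUES (1, 'first')
ON CONFLICT DO NOTHING;

-- Not supported, per the note above (shown as comments only):
--   INSERT ... ON CONFLICT (id) DO NOTHING;        -- inference specification
--   INSERT ... ON CONFLICT (id) DO UPDATE SET ...; -- DO UPDATE clause
```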

It is generally recommended that the columns of a foreign table be declared with exactly the same data types, and collations if applicable, as the referenced columns of the remote table. Although postgres_fdw is currently rather forgiving about performing data type conversions at need, surprising semantic anomalies may arise when types or collations do not match, due to the remote server interpreting query conditions differently from the local server.

Note that a foreign table can be declared with fewer columns, or with a different column order, than its underlying remote table has. Matching of columns to the remote table is by name, not position.

F.38.1. FDW Options of postgres_fdw

F.38.1.1. Connection Options

A foreign server using the postgres_fdw foreign data wrapper can have the same options that libpq accepts in connection strings, as described in Section 34.1.2, except that these options are not allowed or have special handling:

Only superusers may create or modify user mappings with the sslcert or sslkey settings.

Non-superusers may connect to foreign servers using password authentication or with GSSAPI delegated credentials, so specify the password option for user mappings belonging to non-superusers where password authentication is required.

A superuser may override this check on a per-user-mapping basis by setting the user mapping option password_required 'false', e.g.,

ALTER USER MAPPING FOR some_non_superuser SERVER loopback_nopw
OPTIONS (ADD password_required 'false');

To prevent unprivileged users from exploiting the authentication rights of the Unix user the postgres server is running as to escalate to superuser rights, only the superuser may set this option on a user mapping.

Care is required to ensure that this does not allow the mapped user the ability to connect as superuser to the mapped database per CVE-2007-3278 and CVE-2007-6601. Don’t set password_required=false on the public role. Keep in mind that the mapped user can potentially use any client certificates, .pgpass, .pg_service.conf etc. in the Unix home directory of the system user the postgres server runs as. They can also use any trust relationship granted by authentication modes like peer or ident authentication.

F.38.1.2. Object Name Options

These options can be used to control the names used in SQL statements sent to the remote PostgreSQL server. These options are needed when a foreign table is created with names different from the underlying remote table’s names.

  • schema_name (string)

    • This option, which can be specified for a foreign table, gives the schema name to use for the foreign table on the remote server. If this option is omitted, the name of the foreign table’s schema is used.

  • table_name (string)

    • This option, which can be specified for a foreign table, gives the table name to use for the foreign table on the remote server. If this option is omitted, the foreign table’s name is used.

  • column_name (string)

    • This option, which can be specified for a column of a foreign table, gives the column name to use for the column on the remote server. If this option is omitted, the column’s name is used.
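A short sketch of the three options used together follows. It assumes a hypothetical remote table sales.orders with columns id and comment, exposed locally under different names; foreign_server is likewise a placeholder.

```sql
-- Local names (local_orders, order_id, note) differ from the remote ones,
-- so each is mapped explicitly with an object name option.
CREATE FOREIGN TABLE local_orders (
    order_id integer OPTIONS (column_name 'id'),
    note     text    OPTIONS (column_name 'comment')
)
    SERVER foreign_server
    OPTIONS (schema_name 'sales', table_name 'orders');
```

A query against local_orders.order_id would then be sent to the remote server as a reference to sales.orders.id.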

F.38.1.3. Cost Estimation Options

postgres_fdw retrieves remote data by executing queries against remote servers, so ideally the estimated cost of scanning a foreign table should be whatever it costs to be done on the remote server, plus some overhead for communication. The most reliable way to get such an estimate is to ask the remote server and then add something for overhead — but for simple queries, it may not be worth the cost of an additional remote query to get a cost estimate. So postgres_fdw provides the following options to control how cost estimation is done:

  • use_remote_estimate (boolean)

    • This option, which can be specified for a foreign table or a foreign server, controls whether postgres_fdw issues remote EXPLAIN commands to obtain cost estimates. A setting for a foreign table overrides any setting for its server, but only for that table. The default is false.

  • fdw_startup_cost (floating point)

    • This option, which can be specified for a foreign server, is a floating point value that is added to the estimated startup cost of any foreign-table scan on that server. This represents the additional overhead of establishing a connection, parsing and planning the query on the remote side, etc. The default value is 100.

  • fdw_tuple_cost (floating point)

    • This option, which can be specified for a foreign server, is a floating point value that is used as extra cost per-tuple for foreign-table scans on that server. This represents the additional overhead of data transfer between servers. You might increase or decrease this number to reflect higher or lower network delay to the remote server. The default value is 0.01.

When use_remote_estimate is true, postgres_fdw obtains row count and cost estimates from the remote server and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. When use_remote_estimate is false, postgres_fdw performs local row count and cost estimation and then adds fdw_startup_cost and fdw_tuple_cost to the cost estimates. This local estimation is unlikely to be very accurate unless local copies of the remote table’s statistics are available. Running ANALYZE on the foreign table is the way to update the local statistics; this will perform a scan of the remote table and then calculate and store statistics just as though the table were local. Keeping local statistics can be a useful way to reduce per-query planning overhead for a remote table — but if the remote table is frequently updated, the local statistics will soon be obsolete.

The following option controls how such an ANALYZE operation behaves:

  • analyze_sampling (string)

    • This option, which can be specified for a foreign table or a foreign server, determines if ANALYZE on a foreign table samples the data on the remote side, or reads and transfers all data and performs the sampling locally. The supported values are off, random, system, bernoulli and auto. off disables remote sampling, so all data are transferred and sampled locally. random performs remote sampling using the random() function to choose returned rows, while system and bernoulli rely on the built-in TABLESAMPLE methods of those names. random works on all remote server versions, while TABLESAMPLE is supported only since 9.5. auto (the default) picks the recommended sampling method automatically; currently it means either bernoulli or random depending on the remote server version.
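The cost estimation options could be combined along the following lines. This is only a sketch: foreign_server and foreign_table are hypothetical names, the numeric values are arbitrary, and ADD assumes the option has not been set before (use SET to change an existing value).

```sql
-- Server-wide: ask the remote server for estimates and raise the per-row
-- transfer cost to reflect a slower network link.
ALTER SERVER foreign_server
    OPTIONS (ADD use_remote_estimate 'true', ADD fdw_tuple_cost '0.05');

-- Per-table override: plan this table from local statistics instead,
-- and sample on the remote side when gathering them.
ALTER FOREIGN TABLE foreign_table
    OPTIONS (ADD use_remote_estimate 'false', ADD analyze_sampling 'bernoulli');

-- Refresh the local copy of the remote table's statistics.
ANALYZE foreign_table;
```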

F.38.1.4. Remote Execution Options

By default, only WHERE clauses using built-in operators and functions will be considered for execution on the remote server. Clauses involving non-built-in functions are checked locally after rows are fetched. If such functions are available on the remote server and can be relied on to produce the same results as they do locally, performance can be improved by sending such WHERE clauses for remote execution. This behavior can be controlled using the following option:

  • extensions (string)

    • This option is a comma-separated list of names of PostgreSQL extensions that are installed, in compatible versions, on both the local and remote servers. Functions and operators that are immutable and belong to a listed extension will be considered shippable to the remote server. This option can only be specified for foreign servers, not per-table.

    • When using the extensions option, it is the user’s responsibility that the listed extensions exist and behave identically on both the local and remote servers. Otherwise, remote queries may fail or behave unexpectedly.

  • fetch_size (integer)

    • This option specifies the number of rows postgres_fdw should get in each fetch operation. It can be specified for a foreign table or a foreign server. The option specified on a table overrides an option specified for the server. The default is 100.

  • batch_size (integer)

    • This option specifies the number of rows postgres_fdw should insert in each insert operation. It can be specified for a foreign table or a foreign server. The option specified on a table overrides an option specified for the server. The default is 1.

    • Note the actual number of rows postgres_fdw inserts at once depends on the number of columns and the provided batch_size value. The batch is executed as a single query, and the libpq protocol (which postgres_fdw uses to connect to a remote server) limits the number of parameters in a single query to 65535. When the number of columns * batch_size exceeds the limit, the batch_size will be adjusted to avoid an error.

    • This option also applies when copying into foreign tables. In that case the actual number of rows postgres_fdw copies at once is determined in a similar way to the insert case, but it is limited to at most 1000 due to implementation restrictions of the COPY command.
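The sketch below sets the three options and then works through the batch size limit. The object names and the choice of pg_trgm are assumptions made for illustration, and the final query assumes the adjustment is simply the 65535-parameter limit divided by the number of inserted columns.

```sql
-- Treat immutable pg_trgm functions and operators as shippable,
-- and fetch rows from this server in larger chunks.
ALTER SERVER foreign_server
    OPTIONS (ADD extensions 'pg_trgm', ADD fetch_size '1000');

-- Send inserts to this table in batches rather than row by row.
ALTER FOREIGN TABLE foreign_table
    OPTIONS (ADD batch_size '1000');

-- Worked example of the cap: inserting into 100 columns with
-- batch_size 1000 would need 100000 parameters, so the batch shrinks.
SELECT LEAST(1000, 65535 / 100) AS effective_batch_size;  -- 655
```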

F.38.1.5. Asynchronous Execution Options

postgres_fdw supports asynchronous execution, which runs multiple parts of an Append node concurrently rather than serially to improve performance. This execution can be controlled using the following option:

  • async_capable (boolean)

    • This option controls whether postgres_fdw allows foreign tables to be scanned concurrently for asynchronous execution. It can be specified for a foreign table or a foreign server. A table-level option overrides a server-level option. The default is false.

    • In order to ensure that the data being returned from a foreign server is consistent, postgres_fdw will only open one connection for a given foreign server and will run all queries against that server sequentially even if there are multiple foreign tables involved, unless those tables are subject to different user mappings. In such a case, it may be more performant to disable this option to eliminate the overhead associated with running queries asynchronously.

    • Asynchronous execution is applied even when an Append node contains subplan(s) executed synchronously as well as subplan(s) executed asynchronously. In such a case, if the asynchronous subplans are ones processed using postgres_fdw, tuples from the asynchronous subplans are not returned until after at least one synchronous subplan returns all tuples, as that subplan is executed while the asynchronous subplans are waiting for the results of asynchronous queries sent to foreign servers. This behavior might change in a future release.
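As a rough sketch, asynchronous execution might be enabled for two foreign servers that hold partitions of the same table. The names shard1, shard2, and measurements are hypothetical; measurements is assumed to be a local partitioned table whose partitions are foreign tables on those servers.

```sql
-- Allow foreign tables on both servers to be scanned asynchronously.
ALTER SERVER shard1 OPTIONS (ADD async_capable 'true');
ALTER SERVER shard2 OPTIONS (ADD async_capable 'true');

-- Inspect the plan; with the option enabled, the scans under the Append
-- node would be expected to appear as asynchronous foreign scans.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT count(*) FROM measurements;
```

Because only one connection is opened per foreign server and user mapping, the benefit comes mainly from partitions that live on different servers.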

F.38.1.6. Transaction Management Options

As described in the Transaction Management section, in postgres_fdw transactions are managed by creating corresponding remote transactions, and subtransactions are managed by creating corresponding remote subtransactions. When multiple remote transactions are involved in the current local transaction, by default postgres_fdw commits or aborts those remote transactions serially when the local transaction is committed or aborted. When multiple remote subtransactions are involved in the current local subtransaction, by default postgres_fdw commits or aborts those remote subtransactions serially when the local subtransaction is committed or aborted. Performance can be improved with the following options:

  • parallel_commit (boolean)

    • This option controls whether postgres_fdw commits, in parallel, remote transactions opened on a foreign server in a local transaction when the local transaction is committed. This setting also applies to remote and local subtransactions. This option can only be specified for foreign servers, not per-table. The default is false.

  • parallel_abort (boolean)

    • This option controls whether postgres_fdw aborts, in parallel, remote transactions opened on a foreign server in a local transaction when the local transaction is aborted. This setting also applies to remote and local subtransactions. This option can only be specified for foreign servers, not per-table. The default is false.

If multiple foreign servers with these options enabled are involved in a local transaction, multiple remote transactions on those foreign servers are committed or aborted in parallel across those foreign servers when the local transaction is committed or aborted.

When these options are enabled, a foreign server with many remote transactions may see a negative performance impact when the local transaction is committed or aborted.
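A minimal sketch of enabling both options, assuming two hypothetical foreign servers shard1 and shard2 and foreign tables accounts_s1 and accounts_s2 defined on them:

```sql
ALTER SERVER shard1
    OPTIONS (ADD parallel_commit 'true', ADD parallel_abort 'true');
ALTER SERVER shard2
    OPTIONS (ADD parallel_commit 'true', ADD parallel_abort 'true');

BEGIN;
UPDATE accounts_s1 SET balance = balance - 10 WHERE id = 1;  -- opens a remote transaction on shard1
UPDATE accounts_s2 SET balance = balance + 10 WHERE id = 2;  -- opens a remote transaction on shard2
COMMIT;  -- the two remote transactions are committed in parallel
```

Note that this changes only how the remote commits are issued; it does not make the commit atomic across servers.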

F.38.1.7. Updatability Options

By default all foreign tables using postgres_fdw are assumed to be updatable. This may be overridden using the following option:

  • updatable (boolean)

    • This option controls whether postgres_fdw allows foreign tables to be modified using INSERT, UPDATE and DELETE commands. It can be specified for a foreign table or a foreign server. A table-level option overrides a server-level option. The default is true.

    • Of course, if the remote table is not in fact updatable, an error would occur anyway. Use of this option primarily allows the error to be thrown locally without querying the remote server. Note however that the information_schema views will report a postgres_fdw foreign table to be updatable (or not) according to the setting of this option, without any check of the remote server.
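For example, a foreign table could be marked read-only locally and the effect observed in the information schema. The table name foreign_table is a placeholder.

```sql
-- Reject INSERT, UPDATE, and DELETE locally, without contacting the remote server.
ALTER FOREIGN TABLE foreign_table OPTIONS (ADD updatable 'false');

-- The information_schema view reflects the option, not the remote table.
SELECT table_name, is_insertable_into
FROM information_schema.tables
WHERE table_name = 'foreign_table';
```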

F.38.1.8. Truncatability Options

By default all foreign tables using postgres_fdw are assumed to be truncatable. This may be overridden using the following option:

  • truncatable (boolean)

    • This option controls whether postgres_fdw allows foreign tables to be truncated using the TRUNCATE command. It can be specified for a foreign table or a foreign server. A table-level option overrides a server-level option. The default is true.

    • Of course, if the remote table is not in fact truncatable, an error would occur anyway. Use of this option primarily allows the error to be thrown locally without querying the remote server.

F.38.1.9. Importing Options

postgres_fdw is able to import foreign table definitions using IMPORT FOREIGN SCHEMA. This command creates foreign table definitions on the local server that match tables or views present on the remote server. If the remote tables to be imported have columns of user-defined data types, the local server must have compatible types of the same names.

Importing behavior can be customized with the following options (given in the IMPORT FOREIGN SCHEMA command):

  • import_collate (boolean)

    • This option controls whether column COLLATE options are included in the definitions of foreign tables imported from a foreign server. The default is true. You might need to turn this off if the remote server has a different set of collation names than the local server does, which is likely to be the case if it’s running on a different operating system. If you do so, however, there is a very severe risk that the imported table columns' collations will not match the underlying data, resulting in anomalous query behavior.

    • Even when this parameter is set to true, importing columns whose collation is the remote server’s default can be risky. They will be imported with COLLATE "default", which will select the local server’s default collation, which could be different.

  • import_default (boolean)

    • This option controls whether column DEFAULT expressions are included in the definitions of foreign tables imported from a foreign server. The default is false. If you enable this option, be wary of defaults that might get computed differently on the local server than they would be on the remote server; nextval() is a common source of problems. The IMPORT will fail altogether if an imported default expression uses a function or operator that does not exist locally.

  • import_generated (boolean)

    • This option controls whether column GENERATED expressions are included in the definitions of foreign tables imported from a foreign server. The default is true. The IMPORT will fail altogether if an imported generated expression uses a function or operator that does not exist locally.

  • import_not_null (boolean)

    • This option controls whether column NOT NULL constraints are included in the definitions of foreign tables imported from a foreign server. The default is true.

Note that constraints other than NOT NULL will never be imported from the remote tables. Although PostgreSQL does support check constraints on foreign tables, there is no provision for importing them automatically, because of the risk that a constraint expression could evaluate differently on the local and remote servers. Any such inconsistency in the behavior of a check constraint could lead to hard-to-detect errors in query optimization. So if you wish to import check constraints, you must do so manually, and you should verify the semantics of each one carefully. For more detail about the treatment of check constraints on foreign tables, see CREATE FOREIGN TABLE.

Tables or foreign tables which are partitions of some other table are imported only when they are explicitly specified in LIMIT TO clause. Otherwise they are automatically excluded from IMPORT FOREIGN SCHEMA. Since all data can be accessed through the partitioned table which is the root of the partitioning hierarchy, importing only partitioned tables should allow access to all the data without creating extra objects.
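A sketch of an import that uses these options is shown below. The remote schema public, the table names, the server name, and the local schema imported are all assumptions made for the example.

```sql
CREATE SCHEMA IF NOT EXISTS imported;

-- Import two remote tables, including their column defaults but
-- without COLLATE clauses (for instance, when the remote server runs
-- on a different operating system with different collation names).
IMPORT FOREIGN SCHEMA public
    LIMIT TO (orders, customers)
    FROM SERVER foreign_server
    INTO imported
    OPTIONS (import_default 'true', import_collate 'false');
```

A partition of a remote partitioned table would have to be named in LIMIT TO in the same way to be imported.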

F.38.1.10. Connection Management Options

By default, all connections that postgres_fdw establishes to foreign servers are kept open in the local session for re-use.

  • keep_connections (boolean)

    • This option controls whether postgres_fdw keeps the connections to the foreign server open so that subsequent queries can re-use them. It can only be specified for a foreign server. The default is on. If set to off, all connections to this foreign server will be discarded at the end of each transaction.
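For instance, connection caching could be turned off for a rarely used server, here given the hypothetical name foreign_server:

```sql
-- Discard connections to this server at the end of each transaction.
ALTER SERVER foreign_server OPTIONS (ADD keep_connections 'off');
```

This trades a reconnection cost on each transaction for fewer idle connections held open on the remote side.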

F.38.2. Functions

  • postgres_fdw_get_connections(OUT server_name text, OUT valid boolean) returns setof record

    • This function returns the foreign server names of all the open connections that postgres_fdw established from the local session to the foreign servers. It also returns whether each connection is valid or not. false is returned if the foreign server connection is used in the current local transaction but its foreign server or user mapping is changed or dropped (Note that server name of an invalid connection will be NULL if the server is dropped), and then such invalid connection will be closed at the end of that transaction. true is returned otherwise. If there are no open connections, no record is returned. Example usage of the function:

postgres=# SELECT * FROM postgres_fdw_get_connections() ORDER BY 1;
 server_name | valid
-------------+-------
 loopback1   | t
 loopback2   | f

  • postgres_fdw_disconnect(server_name text) returns boolean

    • This function discards the open connections that are established by postgres_fdw from the local session to the foreign server with the given name. Note that there can be multiple connections to the given server using different user mappings. If the connections are used in the current local transaction, they are not disconnected and warning messages are reported. This function returns true if it disconnects at least one connection, otherwise false. If no foreign server with the given name is found, an error is reported. Example usage of the function:

postgres=# SELECT postgres_fdw_disconnect('loopback1');
 postgres_fdw_disconnect
-------------------------
 t

  • postgres_fdw_disconnect_all() returns boolean

    • This function discards all the open connections that are established by postgres_fdw from the local session to foreign servers. If the connections are used in the current local transaction, they are not disconnected and warning messages are reported. This function returns true if it disconnects at least one connection, otherwise false. Example usage of the function:

postgres=# SELECT postgres_fdw_disconnect_all();
 postgres_fdw_disconnect_all
-----------------------------
 t

F.38.3. Connection Management

postgres_fdw establishes a connection to a foreign server during the first query that uses a foreign table associated with the foreign server. By default this connection is kept and re-used for subsequent queries in the same session. This behavior can be controlled using keep_connections option for a foreign server. If multiple user identities (user mappings) are used to access the foreign server, a connection is established for each user mapping.

When changing the definition of or removing a foreign server or a user mapping, the associated connections are closed. But note that if any connections are in use in the current local transaction, they are kept until the end of the transaction. Closed connections will be re-established when they are necessary by future queries using a foreign table.

Once a connection to a foreign server has been established, it’s by default kept until the local or corresponding remote session exits. To disconnect a connection explicitly, keep_connections option for a foreign server may be disabled, or postgres_fdw_disconnect and postgres_fdw_disconnect_all functions may be used. For example, these are useful to close connections that are no longer necessary, thereby releasing connections on the foreign server.

F.38.4. Transaction Management

During a query that references any remote tables on a foreign server, postgres_fdw opens a transaction on the remote server if one is not already open corresponding to the current local transaction. The remote transaction is committed or aborted when the local transaction commits or aborts. Savepoints are similarly managed by creating corresponding remote savepoints.

The remote transaction uses SERIALIZABLE isolation level when the local transaction has SERIALIZABLE isolation level; otherwise it uses REPEATABLE READ isolation level. This choice ensures that if a query performs multiple table scans on the remote server, it will get snapshot-consistent results for all the scans. A consequence is that successive queries within a single transaction will see the same data from the remote server, even if concurrent updates are occurring on the remote server due to other activities. That behavior would be expected anyway if the local transaction uses SERIALIZABLE or REPEATABLE READ isolation level, but it might be surprising for a READ COMMITTED local transaction. A future PostgreSQL release might modify these rules.
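The potentially surprising case can be sketched as follows, assuming a hypothetical foreign table foreign_table and a second session that commits an insert directly on the remote server between the two queries.

```sql
BEGIN ISOLATION LEVEL READ COMMITTED;

SELECT count(*) FROM foreign_table;  -- remote transaction starts as REPEATABLE READ

-- (another session commits new rows into the remote table here)

SELECT count(*) FROM foreign_table;  -- same count: the remote snapshot is unchanged

COMMIT;

SELECT count(*) FROM foreign_table;  -- new transaction, so the new rows are visible
```

A purely local READ COMMITTED transaction would have seen the new rows in the second query.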

Note that postgres_fdw currently does not support preparing the remote transaction for two-phase commit.

F.38.5. Remote Query Optimization

postgres_fdw attempts to optimize remote queries to reduce the amount of data transferred from foreign servers. This is done by sending query WHERE clauses to the remote server for execution, and by not retrieving table columns that are not needed for the current query. To reduce the risk of misexecution of queries, WHERE clauses are not sent to the remote server unless they use only data types, operators, and functions that are built-in or belong to an extension that’s listed in the foreign server’s extensions option. Operators and functions in such clauses must be IMMUTABLE as well. For an UPDATE or DELETE query, postgres_fdw attempts to optimize the query execution by sending the whole query to the remote server if there are no query WHERE clauses that cannot be sent to the remote server, no local joins for the query, no row-level local BEFORE or AFTER triggers or stored generated columns on the target table, and no CHECK OPTION constraints from parent views. In UPDATE, expressions to assign to target columns must use only built-in data types, IMMUTABLE operators, or IMMUTABLE functions, to reduce the risk of misexecution of the query.

When postgres_fdw encounters a join between foreign tables on the same foreign server, it sends the entire join to the foreign server, unless for some reason it believes that it will be more efficient to fetch rows from each table individually, or unless the table references involved are subject to different user mappings. While sending the JOIN clauses, it takes the same precautions as mentioned above for the WHERE clauses.

The query that is actually sent to the remote server for execution can be examined using EXPLAIN VERBOSE.
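As a sketch, the following query mixes a condition that can be shipped with one that cannot. The table foreign_table and its columns are placeholders.

```sql
-- id = 42 uses only built-in, immutable operators, so it can be sent to
-- the remote server; random() is volatile, so that condition is checked
-- locally after the rows are fetched.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT id, data
FROM foreign_table
WHERE id = 42 AND random() < 0.5;
```

In the output, the Remote SQL line of the Foreign Scan node shows the statement sent to the remote server, and any conditions kept on the local side appear as a filter on that node.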

F.38.6. Remote Query Execution Environment

In the remote sessions opened by postgres_fdw, the search_path parameter is set to just pg_catalog, so that only built-in objects are visible without schema qualification. This is not an issue for queries generated by postgres_fdw itself, because it always supplies such qualification. However, this can pose a hazard for functions that are executed on the remote server via triggers or rules on remote tables. For example, if a remote table is actually a view, any functions used in that view will be executed with the restricted search path. It is recommended to schema-qualify all names in such functions, or else attach SET search_path options (see CREATE FUNCTION) to such functions to establish their expected search path environment.
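One way to apply this recommendation is sketched below. It would be run on the remote server, and the schema app and the function app.format_label(text) are hypothetical.

```sql
-- Pin the search path the function expects, so that it resolves names
-- the same way when invoked from a postgres_fdw session.
ALTER FUNCTION app.format_label(text)
    SET search_path = app, pg_catalog;
```

The same setting can be given in CREATE FUNCTION when the function is first defined.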

postgres_fdw likewise establishes remote session settings for various parameters:

These are less likely to be problematic than search_path, but can be handled with function SET options if the need arises.

It is not recommended that you override this behavior by changing the session-level settings of these parameters; that is likely to cause postgres_fdw to malfunction.

F.38.7. Cross-Version Compatibility

postgres_fdw can be used with remote servers dating back to PostgreSQL 8.3. Read-only capability is available back to 8.1. A limitation however is that postgres_fdw generally assumes that immutable built-in functions and operators are safe to send to the remote server for execution, if they appear in a WHERE clause for a foreign table. Thus, a built-in function that was added since the remote server’s release might be sent to it for execution, resulting in “function does not exist” or a similar error. This type of failure can be worked around by rewriting the query, for example by embedding the foreign table reference in a sub-SELECT with OFFSET 0 as an optimization fence, and placing the problematic function or operator outside the sub-SELECT.
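The workaround can be sketched as follows. The example assumes the foreign table foreign_table from the Examples section (F.38.9) and uses starts_with() purely as an illustration of a built-in function that an older remote server may not have; substitute whichever function or operator triggers the error in practice.

```sql
-- May fail with "function does not exist" if starts_with() is shipped to a
-- remote server that lacks it (assumed here for illustration):
SELECT id, data
FROM foreign_table
WHERE starts_with(data, 'abc');

-- Workaround: OFFSET 0 keeps the planner from flattening the sub-SELECT,
-- so the outer WHERE clause is evaluated locally instead of being sent
-- to the remote server.
SELECT ss.id, ss.data
FROM (SELECT id, data FROM foreign_table OFFSET 0) AS ss
WHERE starts_with(ss.data, 'abc');
```

The trade-off is that the sub-SELECT fetches every row of the remote table before the filter is applied locally, so any conditions that are safe to send remotely are best kept inside the sub-SELECT.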

F.38.8. Configuration Parameters

  • postgres_fdw.application_name (string)

    • Specifies a value for the application_name configuration parameter used when postgres_fdw establishes a connection to a foreign server. This overrides the application_name option of the server object. Note that changing this parameter does not affect existing connections until they are re-established.

    • postgres_fdw.application_name can be any string of any length and may contain non-ASCII characters. However, when it is passed to a foreign server and used as application_name there, it is truncated to less than NAMEDATALEN characters, and anything other than printable ASCII characters is replaced with C-style hexadecimal escapes. See application_name for details.

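    • % characters begin “escape sequences” that are replaced with status information as outlined below. Unrecognized escapes are ignored. Other characters are copied straight to the application name. Note that a plus/minus sign or a numeric literal for alignment and padding may not be specified after the % and before the option.

      • %a: Application name on the local server

      • %c: Session ID on the local server (see log_line_prefix for details)

      • %C: Cluster name on the local server (see cluster_name for details)

      • %u: User name on the local server

      • %d: Database name on the local server

      • %p: Process ID of the backend on the local server

      • %%: Literal %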

    • For example, suppose user local_user establishes a connection from database local_db to foreign_db as user foreign_user. The setting 'db=%d, user=%u' is then replaced with 'db=local_db, user=local_user'.
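A minimal sketch of trying this setting, assuming the foreign table foreign_table and the remote database foreign_db from the Examples section (F.38.9). The first two statements run in the local session; the last one runs on the remote server.

```sql
-- Local session: set the parameter before the foreign server is first
-- accessed, since existing connections keep their old application_name.
SET postgres_fdw.application_name TO 'db=%d, user=%u';

-- Local session: touch the foreign table so postgres_fdw opens a connection.
SELECT count(*) FROM foreign_table;

-- Remote server: inspect what the incoming connections reported.
SELECT application_name
FROM pg_stat_activity
WHERE datname = 'foreign_db';
```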

F.38.9. Examples

Here is an example of creating a foreign table with postgres_fdw. First install the extension:

CREATE EXTENSION postgres_fdw;

Then create a foreign server using CREATE SERVER. In this example we wish to connect to a PostgreSQL server on host 192.83.123.89 listening on port 5432. The database to which the connection is made is named foreign_db on the remote server:

CREATE SERVER foreign_server
        FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host '192.83.123.89', port '5432', dbname 'foreign_db');

A user mapping, defined with CREATE USER MAPPING, is needed as well to identify the role that will be used on the remote server:

CREATE USER MAPPING FOR local_user
        SERVER foreign_server
        OPTIONS (user 'foreign_user', password 'password');

Now it is possible to create a foreign table with CREATE FOREIGN TABLE. In this example we wish to access the table named some_schema.some_table on the remote server. The local name for it will be foreign_table:

CREATE FOREIGN TABLE foreign_table (
        id integer NOT NULL,
        data text
)
        SERVER foreign_server
        OPTIONS (schema_name 'some_schema', table_name 'some_table');

It’s essential that the data types and other properties of the columns declared in CREATE FOREIGN TABLE match the actual remote table. Column names must match as well, unless you attach column_name options to the individual columns to show how they are named in the remote table. In many cases, use of IMPORT FOREIGN SCHEMA is preferable to constructing foreign table definitions manually.
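Both alternatives mentioned above can be sketched as follows, reusing foreign_server and some_schema.some_table from this example. The names foreign_table_renamed, payload, and imported are hypothetical and chosen only for illustration.

```sql
-- Use a local column name that differs from the remote one: the column_name
-- option tells postgres_fdw that "payload" is called "data" remotely.
CREATE FOREIGN TABLE foreign_table_renamed (
        id integer NOT NULL,
        payload text OPTIONS (column_name 'data')
)
        SERVER foreign_server
        OPTIONS (schema_name 'some_schema', table_name 'some_table');

-- Let PostgreSQL derive the definition from the remote table instead.
-- The local schema "imported" must exist before the import.
CREATE SCHEMA imported;

IMPORT FOREIGN SCHEMA some_schema
        LIMIT TO (some_table)
        FROM SERVER foreign_server
        INTO imported;
```

After the import, the table is available locally as imported.some_table, with column names and types taken from the remote definition.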