Postgresql 中文操作指南
27.4. Hot Standby #
热备用是用来描述连接到服务器和运行只读查询的能力的术语,当服务器处于归档恢复或待机模式时。这对复制目的和以极高的精度将备份恢复到所需状态都很有用。热备用术语也指当用户继续运行查询和/或让连接保持打开时,服务器从恢复转向正常操作的能力。
Hot standby is the term used to describe the ability to connect to the server and run read-only queries while the server is in archive recovery or standby mode. This is useful both for replication purposes and for restoring a backup to a desired state with great precision. The term hot standby also refers to the ability of the server to move from recovery through to normal operation while users continue running queries and/or keep their connections open.
在热备用模式下运行查询类似于正常查询操作,尽管有如下解释的几个使用和管理差异。
Running queries in hot standby mode is similar to normal query operation, though there are several usage and administrative differences explained below.
27.4.1. User’s Overview #
当在备用服务器上将 hot_standby 参数设置为 true 时,一旦恢复使系统恢复到一致状态,它将开始接受连接。所有此类连接都是严格的只读连接;甚至连临时表都不能写入。
When the hot_standby parameter is set to true on a standby server, it will begin accepting connections once the recovery has brought the system to a consistent state. All such connections are strictly read-only; not even temporary tables may be written.
主备服务器之间的数据传输需要一点时间,因此主备之间存在着可测量的延迟。因此,同时在主备服务器上运行相同的查询可能返回不同的结果。我们称备库上的数据与主库存在 eventually consistent 关系。一旦事务的提交记录在备库上重放,该事务所做的更改就会在备库上的任何新快照中可见。根据当前的事务隔离级别,快照可以在每个查询或每个事务开始时创建。详情请参阅 Section 13.2 。
The data on the standby takes some time to arrive from the primary server so there will be a measurable delay between primary and standby. Running the same query nearly simultaneously on both primary and standby might therefore return differing results. We say that data on the standby is eventually consistent with the primary. Once the commit record for a transaction is replayed on the standby, the changes made by that transaction will be visible to any new snapshots taken on the standby. Snapshots may be taken at the start of each query or at the start of each transaction, depending on the current transaction isolation level. For more details, see Section 13.2.
在热备用期间启动的事务可能会发出下列命令:
Transactions started during hot standby may issue the following commands:
在热备用期间启动的事务永远不会被分配一个事务 ID,且不能写入系统预写日志。因此,以下动作将会产生错误信息:
Transactions started during hot standby will never be assigned a transaction ID and cannot write to the system write-ahead log. Therefore, the following actions will produce error messages:
在普通操作中,允许“只读”事务使用 LISTEN 和 NOTIFY,因此热备用会话在比普通只读会话稍严格的限制下操作。有些限制可能会在未来的版本中放宽。
In normal operation, “read-only” transactions are allowed to use LISTEN and NOTIFY, so hot standby sessions operate under slightly tighter restrictions than ordinary read-only sessions. It is possible that some of these restrictions might be loosened in a future release.
在热备用期间,参数 transaction_read_only 始终为 true,且不能被更改。但只要没有尝试修改数据库,热备用期间的连接就会像任何其他数据库连接一样。如果发生故障转移或切换,数据库将会切换到正常处理模式。会话会保持连接,而服务器切换模式。一旦热备用完成,就可以启动可读写的交易(甚至是从热备用期间开始的会话中启动)。
During hot standby, the parameter transaction_read_only is always true and may not be changed. But as long as no attempt is made to modify the database, connections during hot standby will act much like any other database connection. If failover or switchover occurs, the database will switch to normal processing mode. Sessions will remain connected while the server changes mode. Once hot standby finishes, it will be possible to initiate read-write transactions (even from a session begun during hot standby).
用户可以通过发布 SHOW in_hot_standby 来确定热备用目前是否对自己的会话处于激活状态。(在早于 14 的服务器版本中, in_hot_standby 参数不存在;适用于较旧服务器的替代方法是 SHOW transaction_read_only。)此外,一组函数( Table 9.92 )允许用户访问有关备用服务器的信息。这些函数可用于编写能够感知数据库当前状态的程序。这些函数可用于监视恢复进度,或用于编写将数据库还原为特定状态的复杂程序。
Users can determine whether hot standby is currently active for their session by issuing SHOW in_hot_standby. (In server versions before 14, the in_hot_standby parameter did not exist; a workable substitute method for older servers is SHOW transaction_read_only.) In addition, a set of functions (Table 9.92) allow users to access information about the standby server. These allow you to write programs that are aware of the current state of the database. These can be used to monitor the progress of recovery, or to allow you to write complex programs that restore the database to particular states.
27.4.2. Handling Query Conflicts #
主备服务器在很多方面是松散连接的。主服务器上的操作将对备用服务器产生影响。结果,它们之间有可能出现负面交互或冲突。最容易理解的冲突是性能:如果在主服务器上进行巨大的数据加载,这将会在备用服务器上生成类似的 WAL 记录流,所以备用查询可能会争用系统资源,例如 I/O。
The primary and standby servers are in many ways loosely connected. Actions on the primary will have an effect on the standby. As a result, there is potential for negative interactions or conflicts between them. The easiest conflict to understand is performance: if a huge data load is taking place on the primary then this will generate a similar stream of WAL records on the standby, so standby queries may contend for system resources, such as I/O.
还有一些额外的冲突类型可以在热备用中发生。这些冲突 hard conflicts 在某种意义上,查询可能需要取消,并且在某些情况下,需要断开会话来解决它们。用户提供了几种处理这些冲突的方法。冲突案例包括:
There are also additional types of conflict that can occur with hot standby. These conflicts are hard conflicts in the sense that queries might need to be canceled and, in some cases, sessions disconnected to resolve them. The user is provided with several ways to handle these conflicts. Conflict cases include:
在主服务器上,这些案例仅仅导致等待;用户可能会选择取消任一冲突操作。然而,在备用服务器上没有选择:WAL 记录的操作已经在主服务器上发生了,所以备用服务器不能不应用它。此外,允许 WAL 应用无限期地等待可能是非常不希望的,因为备用服务器的状态将逐渐远落后于主服务器。因此,提供了一种机制来强制取消与要应用的 WAL 记录发生冲突的备用查询。
On the primary server, these cases simply result in waiting; and the user might choose to cancel either of the conflicting actions. However, on the standby there is no choice: the WAL-logged action already occurred on the primary so the standby must not fail to apply it. Furthermore, allowing WAL application to wait indefinitely may be very undesirable, because the standby’s state will become increasingly far behind the primary’s. Therefore, a mechanism is provided to forcibly cancel standby queries that conflict with to-be-applied WAL records.
问题情况的一个示例是主服务器上管理员运行 DROP TABLE,其操作当前正在备用服务器上对一个表进行查询。显然,如果 DROP TABLE 在备用服务器上得到应用,备用查询就不能继续。如果这种情况发生在主服务器上,DROP TABLE 会等到另一个查询完成。但是当 DROP TABLE 在主服务器上运行时,主服务器没有关于备用服务器上运行什么查询的信息,所以它不会等待任何这样的备用查询。WAL 更改记录在备用查询仍在运行时到达了备用服务器,导致了一场冲突。备用服务器必须要么延迟应用 WAL 记录(以及之后的任何事情),要么取消冲突查询,以便 DROP TABLE 能够应用。
An example of the problem situation is an administrator on the primary server running DROP TABLE on a table that is currently being queried on the standby server. Clearly the standby query cannot continue if the DROP TABLE is applied on the standby. If this situation occurred on the primary, the DROP TABLE would wait until the other query had finished. But when DROP TABLE is run on the primary, the primary doesn’t have information about what queries are running on the standby, so it will not wait for any such standby queries. The WAL change records come through to the standby while the standby query is still running, causing a conflict. The standby server must either delay application of the WAL records (and everything after them, too) or else cancel the conflicting query so that the DROP TABLE can be applied.
当存在冲突查询时,通常需要允许它通过延迟 WAL 应用稍微完成一点;但是通常不希望 WAL 应用延迟很长时间。因此,取消机制具有参数 max_standby_archive_delay 和 max_standby_streaming_delay,它们定义 WAL 应用中允许的最大延迟。一旦耗时超过 WAL 数据应用相关延迟设置,将取消冲突查询。有两个参数,因此可以分别针对从存档中读取 WAL 数据(例如,从基本备份初始恢复或“赶上”落后很多的备用服务器)和通过流复制读取 WAL 数据指定不同的延迟值。
When a conflicting query is short, it’s typically desirable to allow it to complete by delaying WAL application for a little bit; but a long delay in WAL application is usually not desirable. So the cancel mechanism has parameters, max_standby_archive_delay and max_standby_streaming_delay, that define the maximum allowed delay in WAL application. Conflicting queries will be canceled once it has taken longer than the relevant delay setting to apply any newly-received WAL data. There are two parameters so that different delay values can be specified for the case of reading WAL data from an archive (i.e., initial recovery from a base backup or “catching up” a standby server that has fallen far behind) versus reading WAL data via streaming replication.
在主要用于高可用性的备用服务器中,最好相对减少延迟参数,以便服务器由于备用查询导致的延迟而不会落后于主服务器太多。但是,如果备用服务器用于执行需要很长时间的查询,则可能更喜欢高延时甚至无限延时值。但是,请记住,如果备用查询因延迟应用 WAL 记录而导致备用服务器上的其他会话看不到主服务器上的最新更改,则需要很长时间的查询可能会导致这些会话失败。
In a standby server that exists primarily for high availability, it’s best to set the delay parameters relatively short, so that the server cannot fall far behind the primary due to delays caused by standby queries. However, if the standby server is meant for executing long-running queries, then a high or even infinite delay value may be preferable. Keep in mind however that a long-running query could cause other sessions on the standby server to not see recent changes on the primary, if it delays application of WAL records.
在超过 max_standby_archive_delay 或 max_standby_streaming_delay 指定的延迟后,将取消冲突的查询。尽管在重放 DROP DATABASE 的情况下,整个冲突会话将被终止,但通常只会导致取消错误。此外,如果冲突是因闲置事务持有的锁而发生的,则将终止冲突会话(此行为可能会在将来更改)。
Once the delay specified by max_standby_archive_delay or max_standby_streaming_delay has been exceeded, conflicting queries will be canceled. This usually results just in a cancellation error, although in the case of replaying a DROP DATABASE the entire conflicting session will be terminated. Also, if the conflict is over a lock held by an idle transaction, the conflicting session is terminated (this behavior might change in the future).
取消的查询可以立即重试(当然,在开始新事务后)。由于取消查询取决于正在重放的 WAL 记录的性质,所以如果再次执行已取消的查询,该查询很可能成功。
Canceled queries may be retried immediately (after beginning a new transaction, of course). Since query cancellation depends on the nature of the WAL records being replayed, a query that was canceled may well succeed if it is executed again.
请记住,延迟参数与备用服务器接收 WAL 数据以来经过的时间进行比较。因此,备用服务器上任何一个查询允许的宽限期永远不会超过延迟参数,如果备用服务器由于等待先前查询完成或由于无法跟上繁重更新负载而已经落后,则宽限期可能会短得多。
Keep in mind that the delay parameters are compared to the elapsed time since the WAL data was received by the standby server. Thus, the grace period allowed to any one query on the standby is never more than the delay parameter, and could be considerably less if the standby has already fallen behind as a result of waiting for previous queries to complete, or as a result of being unable to keep up with a heavy update load.
备用查询和 WAL 重放之间发生冲突的最常见原因是“早期清理”。通常,当根据 MVCC 规则没有需要看到旧行版本的交易来确保数据的正确可见性时,PostgreSQL 允许清理旧行版本。但是,此规则只能应用于在主服务器上执行的交易。因此,主服务器上的清理可能会删除备用服务器上的交易仍可见的行版本。
The most common reason for conflict between standby queries and WAL replay is “early cleanup”. Normally, PostgreSQL allows cleanup of old row versions when there are no transactions that need to see them to ensure correct visibility of data according to MVCC rules. However, this rule can only be applied for transactions executing on the primary. So it is possible that cleanup on the primary will remove row versions that are still visible to a transaction on the standby.
行版本清理并不是与备用查询发生冲突的唯一潜在原因。所有仅索引扫描(包括在备用服务器上运行的扫描)都必须使用与可见性图“一致”的 MVCC 快照。因此,只要 not 包含一个或多个对所有备用查询都可见的行,就会导致发生冲突。因此,即使对没有 wymagających czyszczenia 更新或已删除行的数据表运行 VACUUM,也会导致产生冲突。
Row version cleanup isn’t the only potential cause of conflicts with standby queries. All index-only scans (including those that run on standbys) must use an MVCC snapshot that “agrees” with the visibility map. Conflicts are therefore required whenever VACUUM sets a page as all-visible in the visibility map containing one or more rows not visible to all standby queries. So even running VACUUM against a table with no updated or deleted rows requiring cleanup might lead to conflicts.
用户应该明确,在主服务器上定期且频繁更新的数据表将很快导致备用服务器上需要较长时间的查询被取消。在这种情况下,可以将 max_standby_archive_delay 或 max_standby_streaming_delay 设置为有限值,这与设置 statement_timeout 类似。
Users should be clear that tables that are regularly and heavily updated on the primary server will quickly cause cancellation of longer running queries on the standby. In such cases the setting of a finite value for max_standby_archive_delay or max_standby_streaming_delay can be considered similar to setting statement_timeout.
如果发现备用查询取消的数量不可接受,则存在补救可能性。第一个选项是设置参数 hot_standby_feedback,该参数可以防止 VACUUM 删除最近消失的行,因此不会发生清理冲突。如果您执行此操作,则应注意,这将延迟主服务器上死行的清理,这可能会导致不希望的表膨胀。但是,清理情况不会比备用查询直接在主服务器上运行更糟,并且您仍然可以从将执行卸载到备用服务器中受益。如果备用服务器频繁连接和断开连接,则您可能需要进行调整以处理没有提供 hot_standby_feedback 反馈的时间段。例如,考虑增加 max_standby_archive_delay,以便在断开连接期间不会因 WAL 存档文件中的冲突而迅速取消查询。您还应考虑增加 max_standby_streaming_delay,以免在重新连接后因新到达的流式 WAL 条目而迅速取消查询。
Remedial possibilities exist if the number of standby-query cancellations is found to be unacceptable. The first option is to set the parameter hot_standby_feedback, which prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur. If you do this, you should note that this will delay cleanup of dead rows on the primary, which may result in undesirable table bloat. However, the cleanup situation will be no worse than if the standby queries were running directly on the primary server, and you are still getting the benefit of off-loading execution onto the standby. If standby servers connect and disconnect frequently, you might want to make adjustments to handle the period when hot_standby_feedback feedback is not being provided. For example, consider increasing max_standby_archive_delay so that queries are not rapidly canceled by conflicts in WAL archive files during disconnected periods. You should also consider increasing max_standby_streaming_delay to avoid rapid cancellations by newly-arrived streaming WAL entries after reconnection.
可以使用备用服务器上的 pg_stat_database_conflicts 系统视图查看查询取消的数量及其原因。pg_stat_database 系统视图也包含汇总信息。
The number of query cancels and the reason for them can be viewed using the pg_stat_database_conflicts system view on the standby server. The pg_stat_database system view also contains summary information.
用户可以控制当 WAL 重放由于冲突而等待时间超过 deadlock_timeout 时是否输出日志消息。这是由 log_recovery_conflict_waits 参数控制的。
Users can control whether a log message is produced when WAL replay is waiting longer than deadlock_timeout for conflicts. This is controlled by the log_recovery_conflict_waits parameter.
27.4.3. Administrator’s Overview #
如果 hot_standby 在 postgresql.conf 中是 on (默认值)并且存在 standby.signal 文件,服务器将在热备用模式下运行。但是,可能需要一些时间来允许热备用连接,因为服务器在完成足够的恢复以提供稳定状态后才会接受连接,且该稳定状态中的查询可以运行。在此期间,尝试连接的客户端会因错误消息而被拒绝。要确认服务器已启动,可以循环尝试从应用程序连接,或在服务器日志中查找以下消息:
If hot_standby is on in postgresql.conf (the default value) and there is a standby.signal file present, the server will run in hot standby mode. However, it may take some time for hot standby connections to be allowed, because the server will not accept connections until it has completed sufficient recovery to provide a consistent state against which queries can run. During this period, clients that attempt to connect will be refused with an error message. To confirm the server has come up, either loop trying to connect from the application, or look for these messages in the server logs:
LOG: entering standby mode
... then some time later ...
LOG: consistent recovery state reached
LOG: database system is ready to accept read-only connections
一致性信息在主服务器上每个检查点记录一次。当在主服务器上阅读在 wal_level 未设置为 replica 或 logical 期间写入的 WAL 时,无法启用热备用。在以下两种情况下,也可能会延迟达到一致状态:
Consistency information is recorded once per checkpoint on the primary. It is not possible to enable hot standby when reading WAL written during a period when wal_level was not set to replica or logical on the primary. Reaching a consistent state can also be delayed in the presence of both of these conditions:
如果您正在运行基于文件的日志传送(“暖备用”),则您可能需要等到下一个 WAL 文件到达,这可能与主服务器中的 archive_timeout 设置一样长。
If you are running file-based log shipping ("warm standby"), you might need to wait until the next WAL file arrives, which could be as long as the archive_timeout setting on the primary.
某些参数的设置决定了用于跟踪事务 ID、锁和已准备事务的共享内存的大小。这些共享内存结构在备用服务器中的大小不得小于主服务器中的大小,以确保备用服务器在恢复期间不耗尽共享内存。例如,如果主服务器曾经使用已准备的事务,但备用服务器没有为跟踪已准备的事务分配任何共享内存,则在更改备用服务器的配置之前,恢复将无法继续。受影响的参数有:
The settings of some parameters determine the size of shared memory for tracking transaction IDs, locks, and prepared transactions. These shared memory structures must be no smaller on a standby than on the primary in order to ensure that the standby does not run out of shared memory during recovery. For example, if the primary had used a prepared transaction but the standby had not allocated any shared memory for tracking prepared transactions, then recovery could not continue until the standby’s configuration is changed. The parameters affected are:
确保这不会成为问题的最简单方法是将这些参数在备用服务器上设置为等于或大于主服务器上的值。因此,如果您要增加这些值,则应先在所有备用服务器上执行此操作,然后再将更改应用于主服务器。相反,如果您要减小这些值,则应先在主服务器上执行此操作,然后再将更改应用于所有备用服务器。请记住,当备用服务器被升级时,它将成为其后续备用服务器所需的此参数设置的新参考。因此,建议在切换或故障转移期间避免其成为问题,即保持所有备用服务器上的这些设置相同。
The easiest way to ensure this does not become a problem is to have these parameters set on the standbys to values equal to or greater than on the primary. Therefore, if you want to increase these values, you should do so on all standby servers first, before applying the changes to the primary server. Conversely, if you want to decrease these values, you should do so on the primary server first, before applying the changes to all standby servers. Keep in mind that when a standby is promoted, it becomes the new reference for the required parameter settings for the standbys that follow it. Therefore, to avoid this becoming a problem during a switchover or failover, it is recommended to keep these settings the same on all standby servers.
WAL 跟踪对主服务器上这些参数的更改。如果热备用处理 WAL,指示主服务器上的当前值高于其自身值,它将记录一条警告并暂停恢复,例如:
The WAL tracks changes to these parameters on the primary. If a hot standby processes WAL that indicates that the current value on the primary is higher than its own value, it will log a warning and pause recovery, for example:
WARNING: hot standby is not possible because of insufficient parameter settings
DETAIL: max_connections = 80 is a lower setting than on the primary server, where its value was 100.
LOG: recovery has paused
DETAIL: If recovery is unpaused, the server will shut down.
HINT: You can then restart the server after making the necessary configuration changes.
此时,需要更新备用服务器上的设置并在恢复继续之前重新启动实例。如果备用服务器不是热备用服务器,那么当它遇到不兼容的参数更改时,它将立即关闭,而不会暂停,因为没有价值来保持它继续运行。
At that point, the settings on the standby need to be updated and the instance restarted before recovery can continue. If the standby is not a hot standby, then when it encounters the incompatible parameter change, it will shut down immediately without pausing, since there is then no value in keeping it up.
非常重要的是让管理员为 max_standby_archive_delay 和 max_standby_streaming_delay 选择合适的设置。最佳选择取决于业务优先级。例如,如果服务器主要充当高可用性服务器,那么你将希望延迟设置比较低(甚至是 0),尽管这是一个非常激进的设置。如果备用服务器充当决策支持查询的备用服务器,那么将最大延迟值设置为很多小时甚至 -1(即永远等待查询完成)可能是可以接受的。
It is important that the administrator select appropriate settings for max_standby_archive_delay and max_standby_streaming_delay. The best choices vary depending on business priorities. For example if the server is primarily tasked as a High Availability server, then you will want low delay settings, perhaps even zero, though that is a very aggressive setting. If the standby server is tasked as an additional server for decision support queries then it might be acceptable to set the maximum delay values to many hours, or even -1 which means wait forever for queries to complete.
在主服务器上写入的交易状态“提示位”不会记录在 WAL 中,因此备用服务器上的数据可能在备用服务器上再次重写提示。因此,即使所有用户都是只读的,备用服务器仍将执行磁盘写入;数据值本身不会发生任何变化。用户仍然会写入大型排序临时文件并重新生成 relcache 信息文件,因此在热备份模式下,数据库的任何部分都不是真正的只读。另请注意,即使交易在本地是只读的,仍可以使用 dblink 模块对远程数据库进行写入,以及使用 PL 函数进行数据库外部的其他操作。
Transaction status "hint bits" written on the primary are not WAL-logged, so data on the standby will likely re-write the hints again on the standby. Thus, the standby server will still perform disk writes even though all users are read-only; no changes occur to the data values themselves. Users will still write large sort temporary files and re-generate relcache info files, so no part of the database is truly read-only during hot standby mode. Note also that writes to remote databases using dblink module, and other operations outside the database using PL functions will still be possible, even though the transaction is read-only locally.
在恢复模式中不接受以下类型的管理命令:
The following types of administration commands are not accepted during recovery mode:
请注意,其中一些命令实际上可以在主数据库上“只读”模式事务期间执行。
Again, note that some of these commands are actually allowed during "read only" mode transactions on the primary.
因此,你无法创建仅存在于备用数据库上的附加索引,也无法创建仅存在于备用数据库上的统计信息。如果需要这些管理命令,应在主数据库上执行它们,这些更改最终会传播到备用数据库。
As a result, you cannot create additional indexes that exist solely on the standby, nor statistics that exist solely on the standby. If these administration commands are needed, they should be executed on the primary, and eventually those changes will propagate to the standby.
pg_cancel_backend() 和 pg_terminate_backend()_适用于用户后端,但不适用于执行恢复的启动进程。_pg_stat_activity 不会显示正在恢复的事务是活动的事务。因此,pg_prepared_xacts 在恢复期间始终为空。如果你希望在有疑问时解析已准备好的事务,请在主数据库上查看 pg_prepared_xacts 并发出命令解析事务或在恢复结束之后解析事务。
pg_cancel_backend() and pg_terminate_backend() will work on user backends, but not the startup process, which performs recovery. pg_stat_activity does not show recovering transactions as active. As a result, pg_prepared_xacts is always empty during recovery. If you wish to resolve in-doubt prepared transactions, view pg_prepared_xacts on the primary and issue commands to resolve transactions there or resolve them after the end of recovery.
pg_locks_会正常显示后端持有的锁。_pg_locks 还会显示由启动进程管理的虚拟事务,该事务拥有由恢复重放的事务持有的所有 AccessExclusiveLocks。请注意,启动进程不会获取锁来进行数据库更改,因此,对于启动进程,其他类型的锁(除了 AccessExclusiveLocks)不会显示在 _pg_locks_中;它们只是被假定存在。
pg_locks will show locks held by backends, as normal. pg_locks also shows a virtual transaction managed by the startup process that owns all AccessExclusiveLocks held by transactions being replayed by recovery. Note that the startup process does not acquire locks to make database changes, and thus locks other than AccessExclusiveLocks do not show in pg_locks for the Startup process; they are just presumed to exist.
Nagios 插件 check_pgsql 将正常工作,因为它检查的简单信息存在。check_postgres 监控脚本也将正常工作,尽管一些报告值可能会给出现差异或令人困惑的结果。例如,不会保留上次 vacuum 时间,因为在备用数据库上不会发生 vacuum。在主数据库上运行的 vacuum 仍然会将它们的更改发送至备用数据库。
The Nagios plugin check_pgsql will work, because the simple information it checks for exists. The check_postgres monitoring script will also work, though some reported values could give different or confusing results. For example, last vacuum time will not be maintained, since no vacuum occurs on the standby. Vacuums running on the primary do still send their changes to the standby.
WAL 文件控制命令在恢复期间不会工作,例如 pg_backup_start、pg_switch_wal 等。
WAL file control commands will not work during recovery, e.g., pg_backup_start, pg_switch_wal etc.
动态加载模块正常工作,包括 pg_stat_statements。
Dynamically loadable modules work, including pg_stat_statements.
咨询锁在恢复期间正常工作,包括死锁检测。请注意,咨询锁决不会经 WAL 记录,因此主数据库或备用数据库上的咨询锁与 WAL 重放发生冲突是不可能的。同样也不可能在主数据库上获取咨询锁并在备用数据库上启动类似的咨询锁。咨询锁仅与获取它们的服务器相关。
Advisory locks work normally in recovery, including deadlock detection. Note that advisory locks are never WAL logged, so it is impossible for an advisory lock on either the primary or the standby to conflict with WAL replay. Nor is it possible to acquire an advisory lock on the primary and have it initiate a similar advisory lock on the standby. Advisory locks relate only to the server on which they are acquired.
基于触发器的复制系统(例如 Slony、Londiste 和 Bucardo)根本不会在备用数据库上运行,尽管它们可以在主服务器上顺利运行,只要未将更改发送至备用服务器进行应用即可。WAL 重放不是基于触发器的,因此你无法从备用数据库传递至任何需要附加数据库写入或依赖使用触发器的系统。
Trigger-based replication systems such as Slony, Londiste and Bucardo won’t run on the standby at all, though they will run happily on the primary server as long as the changes are not sent to standby servers to be applied. WAL replay is not trigger-based so you cannot relay from the standby to any system that requires additional database writes or relies on the use of triggers.
无法分配新 OID,尽管只要几个 UUID 生成器不依赖于向数据库写入新状态,它们仍然可能正常工作。
New OIDs cannot be assigned, though some UUID generators may still work as long as they do not rely on writing new status to the database.
目前,在只读事务期间不允许创建临时表,因此,在某些情况下,现有脚本将无法正确运行。此限制可能在以后的版本中放宽。这既是 SQL 标准合规性问题,也是技术问题。
Currently, temporary table creation is not allowed during read-only transactions, so in some cases existing scripts will not run correctly. This restriction might be relaxed in a later release. This is both an SQL standard compliance issue and a technical issue.
DROP TABLESPACE 仅当表空间为空时才能成功。一些备用用户可能通过其 temp_tablespaces 参数积极使用表空间。如果表空间中有临时文件,所有活动查询将被取消以确保删除临时文件,以便可以删除表空间并继续 WAL 重放。
DROP TABLESPACE can only succeed if the tablespace is empty. Some standby users may be actively using the tablespace via their temp_tablespaces parameter. If there are temporary files in the tablespace, all active queries are canceled to ensure that temporary files are removed, so the tablespace can be removed and WAL replay can continue.
在主数据库上运行 DROP DATABASE 或 ALTER DATABASE … SET TABLESPACE 将生成一个 WAL 条目,这将导致连接到备用数据库上该数据库的所有用户被强行断开连接。无论 max_standby_streaming_delay 的设置如何,此操作都会立即发生。请注意,ALTER DATABASE … RENAME 不会断开用户的连接,这在大多数情况下都不会被注意到,尽管在某些情况下,如果它以某种方式依赖于数据库名称,它可能会导致程序混乱。
Running DROP DATABASE or ALTER DATABASE … SET TABLESPACE on the primary will generate a WAL entry that will cause all users connected to that database on the standby to be forcibly disconnected. This action occurs immediately, whatever the setting of max_standby_streaming_delay. Note that ALTER DATABASE … RENAME does not disconnect users, which in most cases will go unnoticed, though might in some cases cause a program confusion if it depends in some way upon database name.
在正常(非恢复)模式下,如果你为具有登录功能的角色发出 DROP USER 或 DROP ROLE,而该用户仍在连接时,则不会对已连接的用户产生任何影响——他们仍保持连接。然而,用户无法重新连接。此行为也适用于恢复状态,因此主数据库上的 DROP USER 不会断开备用数据库上的该用户。
In normal (non-recovery) mode, if you issue DROP USER or DROP ROLE for a role with login capability while that user is still connected then nothing happens to the connected user — they remain connected. The user cannot reconnect however. This behavior applies in recovery also, so a DROP USER on the primary does not disconnect that user on the standby.
累积统计系统在恢复期间处于活动状态。所有扫描、读取、块、索引使用情况等将在备用数据库上正常记录。然而,WAL 重放不会增加关系和数据库特定计数器。即重放不会增量 pg_stat_all_tables 列(例如 n_tup_ins),启动进程执行的读取或写入也不会在 pg_statio 视图中被跟踪,也不会增量相关的 pg_stat_database 列。
The cumulative statistics system is active during recovery. All scans, reads, blocks, index usage, etc., will be recorded normally on the standby. However, WAL replay will not increment relation and database specific counters. I.e. replay will not increment pg_stat_all_tables columns (like n_tup_ins), nor will reads or writes performed by the startup process be tracked in the pg_statio views, nor will associated pg_stat_database columns be incremented.
自动 vacuum 在恢复期间不处于活动状态。它将在恢复结束时正常启动。
Autovacuum is not active during recovery. It will start normally at the end of recovery.
检查点进程和后台写入进程在恢复期间处于活动状态。检查点进程将执行重启点(类似于主数据库上的检查点),后台写入进程将执行正常的块清理活动。这可能包括更新存储在备用服务器上的提示位信息的活动。虽然 CHECKPOINT 命令在恢复期间会被接受,但它会执行重启点而不是新的检查点。
The checkpointer process and the background writer process are active during recovery. The checkpointer process will perform restartpoints (similar to checkpoints on the primary) and the background writer process will perform normal block cleaning activities. This can include updates of the hint bit information stored on the standby server. The CHECKPOINT command is accepted during recovery, though it performs a restartpoint rather than a new checkpoint.
27.4.4. Hot Standby Parameter Reference #
已经在 Section 27.4.2 和 Section 27.4.3 中提到了各种参数。
Various parameters have been mentioned above in Section 27.4.2 and Section 27.4.3.
在主服务器上,可以使用 wal_level 参数。如果在主服务器上设置 max_standby_archive_delay 和 max_standby_streaming_delay,它们将不起作用。
On the primary, the wal_level parameter can be used. max_standby_archive_delay and max_standby_streaming_delay have no effect if set on the primary.
在备用服务器上,可以使用参数 hot_standby、 max_standby_archive_delay 和 max_standby_streaming_delay。
On the standby, parameters hot_standby, max_standby_archive_delay and max_standby_streaming_delay can be used.
27.4.5. Caveats #
热备有一些限制。可以并且可能在未来版本中修复这些限制:
There are several limitations of hot standby. These can and probably will be fixed in future releases: