PostgreSQL Chinese Operation Guide

30.4. Asynchronous Commit

Asynchronous commit is an option that allows transactions to complete more quickly, at the cost that the most recent transactions may be lost if the database crashes. In many applications this is an acceptable trade-off.

As described in the previous section, transaction commit is normally synchronous: the server waits for the transaction’s WAL records to be flushed to permanent storage before returning a success indication to the client. The client is therefore guaranteed that a transaction reported as committed will be preserved, even if the server crashes immediately afterwards. For short transactions, however, this wait is a major component of the total transaction time. Selecting asynchronous commit mode means that the server returns success as soon as the transaction is logically complete, before the WAL records it generated have actually reached disk. This can significantly increase throughput for small transactions.

Asynchronous commit introduces the risk of data loss. There is a short time window between the report of transaction completion to the client and the time at which the transaction is truly committed (that is, guaranteed not to be lost if the server crashes). Asynchronous commit should therefore not be used if the client will take external actions that rely on the transaction being remembered. For example, a bank would certainly not use asynchronous commit for a transaction recording an ATM’s dispensing of cash. But in many scenarios, such as event logging, there is no need for a guarantee this strong.

The risk taken by using asynchronous commit is data loss, not data corruption. If the database crashes, it will recover by replaying WAL up to the last record that was flushed. The database will therefore be restored to a self-consistent state, but any transactions that were not yet flushed to disk will not be reflected in that state. The net effect is the loss of the last few transactions. Because transactions are replayed in commit order, no inconsistency can be introduced. For example, if transaction B made changes relying on the effects of a previous transaction A, it is not possible for A’s effects to be lost while B’s effects are preserved.
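
The prefix property described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: the WAL is modeled as a Python list of per-transaction change sets in commit order, and `recover` and the account names are hypothetical helpers, not anything PostgreSQL exposes.

```python
def recover(wal, flushed_count):
    """Replay the first `flushed_count` WAL entries and return (survivors, state).

    wal: list of (transaction_name, changes) pairs in commit order.
    flushed_count: how many entries reached disk before the crash.
    """
    state = {}
    survivors = []
    for txn, changes in wal[:flushed_count]:
        state.update(changes)
        survivors.append(txn)
    return survivors, state


# Toy WAL: B committed after A and computed its change from A's result.
wal = [
    ("A", {"account_1": 100}),
    ("B", {"account_2": 50}),
    ("C", {"account_3": 7}),
]

# Whatever the crash point, the surviving transactions form a prefix of the
# commit order, so B is never kept while A is lost.
for flushed_count in range(len(wal) + 1):
    survivors, state = recover(wal, flushed_count)
    assert not ("B" in survivors and "A" not in survivors)
    print(flushed_count, survivors, state)
```

In this model a crash can only shorten the list of surviving transactions from the end, which is the sense in which the loss is limited to the last few transactions.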

The user can select the commit mode of each transaction, so synchronous and asynchronous commit transactions can run concurrently. This allows flexible trade-offs between performance and certainty of transaction durability. The commit mode is controlled by the user-settable parameter synchronous_commit, which can be changed in any of the ways that a configuration parameter can be set. The mode used for any one transaction depends on the value of synchronous_commit when transaction commit begins.
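
As a minimal sketch of per-transaction selection, the Python helper below only builds the SQL text for a transaction and does not connect to a server. It assumes the standard `SET LOCAL` command is used to change synchronous_commit for a single transaction; the function name `transaction_statements` and the tables `event_log` and `atm_withdrawals` are hypothetical.

```python
# Settings accepted by synchronous_commit; "off" selects asynchronous commit.
VALID_MODES = {"on", "off", "local", "remote_write", "remote_apply"}


def transaction_statements(body, commit_mode=None):
    """Return the SQL statements for one transaction.

    body: list of SQL statements forming the transaction's work.
    commit_mode: optional synchronous_commit value for this transaction only;
        None leaves the session or server setting in effect.
    """
    if commit_mode is not None and commit_mode not in VALID_MODES:
        raise ValueError(f"unknown synchronous_commit value: {commit_mode!r}")
    statements = ["BEGIN"]
    if commit_mode is not None:
        # SET LOCAL lasts only until the end of the current transaction, so the
        # value is in place when COMMIT begins and is reverted afterwards.
        statements.append(f"SET LOCAL synchronous_commit TO {commit_mode}")
    statements.extend(body)
    statements.append("COMMIT")
    return statements


# Hypothetical tables: event_log tolerates loss, atm_withdrawals does not.
log_txn = transaction_statements(
    ["INSERT INTO event_log (message) VALUES ('user logged in')"],
    commit_mode="off",
)
cash_txn = transaction_statements(
    ["INSERT INTO atm_withdrawals (account_id, amount) VALUES (42, 100)"],
    commit_mode="on",
)

for statement in log_txn:
    print(statement + ";")
```

Setting the mode inside the transaction keeps the choice next to the work it applies to, so an event-logging transaction and a cash-dispensing transaction can share a session without affecting each other's durability.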

Certain utility commands, for instance DROP TABLE, are forced to commit synchronously regardless of the setting of synchronous_commit. This ensures consistency between the server’s file system and the logical state of the database. The commands supporting two-phase commit, such as PREPARE TRANSACTION, are also always synchronous.

If the database crashes during the risk window between an asynchronous commit and the writing of the transaction’s WAL records, the changes made during that transaction will be lost. The duration of the risk window is limited because a background process (the “WAL writer”) flushes unwritten WAL records to disk every wal_writer_delay milliseconds. The actual maximum duration of the risk window is three times wal_writer_delay, because the WAL writer is designed to favor writing whole pages at a time during busy periods.
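
The bound stated above is simple arithmetic, shown here as a short sketch. The function name `max_risk_window_ms` is hypothetical, and the 200 ms figure is PostgreSQL's documented default for wal_writer_delay rather than something stated in this section.

```python
def max_risk_window_ms(wal_writer_delay_ms):
    """Upper bound on the risk window: three times wal_writer_delay."""
    if wal_writer_delay_ms <= 0:
        raise ValueError("wal_writer_delay must be positive")
    return 3 * wal_writer_delay_ms


DEFAULT_WAL_WRITER_DELAY_MS = 200  # documented default, assumed here

for delay_ms in (10, DEFAULT_WAL_WRITER_DELAY_MS, 1000):
    print(f"wal_writer_delay={delay_ms} ms -> at most {max_risk_window_ms(delay_ms)} ms at risk")
```

With the assumed default, an asynchronously committed transaction could go unflushed for up to about 600 ms after success is reported.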

Caution

An immediate-mode shutdown is equivalent to a server crash, and will therefore cause loss of any unflushed asynchronous commits.

Asynchronous commit provides behavior different from setting fsync = off. fsync is a server-wide setting that alters the behavior of all transactions. It disables all logic within PostgreSQL that attempts to synchronize writes to different portions of the database, so a system crash (that is, a hardware or operating system crash, not a failure of PostgreSQL itself) could result in arbitrarily bad corruption of the database state. In many scenarios, asynchronous commit provides most of the performance improvement that could be obtained by turning off fsync, but without the risk of data corruption.

commit_delay also sounds very similar to asynchronous commit, but it is actually a synchronous commit method (in fact, commit_delay is ignored during an asynchronous commit). commit_delay causes a delay just before a transaction flushes WAL to disk, in the hope that a single flush executed by one such transaction can also serve other transactions committing at about the same time. The setting can be thought of as a way of widening the time window in which transactions can join a group about to participate in a single flush, so that the cost of the flush is amortized across multiple transactions.
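
The amortization idea can be sketched with a toy model that counts flushes for a given set of commit arrival times. This is an illustration under strong simplifying assumptions, not PostgreSQL's implementation: flushes are treated as instantaneous, the related commit_siblings threshold is ignored, and `count_flushes` and the arrival times are hypothetical. Times are in microseconds, the unit commit_delay uses.

```python
def count_flushes(commit_times_us, commit_delay_us):
    """Count WAL flushes in a toy group-commit model.

    commit_times_us: times at which transactions become ready to flush.
    commit_delay_us: how long the first transaction of a group waits
        before flushing on behalf of everyone who arrived in the meantime.
    """
    flushes = 0
    group_deadline = None
    for t in sorted(commit_times_us):
        if group_deadline is None or t > group_deadline:
            # This transaction starts a new group; its flush happens at t + delay.
            flushes += 1
            group_deadline = t + commit_delay_us
        # Otherwise the transaction arrived before the pending flush and shares it.
    return flushes


arrivals_us = [0, 30, 60, 500, 520, 2000]
print(count_flushes(arrivals_us, 0))    # 6: every transaction flushes on its own
print(count_flushes(arrivals_us, 100))  # 3: nearby commits share a flush
```

Each transaction in the model still waits for a flush before it counts as committed, which is why this remains a synchronous commit method: the delay trades a little latency for fewer flushes, without opening a window for data loss.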