PostgreSQL Chinese Operation Guide

49.9. Streaming of Large Transactions for Logical Decoding

The basic output plugin callbacks (e.g., begin_cb, change_cb, commit_cb and message_cb) are only invoked when the transaction actually commits. The changes are still decoded from the transaction log, but are only passed to the output plugin at commit (and discarded if the transaction aborts).

This means that while the decoding happens incrementally, and may spill to disk to keep memory usage under control, all the decoded changes have to be transmitted when the transaction finally commits (or more precisely, when the commit is decoded from the transaction log). Depending on the size of the transaction and network bandwidth, the transfer time may significantly increase the apply lag.

To reduce the apply lag caused by large transactions, an output plugin may provide additional callbacks to support incremental streaming of in-progress transactions. There are multiple required streaming callbacks (stream_start_cb, stream_stop_cb, stream_abort_cb, stream_commit_cb and stream_change_cb) and two optional callbacks (stream_message_cb and stream_truncate_cb). Also, if streaming of two-phase commands is to be supported, then additional callbacks must be provided. (See Section 49.10 for details.)
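The required/optional split can be sketched with a simplified Python model. (The real interface is a C struct of function pointers; only the callback names below come from the documentation, while streaming_supported and the dict-based plugin are illustrative.)

```python
# Simplified model of the streaming callback set; the real interface is a
# C struct of function pointers, not a Python dict.
REQUIRED = {"stream_start_cb", "stream_stop_cb", "stream_abort_cb",
            "stream_commit_cb", "stream_change_cb"}
OPTIONAL = {"stream_message_cb", "stream_truncate_cb"}

def streaming_supported(callbacks):
    """Streaming is usable only if every required callback is provided."""
    provided = {name for name, fn in callbacks.items() if fn is not None}
    return REQUIRED <= provided

# A plugin that sets all required callbacks but neither optional one
# still supports streaming:
plugin = {name: (lambda *args: None) for name in REQUIRED}
print(streaming_supported(plugin))  # True
```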

When streaming an in-progress transaction, the changes (and messages) are streamed in blocks demarcated by stream_start_cb and stream_stop_cb callbacks. Once all the decoded changes are transmitted, the transaction can be committed using the stream_commit_cb callback (or possibly aborted using the stream_abort_cb callback). If two-phase commits are supported, the transaction can be prepared using the stream_prepare_cb callback, committed in response to COMMIT PREPARED using the commit_prepared_cb callback, or aborted using the rollback_prepared_cb callback.

One example sequence of streaming callback calls for one transaction may look like this:

stream_start_cb(...);   <-- start of first block of changes
  stream_change_cb(...);
  stream_change_cb(...);
  stream_message_cb(...);
  stream_change_cb(...);
  ...
  stream_change_cb(...);
stream_stop_cb(...);    <-- end of first block of changes

stream_start_cb(...);   <-- start of second block of changes
  stream_change_cb(...);
  stream_change_cb(...);
  stream_change_cb(...);
  ...
  stream_message_cb(...);
  stream_change_cb(...);
stream_stop_cb(...);    <-- end of second block of changes


[a. when using normal commit]
stream_commit_cb(...);    <-- commit of the streamed transaction

[b. when using two-phase commit]
stream_prepare_cb(...);   <-- prepare the streamed transaction
commit_prepared_cb(...);  <-- commit of the prepared transaction

The actual sequence of callback calls may be more complicated, of course. There may be blocks for multiple streamed transactions, some of the transactions may get aborted, etc.
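To make such interleaving concrete, the following toy simulator (hypothetical; the server's actual scheduling is internal) emits the callback sequence for two concurrently streamed transactions, one of which aborts:

```python
def stream_block(txn_id, changes, log):
    # Each block of changes is demarcated by stream_start_cb/stream_stop_cb.
    log.append(("stream_start_cb", txn_id))
    for change in changes:
        log.append(("stream_change_cb", txn_id, change))
    log.append(("stream_stop_cb", txn_id))

log = []
stream_block(1, ["INSERT", "UPDATE"], log)  # first block of txn 1
stream_block(2, ["INSERT"], log)            # a block of txn 2 interleaves
stream_block(1, ["DELETE"], log)            # second block of txn 1
log.append(("stream_abort_cb", 2))          # txn 2 rolls back
log.append(("stream_commit_cb", 1))         # txn 1 commits
```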

Similar to spill-to-disk behavior, streaming is triggered when the total amount of changes decoded from the WAL (for all in-progress transactions) exceeds the limit defined by the logical_decoding_work_mem setting. At that point, the largest top-level transaction (measured by the amount of memory currently used for decoded changes) is selected and streamed. However, in some cases we still have to spill to disk even if streaming is enabled, because we exceed the memory threshold but still have not decoded the complete tuple, e.g., when only the TOAST table insert has been decoded but not the main table insert.
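The selection rule can be sketched as follows. This is a toy model: select_txn_to_stream, the transaction names, and the byte figures are illustrative, and the server's actual memory accounting is more involved.

```python
def select_txn_to_stream(txn_memory, limit):
    """Return the largest top-level transaction once the total memory used
    for decoded changes exceeds the limit, or None while under the limit.
    Models the logical_decoding_work_mem trigger described above."""
    if sum(txn_memory.values()) <= limit:
        return None
    return max(txn_memory, key=txn_memory.get)

# About 70MB of buffered changes across three in-progress transactions
# exceeds a 64MB limit, so the largest transaction is streamed:
mem = {"txn_a": 40_000_000, "txn_b": 25_000_000, "txn_c": 5_000_000}
print(select_txn_to_stream(mem, limit=64 * 1024 * 1024))  # txn_a
```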

Even when streaming large transactions, the changes are still applied in commit order, preserving the same guarantees as the non-streaming mode.
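A toy apply-side model illustrates why: if streamed changes are buffered per transaction and applied only at stream_commit_cb, the visible order follows commit order even though blocks of different transactions interleave. (This sketch is illustrative only, not the actual apply worker.)

```python
buffers = {}   # in-flight streamed changes, keyed by transaction id
applied = []   # changes made visible, in apply order

def on_stream_change(txn_id, change):
    # Streamed changes are buffered, not yet visible.
    buffers.setdefault(txn_id, []).append(change)

def on_stream_commit(txn_id):
    # All buffered changes of this transaction are applied at commit time.
    applied.extend(buffers.pop(txn_id, []))

on_stream_change(1, "a1")
on_stream_change(2, "b1")   # txn 2's block interleaves with txn 1's
on_stream_change(1, "a2")
on_stream_commit(2)         # txn 2 commits first, so "b1" is applied first
on_stream_commit(1)
print(applied)  # ['b1', 'a1', 'a2']
```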