Postgresql 中文操作指南
49.2. Logical Decoding Concepts #
49.2.1. Logical Decoding #
逻辑解码的过程是从数据库表的持久性更改中提取所有更改内容,将其转换成易于理解的一致的格式,而无需详细了解数据库的内部状态。
Logical decoding is the process of extracting all persistent changes to a database’s tables into a coherent, easy to understand format which can be interpreted without detailed knowledge of the database’s internal state.
在 PostgreSQL 中,通过将内容从 write-ahead log解码为特定于应用程序的形式(如元组或 SQL 语句流)来实现逻辑解码,后者描述存储级别的更改。
In PostgreSQL, logical decoding is implemented by decoding the contents of the write-ahead log, which describe changes on a storage level, into an application-specific form such as a stream of tuples or SQL statements.
49.2.2. Replication Slots #
在逻辑复制的上下文中,槽代表一系列更改,可以按照其在原始服务器上进行的顺序重放给客户端。每个槽将从单个数据库串流一系列更改。
In the context of logical replication, a slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server. Each slot streams a sequence of changes from a single database.
Note
PostgreSQL 也具有流复制槽(请参见 Section 27.2.5),但它们在那里以某种不同的方式使用。
PostgreSQL also has streaming replication slots (see Section 27.2.5), but they are used somewhat differently there.
复制槽具有一个在PostgreSQL集群中所有数据库中唯一的标识符。槽独立于使用它们的连接,并且是防崩溃的。
A replication slot has an identifier that is unique across all databases in a PostgreSQL cluster. Slots persist independently of the connection using them and are crash-safe.
在正常操作中,逻辑槽只发送每个变更一次。每个槽的当前位置只在检查点时持久化,因此在发生崩溃时,该槽可能会返回到较早的 LSN,这会导致服务器重新启动时再次发送最近的变更。逻辑解码客户端负责避免因处理同一消息多次而产生的不良影响。当解码时,客户端可能会希望记录最后一次看到的 LSN,并跳过任何重复的数据,或者(在使用复制协议时)请求从该 LSN 开始解码,而不是让服务器确定起始点。复制进度跟踪功能是为此目的设计的,请参阅 replication origins。
A logical slot will emit each change just once in normal operation. The current position of each slot is persisted only at checkpoint, so in the case of a crash the slot may return to an earlier LSN, which will then cause recent changes to be sent again when the server restarts. Logical decoding clients are responsible for avoiding ill effects from handling the same message more than once. Clients may wish to record the last LSN they saw when decoding and skip over any repeated data or (when using the replication protocol) request that decoding start from that LSN rather than letting the server determine the start point. The Replication Progress Tracking feature is designed for this purpose, refer to replication origins.
对于单个数据库,可以存在多个独立的槽。每个槽都有自己的状态,允许不同的使用者从数据库更改流的不同点接收更改。对于大多数应用程序,每个使用者都需要一个单独的槽。
Multiple independent slots may exist for a single database. Each slot has its own state, allowing different consumers to receive changes from different points in the database change stream. For most applications, a separate slot will be required for each consumer.
逻辑复制槽不知道接收器状态。在不同的时间,甚至可能有多个不同的接收器使用同一个槽;他们只会从上一个接收器停止使用它们的时间开始接收更改。在任何给定时间,只允许一个接收器从槽使用更改。
A logical replication slot knows nothing about the state of the receiver(s). It’s even possible to have multiple different receivers using the same slot at different times; they’ll just get the changes following on from when the last receiver stopped consuming them. Only one receiver may consume changes from a slot at any given time.
逻辑复制槽也可以在热备用上创建。为了防止_VACUUM_从系统目录中删除所需的列,应该在备用上设置_hot_standby_feedback_。尽管如此,如果删除了任何需要的列,槽将失效。强烈建议在主系统和备用系统之间使用物理槽。否则,hot_standby_feedback_将起作用,但仅在连接处于活动状态时起作用(例如,节点重新启动将中断连接)。然后,主系统可能会删除备用系统上的逻辑解码可能需要的系统目录行(因为它不知道备用系统上的catalog_xmin)。如果主系统上的_wal_level_减小到小于_logical,备用系统上的现有逻辑槽也会失效。一旦备用系统在WAL流中检测到此类更改,将立即执行此操作。这意味着对于滞后的walsender(如果有的话),一些WAL记录(直到主系统上的_wal_level_参数发生更改)将不会被解码。
A logical replication slot can also be created on a hot standby. To prevent VACUUM from removing required rows from the system catalogs, hot_standby_feedback should be set on the standby. In spite of that, if any required rows get removed, the slot gets invalidated. It’s highly recommended to use a physical slot between the primary and the standby. Otherwise, hot_standby_feedback will work but only while the connection is alive (for example a node restart would break it). Then, the primary may delete system catalog rows that could be needed by the logical decoding on the standby (as it does not know about the catalog_xmin on the standby). Existing logical slots on standby also get invalidated if wal_level on the primary is reduced to less than logical. This is done as soon as the standby detects such a change in the WAL stream. It means that, for walsenders which are lagging (if any), some WAL records up to the wal_level parameter change on the primary won’t be decoded.
创建逻辑槽需要有关所有当前正在运行的事务的信息。在主系统上,可以直接获得此信息,但在备用系统上,必须从主系统获取此信息。因此,槽创建可能需要等待主系统上发生一些活动。如果主系统处于空闲状态,在备用系统上创建逻辑槽可能需要明显的时间。可以在主系统上调用_pg_log_standby_snapshot_函数来加快此过程。
Creation of a logical slot requires information about all the currently running transactions. On the primary, this information is available directly, but on a standby, this information has to be obtained from primary. Thus, slot creation may need to wait for some activity to happen on the primary. If the primary is idle, creating a logical slot on standby may take noticeable time. This can be sped up by calling the pg_log_standby_snapshot function on the primary.
Caution
复制槽在崩溃中持久化,而对其使用者一无所知。即使没有连接使用它们,它们也会防止删除所需资源。这会消耗存储空间,因为只要复制槽需要,VACUUM 就不能删除所需 WAL 或系统目录中的所需行。在极端情况下,这可能会导致数据库关闭以防止事务 ID 环绕(请参阅 Section 25.1.5)。因此,如果不再需要某个槽,则应将其删除。
Replication slots persist across crashes and know nothing about the state of their consumer(s). They will prevent removal of required resources even when there is no connection using them. This consumes storage because neither required WAL nor required rows from the system catalogs can be removed by VACUUM as long as they are required by a replication slot. In extreme cases this could cause the database to shut down to prevent transaction ID wraparound (see Section 25.1.5). So if a slot is no longer required it should be dropped.
49.2.3. Output Plugins #
输出插件将预写日志的内部表示中的数据转换成复制槽使用者希望的格式。
Output plugins transform the data from the write-ahead log’s internal representation into the format the consumer of a replication slot desires.
49.2.4. Exported Snapshots #
利用流复制界面(参见 CREATE_REPLICATION_SLOT )创建新复制槽时,将会导出一个快照(参见 Section 9.27.5 ),该快照将精确显示数据库的状态,所有更改都将包含在此后更改流之中。可利用 SET TRANSACTION SNAPSHOT 读取创建该槽时数据库的状态,借此创建新副本。此事务随后可用来将数据库状态转储到该时间点,而之后则可以使用该槽的内容更新该状态,而不会丢失任何更改。
When a new replication slot is created using the streaming replication interface (see CREATE_REPLICATION_SLOT), a snapshot is exported (see Section 9.27.5), which will show exactly the state of the database after which all changes will be included in the change stream. This can be used to create a new replica by using SET TRANSACTION SNAPSHOT to read the state of the database at the moment the slot was created. This transaction can then be used to dump the database’s state at that point in time, which afterwards can be updated using the slot’s contents without losing any changes.
创建快照并不总是可行的。特别是,当连接到热备用时,它将失败。不需要导出快照的应用程序可以使用_NOEXPORT_SNAPSHOT_选项来禁止导出快照。
Creation of a snapshot is not always possible. In particular, it will fail when connected to a hot standby. Applications that do not require snapshot export may suppress it with the NOEXPORT_SNAPSHOT option.