Postgresql 中文操作指南
13.4. Data Consistency Checks at the Application Level #
使用读提交事务很难以强制执行有关数据完整性的业务规则,因为每次语句执行时,数据视图都会发生转换,即使发生写冲突,单个语句也可能不会局限于语句快照。
It is very difficult to enforce business rules regarding data integrity using Read Committed transactions because the view of the data is shifting with each statement, and even a single statement may not restrict itself to the statement’s snapshot if a write conflict occurs.
尽管可重复读事务在其整个执行过程中都具有稳定数据视图,但使用MVCC快照来进行数据一致性检查时存在一个微小的问题,其中与_read/write conflicts_中的某些事物有关。如果一个事务写入数据,并发事务尝试读取相同数据(在写入之前或之后),它无法看到其他事务的工作。然后读者似乎先执行了,无论哪个先开始,哪个先提交。如果仅此而已,则没问题,但如果读者还写入并发事务读取的数据,则现在有一个事务似乎在前面提到的任何一个事务之前运行。如果似乎最后执行的事务实际上先提交,那么在事务执行顺序的图中很容易出现循环。当出现这样的循环时,如果不采取一些帮助措施,则完整性检查将无法正常工作。
While a Repeatable Read transaction has a stable view of the data throughout its execution, there is a subtle issue with using MVCC snapshots for data consistency checks, involving something known as read/write conflicts. If one transaction writes data and a concurrent transaction attempts to read the same data (whether before or after the write), it cannot see the work of the other transaction. The reader then appears to have executed first regardless of which started first or which committed first. If that is as far as it goes, there is no problem, but if the reader also writes data which is read by a concurrent transaction there is now a transaction which appears to have run before either of the previously mentioned transactions. If the transaction which appears to have executed last actually commits first, it is very easy for a cycle to appear in a graph of the order of execution of the transactions. When such a cycle appears, integrity checks will not work correctly without some help.
如 Section 13.2.3中所述,可串行化事务只是可重复读取事务,增加了对危险模式的读/写冲突的非阻塞监控。当检测到可能导致执行明显顺序形成循环的模式时,将回滚所涉及的事务之一以打破循环。
As mentioned in Section 13.2.3, Serializable transactions are just Repeatable Read transactions which add nonblocking monitoring for dangerous patterns of read/write conflicts. When a pattern is detected which could cause a cycle in the apparent order of execution, one of the transactions involved is rolled back to break the cycle.
13.4.1. Enforcing Consistency with Serializable Transactions #
如果对所有写入和所有需要一致数据视图的读取使用可序列化事务隔离级别,则不需要进行其他努力就能确保一致性。从其他环境中编写的软件如果使用可序列化事务来确保一致性,则在PostgreSQL中应在这方面“正常工作”。
If the Serializable transaction isolation level is used for all writes and for all reads which need a consistent view of the data, no other effort is required to ensure consistency. Software from other environments which is written to use serializable transactions to ensure consistency should “just work” in this regard in PostgreSQL.
当使用此技术时,如果应用软件通过自动重试因序列化失败而回滚的事务的框架,将避免给应用程序员带来不必要的负担。将_default_transaction_isolation_设置为_serializable_可能是一个好主意。通过在触发器中检查事务隔离级别,采取一些措施来确保没有使用其他事务隔离级别,无论是出于无意还是为了破坏完整性检查,这也很明智。
When using this technique, it will avoid creating an unnecessary burden for application programmers if the application software goes through a framework which automatically retries transactions which are rolled back with a serialization failure. It may be a good idea to set default_transaction_isolation to serializable. It would also be wise to take some action to ensure that no other transaction isolation level is used, either inadvertently or to subvert integrity checks, through checks of the transaction isolation level in triggers.
请参见 Section 13.2.3以获取性能建议。
See Section 13.2.3 for performance suggestions.
Warning: Serializable Transactions and Data Replication
使用可串行化事务的这种级别的完整性保护尚未扩展到热备用模式( Section 27.4)或逻辑副本。因此,使用热备用或逻辑复制的人可能希望在主服务器上使用可重复读取和显式锁定。
This level of integrity protection using Serializable transactions does not yet extend to hot standby mode (Section 27.4) or logical replicas. Because of that, those using hot standby or logical replication may want to use Repeatable Read and explicit locking on the primary.
13.4.2. Enforcing Consistency with Explicit Blocking Locks #
当不可序列化写操作可能发生时,为了确保行的当前有效性并防止同时更新,必须使用 SELECT FOR UPDATE、SELECT FOR SHARE 或适当的 LOCK TABLE 语句。(SELECT FOR UPDATE 和 SELECT FOR SHARE 只锁定返回行以防止同时更新,而 LOCK TABLE 锁定整个表。)在将应用程序从其他环境移植到 PostgreSQL 中时,应该考虑到这一点。
When non-serializable writes are possible, to ensure the current validity of a row and protect it against concurrent updates one must use SELECT FOR UPDATE, SELECT FOR SHARE, or an appropriate LOCK TABLE statement. (SELECT FOR UPDATE and SELECT FOR SHARE lock just the returned rows against concurrent updates, while LOCK TABLE locks the whole table.) This should be taken into account when porting applications to PostgreSQL from other environments.
对于从其他环境转换的人来说,另一件需要注意的事情是 SELECT FOR UPDATE 不能确保并发事务不会更新或删除选定行。要在 PostgreSQL 中执行此操作,必须实际更新该行,即使不需要更改任何值。SELECT FOR UPDATE temporarily blocks 其他事务获取相同的锁或执行 UPDATE 或 DELETE(这会影响锁定的行),但一旦持有此锁的事务提交或回滚,则阻塞事务将继续执行冲突操作,除非在持有锁时实际对行 UPDATE。
Also of note to those converting from other environments is the fact that SELECT FOR UPDATE does not ensure that a concurrent transaction will not update or delete a selected row. To do that in PostgreSQL you must actually update the row, even if no values need to be changed. SELECT FOR UPDATE temporarily blocks other transactions from acquiring the same lock or executing an UPDATE or DELETE which would affect the locked row, but once the transaction holding this lock commits or rolls back, a blocked transaction will proceed with the conflicting operation unless an actual UPDATE of the row was performed while the lock was held.
在不可序列化的 MVCC 下,全局有效性检查需要额外的考虑。例如,当两个表同时积极更新时,银行应用程序可能希望检查一个表中所有贷方的总金额是否等于另一个表中所有借方的总金额。在 Read Committed 模式下,比较两个连续的 SELECT sum(…) 命令的结果将不可靠,因为第二个查询很可能包含第一个查询未计算的事务的结果。在单个可重复读事务中执行这两个总计将准确描述只在可重复读事务开始之前提交的事务的影响——但人们可能合理地想知道,在传递答案时,答案是否仍然相关。如果可重复读事务本身在尝试进行一致性检查之前应用了一些更改,则检查的有效性将更具争议性,因为现在它包括了一部分但并不是所有事务开始后的更改。在这种情况下,谨慎的人可能希望锁定检查所需的所有表,以获得当前现实的无可争辩的描述。一个 SHARE 模式(或更高版本)锁定可以保证锁定表中没有未提交的更改,除了当前事务的更改。
Global validity checks require extra thought under non-serializable MVCC. For example, a banking application might wish to check that the sum of all credits in one table equals the sum of debits in another table, when both tables are being actively updated. Comparing the results of two successive SELECT sum(…) commands will not work reliably in Read Committed mode, since the second query will likely include the results of transactions not counted by the first. Doing the two sums in a single repeatable read transaction will give an accurate picture of only the effects of transactions that committed before the repeatable read transaction started — but one might legitimately wonder whether the answer is still relevant by the time it is delivered. If the repeatable read transaction itself applied some changes before trying to make the consistency check, the usefulness of the check becomes even more debatable, since now it includes some but not all post-transaction-start changes. In such cases a careful person might wish to lock all tables needed for the check, in order to get an indisputable picture of current reality. A SHARE mode (or higher) lock guarantees that there are no uncommitted changes in the locked table, other than those of the current transaction.
另请注意,如果依赖显式锁定来防止并发更改,则应使用 Read Committed 模式,或者在可重复读模式下,在执行查询之前小心获取锁定。由可重复读事务获取的锁可确保没有修改表的其他事务仍在运行,但如果事务看到的快照早于获取锁,则该快照可能早于表中一些现已提交的更改。可重复读事务的快照实际上是在其第一个查询或数据修改命令的开始处冻结的(SELECT、INSERT、UPDATE、DELETE 或 MERGE),因此可以在快照冻结之前显式获取锁定。
Note also that if one is relying on explicit locking to prevent concurrent changes, one should either use Read Committed mode, or in Repeatable Read mode be careful to obtain locks before performing queries. A lock obtained by a repeatable read transaction guarantees that no other transactions modifying the table are still running, but if the snapshot seen by the transaction predates obtaining the lock, it might predate some now-committed changes in the table. A repeatable read transaction’s snapshot is actually frozen at the start of its first query or data-modification command (SELECT, INSERT, UPDATE, DELETE, or MERGE), so it is possible to obtain locks explicitly before the snapshot is frozen.