Postgresql 中文操作指南

25.1. Routine Vacuuming #

PostgreSQL 数据库需要定期维护,称为 vacuuming 。对于许多安装,可以让自动真空功能由 autovacuum daemon 执行,该功能在 Section 25.1.6 中进行了描述。您可能需要调整此处描述的自动真空参数,以获得最佳结果。一些数据库管理员会希望使用手动管理的 VACUUM 命令来补充或替换守护程序的活动,这些命令通常由 cron 或任务计划程序脚本按计划执行。要正确设置手动管理的真空功能,了解在接下来的几个小节中讨论的问题非常重要。依赖自动真空的管理员可能仍然希望浏览此材料,以帮助他们理解和调整自动真空。

PostgreSQL databases require periodic maintenance known as vacuuming. For many installations, it is sufficient to let vacuuming be performed by the autovacuum daemon, which is described in Section 25.1.6. You might need to adjust the autovacuuming parameters described there to obtain best results for your situation. Some database administrators will want to supplement or replace the daemon’s activities with manually-managed VACUUM commands, which typically are executed according to a schedule by cron or Task Scheduler scripts. To set up manually-managed vacuuming properly, it is essential to understand the issues discussed in the next few subsections. Administrators who rely on autovacuuming may still wish to skim this material to help them understand and adjust autovacuuming.

25.1.1. Vacuuming Basics #

由于以下几个原因,PostgreSQL 的 VACUUM 命令必须定期处理每个表:

PostgreSQL’s VACUUM command has to process each table on a regular basis for several reasons:

由于每个原因都会指示执行频率和范围各不相同的 VACUUM 操作,因此在下文中进行了说明。

Each of these reasons dictates performing VACUUM operations of varying frequency and scope, as explained in the following subsections.

有两种 VACUUM 变体:标准 VACUUMVACUUM FULLVACUUM FULL 可以回收更多磁盘空间,但运行速度要慢得多。此外,标准形式的 VACUUM 可以与生成数据库操作并行运行。(如 SELECTINSERTUPDATEDELETE 等命令将继续正常运行,不过,当表正在被吸尘时,您将无法使用如 ALTER TABLE 等命令修改该表的定义。)VACUUM FULL 要求对它正在处理的表进行 ACCESS EXCLUSIVE 锁定,因此不能与其他对表的使用的并行完成。因此,一般来说,管理员应努力使用标准 VACUUM 并避免 VACUUM FULL

There are two variants of VACUUM: standard VACUUM and VACUUM FULL. VACUUM FULL can reclaim more disk space but runs much more slowly. Also, the standard form of VACUUM can run in parallel with production database operations. (Commands such as SELECT, INSERT, UPDATE, and DELETE will continue to function normally, though you will not be able to modify the definition of a table with commands such as ALTER TABLE while it is being vacuumed.) VACUUM FULL requires an ACCESS EXCLUSIVE lock on the table it is working on, and therefore cannot be done in parallel with other use of the table. Generally, therefore, administrators should strive to use standard VACUUM and avoid VACUUM FULL.

VACUUM 会产生大量的 I/O 流量,这可能会导致其他活动会话的性能不佳。可以调整配置参数以减少后台真空的性能影响,请参阅 Section 20.4.4

VACUUM creates a substantial amount of I/O traffic, which can cause poor performance for other active sessions. There are configuration parameters that can be adjusted to reduce the performance impact of background vacuuming — see Section 20.4.4.

25.1.2. Recovering Disk Space #

在 PostgreSQL 中,一行数据的 UPDATEDELETE 不会立即删除行的旧版本。采用这种方法是为了获取多版本并发控制 (MVCC) 的好处(请参阅 Chapter 13 ):在其他事务可能仍能看到行版本的情况下,不能删除它。但最终,过时的或已删除的行版本不再对任何事务有用。它所占的空间必须回收,以便新行重复使用,以避免磁盘空间需求无限增长。这是通过运行 VACUUM 来完成的。

In PostgreSQL, an UPDATE or DELETE of a row does not immediately remove the old version of the row. This approach is necessary to gain the benefits of multiversion concurrency control (MVCC, see Chapter 13): the row version must not be deleted while it is still potentially visible to other transactions. But eventually, an outdated or deleted row version is no longer of interest to any transaction. The space it occupies must then be reclaimed for reuse by new rows, to avoid unbounded growth of disk space requirements. This is done by running VACUUM.

VACUUM 的标准形式移除表和索引中的无效行版本,并标记可供未来重用的空间。但是,除了在表末尾一个或多个页面完全变为空并且可以轻松获得一个独占表锁的特例之外,它不会将此空间返还给操作系统。相反,VACUUM FULL 将表文件的一个完整新版本用没有无效空间的形式写入,从而主动压缩表。这最大程度地减少了表的大小,但可能需要很长时间。它还需要额外的磁盘空间用于表的副本,直到操作完成。

The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained. In contrast, VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space. This minimizes the size of the table, but can take a long time. It also requires extra disk space for the new copy of the table, until the operation completes.

例行 vacuum 的通常目标是执行标准 VACUUM_s often enough to avoid needing _VACUUM FULL。autovacuum 守护程序尝试按此方法运行,事实上它永远不会发出 VACUUM FULL。在该方法中,想法不是让表保持最小大小,而是保持磁盘空间的稳定状态使用:每张表占用的空间等于其最小大小,加上两次 vacuum 运行期间所用掉的任何空间。尽管 VACUUM FULL 可用于将表收缩回其最小大小并将磁盘空间返还给操作系统,如果该表将来会再次增长,那么这样做就没有多大意义。因此,对频繁更新的表,频率适中的标准 VACUUM 运行比不频繁的 VACUUM FULL 运行更好的方法是,保持一切井然有序。

The usual goal of routine vacuuming is to do standard VACUUM_s often enough to avoid needing _VACUUM FULL. The autovacuum daemon attempts to work this way, and in fact will never issue VACUUM FULL. In this approach, the idea is not to keep tables at their minimum size, but to maintain steady-state usage of disk space: each table occupies space equivalent to its minimum size plus however much space gets used up between vacuum runs. Although VACUUM FULL can be used to shrink a table back to its minimum size and return the disk space to the operating system, there is not much point in this if the table will just grow again in the future. Thus, moderately-frequent standard VACUUM runs are a better approach than infrequent VACUUM FULL runs for maintaining heavily-updated tables.

一些管理员更喜欢自己计划 vacuum,例如仅在夜间服务器负载低时执行全部工作。按照固定计划执行 vacuum 的困难之处在于,如果某张表在更新活动中出现意外高峰,那么它可能会膨胀到真的需要 VACUUM FULL 来回收空间的地步。使用 autovacuum 守护程序可缓解此问题,因为该守护程序会响应更新活动动态计划 vacuum。除非你的工作负载极易预测,否则禁用该守护程序是不明智的。一种可行的折衷方案是设置该守护程序的参数,以便它只对异常频繁的更新活动做出反应,从而避免失控,同时期待计划的 VACUUM 在负载典型的情况下执行大部分工作。

Some administrators prefer to schedule vacuuming themselves, for example doing all the work at night when load is low. The difficulty with doing vacuuming according to a fixed schedule is that if a table has an unexpected spike in update activity, it may get bloated to the point that VACUUM FULL is really necessary to reclaim space. Using the autovacuum daemon alleviates this problem, since the daemon schedules vacuuming dynamically in response to update activity. It is unwise to disable the daemon completely unless you have an extremely predictable workload. One possible compromise is to set the daemon’s parameters so that it will only react to unusually heavy update activity, thus keeping things from getting out of hand, while scheduled _VACUUM_s are expected to do the bulk of the work when the load is typical.

对于未使用自动清理的人来说,一种典型的方法是在使用率低的时候每天安排一次数据库范围的 VACUUM ,并在需要时对经常更新的表格进行更频繁的清理。(一些更新频率极高的安装会每隔几分钟就清理最繁忙的表格。)如果您在一个集群中有多个数据库,别忘了 VACUUM 每个数据库,程序 vacuumdb 可能有所帮助。

For those not using autovacuum, a typical approach is to schedule a database-wide VACUUM once a day during a low-usage period, supplemented by more frequent vacuuming of heavily-updated tables as necessary. (Some installations with extremely high update rates vacuum their busiest tables as often as once every few minutes.) If you have multiple databases in a cluster, don’t forget to VACUUM each one; the program vacuumdb might be helpful.

Tip

当一张表在大量的更新或删除活动中包含大量的死行版本时,简单的 VACUUM 可能会不尽人意。如果您有这样的表格并且需要回收它所占用的额外磁盘空间,您将需要使用 VACUUM FULL ,或者使用 CLUSTERALTER TABLE 的表格重写变体之一。这些命令重写表格的全新副本并为其建立新的索引。所有这些选项都需要 ACCESS EXCLUSIVE 锁定。请注意,它们还会临时使用额外的磁盘空间,大约等于表格的大小,因为在新的副本完成之前不能释放表格和索引的旧副本。

Plain VACUUM may not be satisfactory when a table contains large numbers of dead row versions as a result of massive update or delete activity. If you have such a table and you need to reclaim the excess disk space it occupies, you will need to use VACUUM FULL, or alternatively CLUSTER or one of the table-rewriting variants of ALTER TABLE. These commands rewrite an entire new copy of the table and build new indexes for it. All these options require an ACCESS EXCLUSIVE lock. Note that they also temporarily use extra disk space approximately equal to the size of the table, since the old copies of the table and indexes can’t be released until the new ones are complete.

Tip

如果您有一张表,它的全部内容会定期被删除,请考虑使用 TRUNCATE 来执行此操作,而不是使用 DELETE 后跟 VACUUMTRUNCATE 立即移除表格的全部内容,而不需要后续的 VACUUMVACUUM FULL 来回收现在未使用的磁盘空间。缺点是违反了严格的 MVCC 语义。

If you have a table whose entire contents are deleted on a periodic basis, consider doing it with TRUNCATE rather than using DELETE followed by VACUUM. TRUNCATE removes the entire content of the table immediately, without requiring a subsequent VACUUM or VACUUM FULL to reclaim the now-unused disk space. The disadvantage is that strict MVCC semantics are violated.

25.1.3. Updating Planner Statistics #

PostgreSQL 查询规划器依赖于有关表格内容的统计信息,以便为查询生成良好的计划。这些统计信息由 ANALYZE 命令收集,它可以自己调用或作为 VACUUM 中的一个可选步骤。拥有相当准确的统计信息很重要,否则糟糕的计划选择可能会降低数据库性能。

The PostgreSQL query planner relies on statistical information about the contents of tables in order to generate good plans for queries. These statistics are gathered by the ANALYZE command, which can be invoked by itself or as an optional step in VACUUM. It is important to have reasonably accurate statistics, otherwise poor choices of plans might degrade database performance.

如果启用了 autovacuum 守护程序,它将在表内容充分更改时自动发出 ANALYZE 命令。但是,管理员可能会更喜欢依靠手动计划的 ANALYZE 操作,特别是如果知道表上的更新活动不会影响“有趣”列的统计信息时。该守护程序将 ANALYZE 严格计划为插入或更新的行数量的一个函数;它不知道是否会导致有意义的统计信息更改。

The autovacuum daemon, if enabled, will automatically issue ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators might prefer to rely on manually-scheduled ANALYZE operations, particularly if it is known that update activity on a table will not affect the statistics of “interesting” columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes.

分区和继承子对象中更改的元组不会在父表上触发分析。如果父表为空或极少更改,则 autovacuum 可能永远不会对其进行处理,并且整个继承树的统计信息将不会被收集。需要手动在父表上运行 ANALYZE 以保持统计信息是最新的。

Tuples changed in partitions and inheritance children do not trigger analyze on the parent table. If the parent table is empty or rarely changed, it may never be processed by autovacuum, and the statistics for the inheritance tree as a whole won’t be collected. It is necessary to run ANALYZE on the parent table manually in order to keep the statistics up to date.

与用于回收空间的 vacuum 相似,频繁更新统计信息对于频繁更新的表比对于很少更新的表更有用。但即使对于频繁更新的表,如果数据的统计分布没有太大变化,那么也可能不需要更新统计信息。一个简单的经验法则是,考虑表中最小值和最大值发生多少变化。例如,包含行更新时间的 timestamp 列随着行添加和更新而具有不断增加的最大值;这样的列可能会比方说包含网站上访问的页面的 URL 的列更需要频繁的统计信息更新。该 URL 列可能与更改一样频繁,但其值中的统计分布可能会相对较慢地变化。

As with vacuuming for space recovery, frequent updates of statistics are more useful for heavily-updated tables than for seldom-updated ones. But even for a heavily-updated table, there might be no need for statistics updates if the statistical distribution of the data is not changing much. A simple rule of thumb is to think about how much the minimum and maximum values of the columns in the table change. For example, a timestamp column that contains the time of row update will have a constantly-increasing maximum value as rows are added and updated; such a column will probably need more frequent statistics updates than, say, a column containing URLs for pages accessed on a website. The URL column might receive changes just as often, but the statistical distribution of its values probably changes relatively slowly.

可以在特定表,甚至表中的特定列上运行 ANALYZE,因此如果你的应用程序需要,可以更频繁地更新一些统计信息,而不是其他统计信息。然而,在实践中,通常最好只分析整个数据库,因为它是一个快速的操作。ANALYZE 使用统计随机采样表的行,而不是读取每一行。

It is possible to run ANALYZE on specific tables and even just specific columns of a table, so the flexibility exists to update some statistics more frequently than others if your application requires it. In practice, however, it is usually best to just analyze the entire database, because it is a fast operation. ANALYZE uses a statistically random sampling of the rows of a table rather than reading every single row.

Tip

尽管 per-column 对 ANALYZE 的频率进行微调可能不会非常有成效,但您可能会认为值得对 ANALYZE 收集的统计数据进行 per-column 调整。在 WHERE 中大量使用并具有高度不规则数据分布的列可能需要比其他列更精细粒度的粒度数据直方图。请参阅 ALTER TABLE SET STATISTICS ,或使用 default_statistics_target 配置参数更改整个数据库的默认值。

Although per-column tweaking of ANALYZE frequency might not be very productive, you might find it worthwhile to do per-column adjustment of the level of detail of the statistics collected by ANALYZE. Columns that are heavily used in WHERE clauses and have highly irregular data distributions might require a finer-grain data histogram than other columns. See ALTER TABLE SET STATISTICS, or change the database-wide default using the default_statistics_target configuration parameter.

此外,关于函数的选择性默认提供的信息有限。但是,如果你创建一个使用函数调用的统计对象或表达式索引,那么将收集一些关于该函数的有用统计信息,这可以极大改进使用该表达式索引的查询计划。

Also, by default there is limited information available about the selectivity of functions. However, if you create a statistics object or an expression index that uses a function call, useful statistics will be gathered about the function, which can greatly improve query plans that use the expression index.

Tip

autovacuum 守护程序不会为外部表发出 ANALYZE 命令,因为它没有方法来确定这样做可能有多么有益。如果你的查询需要外部表上的统计信息进行正确的规划,那么最好在那些表上按适当时间表手动管理 ANALYZE 命令。

The autovacuum daemon does not issue ANALYZE commands for foreign tables, since it has no means of determining how often that might be useful. If your queries require statistics on foreign tables for proper planning, it’s a good idea to run manually-managed ANALYZE commands on those tables on a suitable schedule.

Tip

autovacuum 守护程序不会为分区表发出 ANALYZE 命令。继承父对象仅在父对象本身被更改后才会被分析 - 子表的更改不会触发父表上的自动分析。如果你的查询需要父表上的统计信息进行正确的规划,那么周期性地在这些表上手动运行 ANALYZE 以保持统计信息是最新的非常有必要。

The autovacuum daemon does not issue ANALYZE commands for partitioned tables. Inheritance parents will only be analyzed if the parent itself is changed - changes to child tables do not trigger autoanalyze on the parent table. If your queries require statistics on parent tables for proper planning, it is necessary to periodically run a manual ANALYZE on those tables to keep the statistics up to date.

25.1.4. Updating the Visibility Map #

Vacuum 为每个表维护一个 visibility map ,以跟踪仅包含被认为对所有活动事务(和所有未来事务)可见的元组的页面(直到该页面再次被修改)。这有两个目的。首先,Vacuum 本身可以在下一次运行时跳过此类页面,因为没有要清理的内容。

Vacuum maintains a visibility map for each table to keep track of which pages contain only tuples that are known to be visible to all active transactions (and all future transactions, until the page is again modified). This has two purposes. First, vacuum itself can skip such pages on the next run, since there is nothing to clean up.

其次,它允许 PostgreSQL 在不参考基础表格的情况下仅使用索引来回答一些查询。由于 PostgreSQL 索引不包含元组可见性信息,因此正常的索引扫描会为每个匹配的索引条目提取堆元组,以检查当前事务是否应该看到它。另一方面, index-only scan 首先检查可见性映射。如果已知页面上的所有元组都是可见的,则可以跳过堆提取。这在可见性映射可以防止磁盘访问的大型数据集上最为有用。可见性映射比堆小得多,因此即使堆非常大,也可以轻松将其缓存在内存中。

Second, it allows PostgreSQL to answer some queries using only the index, without reference to the underlying table. Since PostgreSQL indexes don’t contain tuple visibility information, a normal index scan fetches the heap tuple for each matching index entry, to check whether it should be seen by the current transaction. An index-only scan, on the other hand, checks the visibility map first. If it’s known that all tuples on the page are visible, the heap fetch can be skipped. This is most useful on large data sets where the visibility map can prevent disk accesses. The visibility map is vastly smaller than the heap, so it can easily be cached even when the heap is very large.

25.1.5. Preventing Transaction ID Wraparound Failures #

PostgreSQL 的 MVCC 事务语义依赖于能够比较事务 ID (XID) 号码:插入 XID 大于当前事务 XID 的行版本“在未来”,并且当前事务不应该可见。但由于事务 ID 的大小有限(32 位),因此长期运行(超过 40 亿个事务)的集群将承受 transaction ID wraparound :XID 计数器回绕到零,并且突然之间过去出现的事务似乎在未来发生,这意味着其输出变得不可见。简而言之,是灾难性的数据丢失。(实际上数据仍然存在,但如果你无法获取它,那就没有什么安慰了。) 为避免这种情况,有必要每隔 20 亿个事务至少对每个数据库中的每个表进行一次 vacuum。

PostgreSQL’s MVCC transaction semantics depend on being able to compare transaction ID (XID) numbers: a row version with an insertion XID greater than the current transaction’s XID is “in the future” and should not be visible to the current transaction. But since transaction IDs have limited size (32 bits) a cluster that runs for a long time (more than 4 billion transactions) would suffer transaction ID wraparound: the XID counter wraps around to zero, and all of a sudden transactions that were in the past appear to be in the future — which means their output become invisible. In short, catastrophic data loss. (Actually the data is still there, but that’s cold comfort if you cannot get at it.) To avoid this, it is necessary to vacuum every table in every database at least once every two billion transactions.

定期 vacuum 解决问题的原因为,VACUUM 将标记行为 frozen 的行,表明它们是由一个足够久之前已提交的事务插入的,这样插入事务的效果必定对所有当前和未来事务可见。使用模 232 算术对正常的 XID 进行比较。这意味着对于每个正常 XID,有两个十亿个 XID “较旧”,两个十亿个 XID “较新”;另一种说法是正常的 XID 空间是循环的,没有端点。因此,一旦使用特定的正常 XID 创建了一个行版本,那么无论我们讨论的是哪个 XID,该行版本对于接下来的两个十亿个事务而言都会显示为“过去时”。如果行版本在超过这两个十亿个事务后仍然存在,那么它会突然显示为未来时。为了防止这种情况,PostgreSQL 保留了一个特殊的 XID,FrozenTransactionId,它不遵循正常的 XID 比较规则,并且总是被认为比每个正常的 XID 都旧。冻结的行版本被视为插入的 XID 为 FrozenTransactionId,因此无论出现什么环绕问题,它们对于所有的正常事务来说都会显示为“过去时”,因此无论这样的行版本存在多久,它们都将保持有效。

The reason that periodic vacuuming solves the problem is that VACUUM will mark rows as frozen, indicating that they were inserted by a transaction that committed sufficiently far in the past that the effects of the inserting transaction are certain to be visible to all current and future transactions. Normal XIDs are compared using modulo-232 arithmetic. This means that for every normal XID, there are two billion XIDs that are “older” and two billion that are “newer”; another way to say it is that the normal XID space is circular with no endpoint. Therefore, once a row version has been created with a particular normal XID, the row version will appear to be “in the past” for the next two billion transactions, no matter which normal XID we are talking about. If the row version still exists after more than two billion transactions, it will suddenly appear to be in the future. To prevent this, PostgreSQL reserves a special XID, FrozenTransactionId, which does not follow the normal XID comparison rules and is always considered older than every normal XID. Frozen row versions are treated as if the inserting XID were FrozenTransactionId, so that they will appear to be “in the past” to all normal transactions regardless of wraparound issues, and so such row versions will be valid until deleted, no matter how long that is.

Note

在 9.4 之前的 PostgreSQL 版本中,实现冻结时实际上是将行的插入 XID 替换为 FrozenTransactionId,该 XID 在行的 xmin 系统列中可见。更新版本只设置一个标记位,保存行的原始 xmin 以便将来取证使用。但是,在从 9.4 之前的版本中升级而来的数据库中,仍然可能找到 xmin 等于 FrozenTransactionId (2) 的行。

In PostgreSQL versions before 9.4, freezing was implemented by actually replacing a row’s insertion XID with FrozenTransactionId, which was visible in the row’s xmin system column. Newer versions just set a flag bit, preserving the row’s original xmin for possible forensic use. However, rows with xmin equal to FrozenTransactionId (2) may still be found in databases pg_upgrade’d from pre-9.4 versions.

此外,系统目录可能包含 XID 等于 BootstrapTransactionId(1)的 xmin 行,表明它们是在 initdb 的第一阶段中插入的。与 FrozenTransactionId 类似,这个特殊的 XID 被视为比每个正常的 XID 都旧。

Also, system catalogs may contain rows with xmin equal to BootstrapTransactionId (1), indicating that they were inserted during the first phase of initdb. Like FrozenTransactionId, this special XID is treated as older than every normal XID.

vacuum_freeze_min_age 控制 XID 值在冻结带有该 XID 的行之前必须过时的程度。如果本来很快就要修改已冻结的行,则增加此设置可以避免不必要的工作,但减小此设置会增加表必须再次进行 vacuum 之前可以经过的事务数。

vacuum_freeze_min_age controls how old an XID value has to be before rows bearing that XID will be frozen. Increasing this setting may avoid unnecessary work if the rows that would otherwise be frozen will soon be modified again, but decreasing this setting increases the number of transactions that can elapse before the table must be vacuumed again.

VACUUM 使用 visibility map 来确定必须扫描表的哪些页面。通常,即使这些页面仍然具有具有旧 XID 值的行版本,它也会跳过没有死行版本的页面。因此,常规 VACUUM_s won’t always freeze every old row version in the table. When that happens, _VACUUM 最终需要执行 aggressive vacuum ,它将冻结所有合格的未冻结 XID 和 MXID 值,包括所有可见但未完全冻结的页面的值。实际上,大多数表都需要定期进行激进的 vacuum。 vacuum_freeze_table_age 控制 VACUUM 进行此操作的时间:如果自上次此类扫描以来经过的事务数大于 vacuum_freeze_table_age 减去 vacuum_freeze_min_age ,则扫描所有可见但未完全冻结的页面。将 vacuum_freeze_table_age 设置为 0 会强制 VACUUM 始终使用其激进策略。

VACUUM uses the visibility map to determine which pages of a table must be scanned. Normally, it will skip pages that don’t have any dead row versions even if those pages might still have row versions with old XID values. Therefore, normal VACUUM_s won’t always freeze every old row version in the table. When that happens, _VACUUM will eventually need to perform an aggressive vacuum, which will freeze all eligible unfrozen XID and MXID values, including those from all-visible but not all-frozen pages. In practice most tables require periodic aggressive vacuuming. vacuum_freeze_table_age controls when VACUUM does that: all-visible but not all-frozen pages are scanned if the number of transactions that have passed since the last such scan is greater than vacuum_freeze_table_age minus vacuum_freeze_min_age. Setting vacuum_freeze_table_age to 0 forces VACUUM to always use its aggressive strategy.

未进行 vacuum 的表的最长时间是 20 亿个事务减去上次激进 vacuum 时的 vacuum_freeze_min_age 值。如果超过此时间未进行 vacuum,则可能导致数据丢失。为了确保这种情况不会发生,将在任何可能包含 XID 早于配置参数 autovacuum_freeze_max_age 指定的年龄的未冻结行的表上调用自动 vacuum。(即使禁用自动 vacuum,也会发生这种情况。)

The maximum time that a table can go unvacuumed is two billion transactions minus the vacuum_freeze_min_age value at the time of the last aggressive vacuum. If it were to go unvacuumed for longer than that, data loss could result. To ensure that this does not happen, autovacuum is invoked on any table that might contain unfrozen rows with XIDs older than the age specified by the configuration parameter autovacuum_freeze_max_age. (This will happen even if autovacuum is disabled.)

这意味着,如果一张表没有被清理,则大约每隔 autovacuum_freeze_max_age 减去 vacuum_freeze_min_age 个事务会在其上调用自动清理。对于为了回收空间目的而定期清理的表,这问题不大。然而,对于静态表(包括接收插入但不更新或删除的表)来说,不必为了回收空间而清理,因此尝试在非常大的静态表上使强制自动清理的间隔时间最大化是有用的。很明显,你可以通过增 autovacuum_freeze_max_age 或减小 vacuum_freeze_min_age 来达到这一目的。

This implies that if a table is not otherwise vacuumed, autovacuum will be invoked on it approximately once every autovacuum_freeze_max_age minus vacuum_freeze_min_age transactions. For tables that are regularly vacuumed for space reclamation purposes, this is of little importance. However, for static tables (including tables that receive inserts, but no updates or deletes), there is no need to vacuum for space reclamation, so it can be useful to try to maximize the interval between forced autovacuums on very large static tables. Obviously one can do this either by increasing autovacuum_freeze_max_age or decreasing vacuum_freeze_min_age.

vacuum_freeze_table_age 的有效最大值为 0.95 * autovacuum_freeze_max_age;高于此设置的值将被限制为最大值。大于 autovacuum_freeze_max_age 的值没有意义,因为此时将触发反早绕自动清理,而 0.95 乘数会留出一些空间在这种情况发生之前运行手动 VACUUM。作为经验法则,vacuum_freeze_table_age 应设置为远小于 autovacuum_freeze_max_age 的值,留出足够空间,以便在该窗口中运行定期计划的 VACUUM 或由正常的删除和更新活动触发的自动清理。尽管该表近期才被清理过以回收空间,但过早地设置它可能会导致反早绕自动清理,而更低的值会导致更频繁的深度清理。

The effective maximum for vacuum_freeze_table_age is 0.95 * autovacuum_freeze_max_age; a setting higher than that will be capped to the maximum. A value higher than autovacuum_freeze_max_age wouldn’t make sense because an anti-wraparound autovacuum would be triggered at that point anyway, and the 0.95 multiplier leaves some breathing room to run a manual VACUUM before that happens. As a rule of thumb, vacuum_freeze_table_age should be set to a value somewhat below autovacuum_freeze_max_age, leaving enough gap so that a regularly scheduled VACUUM or an autovacuum triggered by normal delete and update activity is run in that window. Setting it too close could lead to anti-wraparound autovacuums, even though the table was recently vacuumed to reclaim space, whereas lower values lead to more frequent aggressive vacuuming.

增加 autovacuum_freeze_max_age(以及 vacuum_freeze_table_age)的唯一缺点是,数据库集群的 pg_xactpg_commit_ts 子目录将占用更多空间,因为它必须存储提交状态和回到 autovacuum_freeze_max_age 水平的所有事务的时间戳(如果启用了 track_commit_timestamp)。提交状态使用每个事务 2 位,因此如果 autovacuum_freeze_max_age 设置为其允许的最大值 20 亿,则 pg_xact 预计将会增长到大约 500MB,pg_commit_ts 预计将会增长到大约 20GB。如果与你的总数据库大小相比这微不足道,则建议将 autovacuum_freeze_max_age 设置为其允许的最大值。否则,根据你愿意 pg_xactpg_commit_ts 存储多少来设置它。(默认值 2 亿个事务转换为大约 50MB 的 pg_xact 存储和大约 2GB 的 pg_commit_ts 存储。)

The sole disadvantage of increasing autovacuum_freeze_max_age (and vacuum_freeze_table_age along with it) is that the pg_xact and pg_commit_ts subdirectories of the database cluster will take more space, because it must store the commit status and (if track_commit_timestamp is enabled) timestamp of all transactions back to the autovacuum_freeze_max_age horizon. The commit status uses two bits per transaction, so if autovacuum_freeze_max_age is set to its maximum allowed value of two billion, pg_xact can be expected to grow to about half a gigabyte and pg_commit_ts to about 20GB. If this is trivial compared to your total database size, setting autovacuum_freeze_max_age to its maximum allowed value is recommended. Otherwise, set it depending on what you are willing to allow for pg_xact and pg_commit_ts storage. (The default, 200 million transactions, translates to about 50MB of pg_xact storage and about 2GB of pg_commit_ts storage.)

减小 vacuum_freeze_min_age 的一个缺点是它可能会导致 VACUUM 执行无用操作:如果行很快就进行了修改(导致它获得了新的 XID),那么冻结行版本便是浪费时间。因此,应将设置设足够大,以便行不太可能再改变后再冻结它们。

One disadvantage of decreasing vacuum_freeze_min_age is that it might cause VACUUM to do useless work: freezing a row version is a waste of time if the row is modified soon thereafter (causing it to acquire a new XID). So the setting should be large enough that rows are not frozen until they are unlikely to change any more.

为了跟踪数据库中最旧的未冻结 XID 的期限,VACUUM 将 XID 统计信息存储在系统表 pg_classpg_database 中。特别是,一张表 pg_class 行的 relfrozenxid 列包含最近一次已成功推进 relfrozenxid(通常是最近一次的深度 VACUUM)的 VACUUM 末尾处剩余的最旧未冻结 XID。类似地,一个数据库 pg_database 行的 datfrozenxid 列是对该数据库中出现的未冻结 XID 的下限,它只是数据库内每个表 relfrozenxid 值的最小值。检查此信息的便捷方式是执行以下查询:

To track the age of the oldest unfrozen XIDs in a database, VACUUM stores XID statistics in the system tables pg_class and pg_database. In particular, the relfrozenxid column of a table’s pg_class row contains the oldest remaining unfrozen XID at the end of the most recent VACUUM that successfully advanced relfrozenxid (typically the most recent aggressive VACUUM). Similarly, the datfrozenxid column of a database’s pg_database row is a lower bound on the unfrozen XIDs appearing in that database — it is just the minimum of the per-table relfrozenxid values within the database. A convenient way to examine this information is to execute queries such as:

SELECT c.oid::regclass as table_name,
       greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm');

SELECT datname, age(datfrozenxid) FROM pg_database;

age 列度量从截止 XID 到当前事务的 XID 的事务数。

The age column measures the number of transactions from the cutoff XID to the current transaction’s XID.

Tip

当指定 VACUUM 命令的 VERBOSE 参数时, VACUUM 会打印关于该表的各种统计信息。这包括有关 relfrozenxidrelminmxid 如何推进的信息,以及新冻结页面的数量。当由自动 vacuum 报告的自动 vacuum 记录(由 log_autovacuum_min_duration 控制)对 VACUUM 操作执行操作时,服务器日志中也会出现相同的详细信息。

When the VACUUM command’s VERBOSE parameter is specified, VACUUM prints various statistics about the table. This includes information about how relfrozenxid and relminmxid advanced, and the number of newly frozen pages. The same details appear in the server log when autovacuum logging (controlled by log_autovacuum_min_duration) reports on a VACUUM operation executed by autovacuum.

VACUUM 通常只扫描自上次清理后已修改的页,但只有当扫描可能包含未冻结 XID 的表的所有页时,relfrozenxid 才能推进。在 relfrozenxid 超过 vacuum_freeze_table_age 个事务旧时、在使用 VACUUMFREEZE 选项时,或者在尚未全部冻结的所有页面都需要清理来移除死亡行版本时,会出现这种情况。当 VACUUM 扫描表中尚未全部冻结的每一页时,它应将 age(relfrozenxid) 设置为仅比所用 vacuum_freeze_min_age 设置稍大的值(比自 VACUUM 启动以来已启动的事务数多)。VACUUM 会将 relfrozenxid 设置为表中剩余的最旧 XID,因此最终值可能比严格要求的要新得多。如果在表上未发出 autovacuum_freeze_max_age 之前没有推进 relfrozenxid,则很快将为该表强制进行自动清理。

VACUUM normally only scans pages that have been modified since the last vacuum, but relfrozenxid can only be advanced when every page of the table that might contain unfrozen XIDs is scanned. This happens when relfrozenxid is more than vacuum_freeze_table_age transactions old, when VACUUM's FREEZE option is used, or when all pages that are not already all-frozen happen to require vacuuming to remove dead row versions. When VACUUM scans every page in the table that is not already all-frozen, it should set age(relfrozenxid) to a value just a little more than the vacuum_freeze_min_age setting that was used (more by the number of transactions started since the VACUUM started). VACUUM will set relfrozenxid to the oldest XID that remains in the table, so it’s possible that the final value will be much more recent than strictly required. If no relfrozenxid-advancing VACUUM is issued on the table until autovacuum_freeze_max_age is reached, an autovacuum will soon be forced for the table.

如果自动清理由于某种原因未能从某个表中清除旧的 XID,当数据库最旧的 XID 从早绕点达到 4000 万个事务时,系统将开始发出如下警告消息:

If for some reason autovacuum fails to clear old XIDs from a table, the system will begin to emit warning messages like this when the database’s oldest XIDs reach forty million transactions from the wraparound point:

WARNING:  database "mydb" must be vacuumed within 39985967 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.

(正如提示所建议,手动 VACUUM 应该修复此问题;但请注意,VACUUM 应该由超级用户执行,否则它将无法处理系统目录,从而阻止其推进数据库的 datfrozenxid。)如果这些警告被忽略,在距离早绕还有不到 300 万个事务时,系统将拒绝分配新 XID:

(A manual VACUUM should fix the problem, as suggested by the hint; but note that the VACUUM should be performed by a superuser, else it will fail to process system catalogs, which prevent it from being able to advance the database’s datfrozenxid.) If these warnings are ignored, the system will refuse to assign new XIDs once there are fewer than three million transactions left until wraparound:

ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
HINT:  Stop the postmaster and vacuum that database in single-user mode.

在此状态下,任何已在进行中的事务都可以继续,但只能启动只读事务。修改数据库记录或截断关系的操作将失败。仍可以正常运行 VACUUM 命令。与提示中说明的相反,无需停止 Postmaster 或进入单用户模式来恢复正常操作。而是执行以下步骤:

In this condition any transactions already in progress can continue, but only read-only transactions can be started. Operations that modify database records or truncate relations will fail. The VACUUM command can still be run normally. Contrary to what the hint states, it is not necessary or desirable to stop the postmaster or enter single user-mode in order to restore normal operation. Instead, follow these steps:

Note

在早期版本中,有时需要停止 postmaster 并以单用户模式 VACUUM 数据库。在典型情况下,这不再是必须的,并且应尽可能避免这种情况,因为它涉及系统关闭。这也是有风险的,因为它禁用了旨在防止数据丢失的事务 ID 环绕保护功能。在这种情况下使用单用户模式的唯一原因是,如果您希望 TRUNCATEDROP 不需要的表格,以避免 VACUUM 它们。三百万事务安全裕度存在是为了让管理员执行此操作。请参阅 postgres 参考页,了解有关使用单用户模式的详细信息。

In earlier versions, it was sometimes necessary to stop the postmaster and VACUUM the database in a single-user mode. In typical scenarios, this is no longer necessary, and should be avoided whenever possible, since it involves taking the system down. It is also riskier, since it disables transaction ID wraparound safeguards that are designed to prevent data loss. The only reason to use single-user mode in this scenario is if you wish to TRUNCATE or DROP unneeded tables to avoid needing to VACUUM them. The three-million-transaction safety margin exists to let the administrator do this. See the postgres reference page for details about using single-user mode.

25.1.5.1. Multixacts and Wraparound #

Multixact IDs 用于支持多重事务的行锁定。由于元组头中的空间有限,无法存储锁定信息,因此当出现多于一个事务同时锁定一行时,该信息将被编码为“多事务 ID”或简称为 multixact ID。关于哪些事务 ID 包含在任何特定 multixact ID 中的信息被单独存储在 pg_multixact 子目录中,并且只有 multixact ID 才会出现在元组头中的 xmax 字段中。与事务 ID 类似,multixact ID 以 32 位计数器与相应存储的形式实现,所有这些都需要仔细老化管理、存储清理和环绕处理。有一个单独的存储区域,其中保存每个 multixact 中的成员列表,它也使用 32 位计数器,并且也必须进行管理。

Multixact IDs are used to support row locking by multiple transactions. Since there is only limited space in a tuple header to store lock information, that information is encoded as a “multiple transaction ID”, or multixact ID for short, whenever there is more than one transaction concurrently locking a row. Information about which transaction IDs are included in any particular multixact ID is stored separately in the pg_multixact subdirectory, and only the multixact ID appears in the xmax field in the tuple header. Like transaction IDs, multixact IDs are implemented as a 32-bit counter and corresponding storage, all of which requires careful aging management, storage cleanup, and wraparound handling. There is a separate storage area which holds the list of members in each multixact, which also uses a 32-bit counter and which must also be managed.

无论何时 VACUUM 扫描表的任何部分,它都会用不同的值替换它遇到的任何早于 vacuum_multixact_freeze_min_age 的多事务 ID,这个值可以是零值,单个事务 ID 或较新的多事务 ID。对于每个表, pg_class.relminmxid 存储仍然出现在该表的任何元组中的最旧的多事务 ID。如果此值早于 vacuum_multixact_freeze_table_age ,则会强制进行激进的 vacuum。如图前一节所述,激进的 vacuum 意味着只会跳过已知已完全冻结的页面。可以在 pg_class.relminmxid 上使用 mxid_age() 来查找其年龄。

Whenever VACUUM scans any part of a table, it will replace any multixact ID it encounters which is older than vacuum_multixact_freeze_min_age by a different value, which can be the zero value, a single transaction ID, or a newer multixact ID. For each table, pg_class.relminmxid stores the oldest possible multixact ID still appearing in any tuple of that table. If this value is older than vacuum_multixact_freeze_table_age, an aggressive vacuum is forced. As discussed in the previous section, an aggressive vacuum means that only those pages which are known to be all-frozen will be skipped. mxid_age() can be used on pg_class.relminmxid to find its age.

攻击性 VACUUM_s, regardless of what causes them, are _guaranteed 推进表 relminmxid。最终随着所有数据库中的所有表都被扫描并推进其最旧的多事务值,可以移除旧的多事务磁盘存储。

Aggressive VACUUM_s, regardless of what causes them, are _guaranteed to be able to advance the table’s relminmxid. Eventually, as all tables in all databases are scanned and their oldest multixact values are advanced, on-disk storage for older multixacts can be removed.

作为安全设备,对于任何多事务年龄大于 autovacuum_multixact_freeze_max_age 的表,都会进行激进的 vacuum 扫描。此外,如果多事务成员占用的存储空间超过 2GB,则所有表的激进 vacuum 扫描都会更频繁地进行,从那些具有最旧多事务年龄的表开始。即使在标称禁用自动 vacuum 的情况下,也会发生这两种类型的激进扫描。

As a safety device, an aggressive vacuum scan will occur for any table whose multixact-age is greater than autovacuum_multixact_freeze_max_age. Also, if the storage occupied by multixacts members exceeds 2GB, aggressive vacuum scans will occur more often for all tables, starting with those that have the oldest multixact-age. Both of these kinds of aggressive scans will occur even if autovacuum is nominally disabled.

类似于 XID 情况,如果自动清理未能从表中清除旧的 MXID,则当数据库的最旧 MXID 从环绕点达到四千万个事务时,系统将开始发出警告消息。并且,类似于 XID 情况,如果这些警告被忽略,则当仅剩余不到三百万个 MXID 直至环绕点时,系统会拒绝生成新 MXID。

Similar to the XID case, if autovacuum fails to clear old MXIDs from a table, the system will begin to emit warning messages when the database’s oldest MXIDs reach forty million transactions from the wraparound point. And, just as an the XID case, if these warnings are ignored, the system will refuse to generate new MXIDs once there are fewer than three million left until wraparound.

当 MXID 耗尽时的正常运行可恢复,方式与 XID 耗尽时很类似。按照前一节的相同步骤进行,但有以下不同:

Normal operation when MXIDs are exhausted can be restored in much the same way as when XIDs are exhausted. Follow the same steps in the previous section, but with the following differences:

25.1.6. The Autovacuum Daemon #

PostgreSQL 有一项可选但强烈推荐的功能,称为 autovacuum ,其目的是自动执行 VACUUMANALYZE 命令。启用时,自动 vacuum 会检查已插入、更新或删除大量元组的表。这些检查使用统计信息收集工具,因此,除非将 track_counts 设置为 true ,否则无法使用自动 vacuum。在默认配置中,自动 vacuum 已启用并且相关的配置参数已正确设置。

PostgreSQL has an optional but highly recommended feature called autovacuum, whose purpose is to automate the execution of VACUUM and ANALYZE commands. When enabled, autovacuum checks for tables that have had a large number of inserted, updated or deleted tuples. These checks use the statistics collection facility; therefore, autovacuum cannot be used unless track_counts is set to true. In the default configuration, autovacuuming is enabled and the related configuration parameters are appropriately set.

“自动 vacuum 守护程序”实际上由多个进程组成。有一个持久的守护程序进程,称为 autovacuum launcher ,负责所有数据库的启动 autovacuum worker 进程。启动器将在时间上分配工作,尝试在每个数据库中每 autovacuum_naptime 秒启动一个工作进程。(因此,如果安装有 N 个数据库,每 autovacuum_naptime/N 秒将启动一个新工作进程。)最多允许 autovacuum_max_workers 个工作进程同时运行。如果要处理的数据库多于 autovacuum_max_workers 个,则第一个工作进程完成后,将立即处理下一个数据库。每个工作进程将在其数据库中检查每个表,并根据需要执行 VACUUM 和/或 ANALYZE 。可以将 log_autovacuum_min_duration 设置为监视自动 vacuum 工作进程的活动。

The “autovacuum daemon” actually consists of multiple processes. There is a persistent daemon process, called the autovacuum launcher, which is in charge of starting autovacuum worker processes for all databases. The launcher will distribute the work across time, attempting to start one worker within each database every autovacuum_naptime seconds. (Therefore, if the installation has N databases, a new worker will be launched every autovacuum_naptime/N seconds.) A maximum of autovacuum_max_workers worker processes are allowed to run at the same time. If there are more than autovacuum_max_workers databases to be processed, the next database will be processed as soon as the first worker finishes. Each worker process will check each table within its database and execute VACUUM and/or ANALYZE as needed. log_autovacuum_min_duration can be set to monitor autovacuum workers' activity.

如果几张大表在短时间内都符合真空条件,则所有自动真空工作进程都可能长时间忙于对这些表进行真空。这将导致其他表和数据库在工作进程可用之前不会被真空。单个数据库中的工作进程数量没有限制,但工作进程确实会尝试避免重复其他工作进程已经完成的工作。请注意,正在运行的工作进程的数量不计入 max_connectionssuperuser_reserved_connections 限制。

If several large tables all become eligible for vacuuming in a short amount of time, all autovacuum workers might become occupied with vacuuming those tables for a long period. This would result in other tables and databases not being vacuumed until a worker becomes available. There is no limit on how many workers might be in a single database, but workers do try to avoid repeating work that has already been done by other workers. Note that the number of running workers does not count towards max_connections or superuser_reserved_connections limits.

relfrozenxid 值比 autovacuum_freeze_max_age 事务旧的表总会被 vacuum(这也适用于冻结最大年龄通过存储参数进行修改的表;请参阅下文)。否则,如果自上次 VACUUM 废弃的元组数量超过“vacuum 阈值”,则该表将被 vacuum。vacuum 阈值定义为:

Tables whose relfrozenxid value is more than autovacuum_freeze_max_age transactions old are always vacuumed (this also applies to those tables whose freeze max age has been modified via storage parameters; see below). Otherwise, if the number of tuples obsoleted since the last VACUUM exceeds the “vacuum threshold”, the table is vacuumed. The vacuum threshold is defined as:

vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples

真空基线阈值为 autovacuum_vacuum_threshold,真空尺度因素为 autovacuum_vacuum_scale_factor,并且元组数为 pg_class.reltuples

where the vacuum base threshold is autovacuum_vacuum_threshold, the vacuum scale factor is autovacuum_vacuum_scale_factor, and the number of tuples is pg_class.reltuples.

如果自上次清理以来的插入元组数量超过定义的插入阈值,则也会清理表。插入阈值定义为:

The table is also vacuumed if the number of tuples inserted since the last vacuum has exceeded the defined insert threshold, which is defined as:

vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples

其中真空插入基线阈值为 autovacuum_vacuum_insert_threshold,并且真空插入尺度因素为 autovacuum_vacuum_insert_scale_factor。此类真空允许表的部分标记为 all visible,还允许冻结元组,这可以减少后续真空所需的工作。对于接收 INSERT 操作但几乎不接收 UPDATE/DELETE 操作的表,降低表的 autovacuum_freeze_min_age 可能是有利的,因为这允许更早的真空冻结元组。过时元组的数量和插入元组的数量从累积统计信息系统中获取;它是一个半精确计数,由每个 UPDATEDELETE_和 _INSERT 操作更新。(它只是半精确的,因为某些信息可能会在负载繁重的情况下丢失。)如果表的 relfrozenxid 值超过了 vacuum_freeze_table_age 个事务旧值,则会执行激进的真空以冻结旧元组并推进 relfrozenxid;否则,只会扫描自上次真空后已修改的页面。

where the vacuum insert base threshold is autovacuum_vacuum_insert_threshold, and vacuum insert scale factor is autovacuum_vacuum_insert_scale_factor. Such vacuums may allow portions of the table to be marked as all visible and also allow tuples to be frozen, which can reduce the work required in subsequent vacuums. For tables which receive INSERT operations but no or almost no UPDATE/DELETE operations, it may be beneficial to lower the table’s autovacuum_freeze_min_age as this may allow tuples to be frozen by earlier vacuums. The number of obsolete tuples and the number of inserted tuples are obtained from the cumulative statistics system; it is a semi-accurate count updated by each UPDATE, DELETE and INSERT operation. (It is only semi-accurate because some information might be lost under heavy load.) If the relfrozenxid value of the table is more than vacuum_freeze_table_age transactions old, an aggressive vacuum is performed to freeze old tuples and advance relfrozenxid; otherwise, only pages that have been modified since the last vacuum are scanned.

对于分析,使用类似的条件:阈值(定义为:

For analyze, a similar condition is used: the threshold, defined as:

analyze threshold = analyze base threshold + analyze scale factor * number of tuples

与自上次 ANALYZE 以来插入、更新或删除的总元组数量相比较。

is compared to the total number of tuples inserted, updated, or deleted since the last ANALYZE.

分区表不直接存储元组,因此不受自动清理处理。(自动清理像其他表一样处理表分区。)不幸的是,这意味着自动清理不在分区表上运行 ANALYZE,而这会导致针对分区表统计信息发出查询的计划欠佳。可以通过在分区表首次填充时手动在分区表上运行 ANALYZE 然后在分区的分布数据发生显著变化时再次运行 ANALYZE 来解决此问题。

Partitioned tables do not directly store tuples and consequently are not processed by autovacuum. (Autovacuum does process table partitions just like other tables.) Unfortunately, this means that autovacuum does not run ANALYZE on partitioned tables, and this can cause suboptimal plans for queries that reference partitioned table statistics. You can work around this problem by manually running ANALYZE on partitioned tables when they are first populated, and again whenever the distribution of data in their partitions changes significantly.

自动清理无法访问临时表。因此,应该通过会话 SQL 命令执行适当的清理和分析操作。

Temporary tables cannot be accessed by autovacuum. Therefore, appropriate vacuum and analyze operations should be performed via session SQL commands.

默认阈值和尺度因素取自 postgresql.conf,但可以在单表基础上覆盖它们(以及许多其他自动真空控制参数);有关更多信息,请参见 Storage Parameters。如果已通过表的存储参数更改设置,则在处理该表时使用该值;否则,会使用全局设置。有关全局设置的更多详细信息,请参见 Section 20.10

The default thresholds and scale factors are taken from postgresql.conf, but it is possible to override them (and many other autovacuum control parameters) on a per-table basis; see Storage Parameters for more information. If a setting has been changed via a table’s storage parameters, that value is used when processing that table; otherwise the global settings are used. See Section 20.10 for more details on the global settings.

当多个工作程序正在运行时,自动真空成本延迟参数(参见 Section 20.4.4)在所有正在运行的工作程序之间“平衡”,以便系统上的总 I/O 影响与实际运行的工作程序数量无关。但是,在平衡算法中不考虑处理单表 autovacuum_vacuum_cost_delayautovacuum_vacuum_cost_limit 存储参数已设置的表的任何工作程序。

When multiple workers are running, the autovacuum cost delay parameters (see Section 20.4.4) are “balanced” among all the running workers, so that the total I/O impact on the system is the same regardless of the number of workers actually running. However, any workers processing tables whose per-table autovacuum_vacuum_cost_delay or autovacuum_vacuum_cost_limit storage parameters have been set are not considered in the balancing algorithm.

自动真空工作程序通常不会阻塞其他命令。如果进程尝试获取与自动真空持有的 SHARE UPDATE EXCLUSIVE 锁冲突的锁,则锁获取会中断自动真空。有关冲突锁模式,请参见 Table 13.2。但是,如果自动真空正在运行以防止事务 ID 换行(即,pg_stat_activity 视图中的自动真空查询名称以 (to prevent wraparound) 结尾),则不会自动中断自动真空。

Autovacuum workers generally don’t block other commands. If a process attempts to acquire a lock that conflicts with the SHARE UPDATE EXCLUSIVE lock held by autovacuum, lock acquisition will interrupt the autovacuum. For conflicting lock modes, see Table 13.2. However, if the autovacuum is running to prevent transaction ID wraparound (i.e., the autovacuum query name in the pg_stat_activity view ends with (to prevent wraparound)), the autovacuum is not automatically interrupted.

Warning

定期运行获取与 SHARE UPDATE EXCLUSIVE 锁发生冲突的锁的命令(例如,ANALYZE)可以有效地阻止自动清理完成。

Regularly running commands that acquire locks conflicting with a SHARE UPDATE EXCLUSIVE lock (e.g., ANALYZE) can effectively prevent autovacuums from ever completing.