PostgreSQL Chinese Operation Guide

REINDEX

REINDEX — rebuild indexes

Synopsis

REINDEX [ ( option [, ...] ) ] { INDEX | TABLE | SCHEMA } [ CONCURRENTLY ] name
REINDEX [ ( option [, ...] ) ] { DATABASE | SYSTEM } [ CONCURRENTLY ] [ name ]

where option can be one of:

    CONCURRENTLY [ boolean ]
    TABLESPACE new_tablespace
    VERBOSE [ boolean ]

Description

REINDEX rebuilds an index using the data stored in the index’s table, replacing the old copy of the index. There are several scenarios in which to use REINDEX, such as recovering from a corrupted index (see Notes).

Parameters

  • INDEX

    • Recreate the specified index. This form of REINDEX cannot be executed inside a transaction block when used with a partitioned index.

  • TABLE

    • Recreate all indexes of the specified table. If the table has a secondary “TOAST” table, that is reindexed as well. This form of REINDEX cannot be executed inside a transaction block when used with a partitioned table.

  • SCHEMA

    • Recreate all indexes of the specified schema. If a table of this schema has a secondary “TOAST” table, that is reindexed as well. Indexes on shared system catalogs are also processed. This form of REINDEX cannot be executed inside a transaction block.

  • DATABASE

    • Recreate all indexes within the current database, except system catalogs. Indexes on system catalogs are not processed. This form of REINDEX cannot be executed inside a transaction block.

  • SYSTEM

    • Recreate all indexes on system catalogs within the current database. Indexes on shared system catalogs are included. Indexes on user tables are not processed. This form of REINDEX cannot be executed inside a transaction block.

  • name

    • The name of the specific index, table, or database to be reindexed. Index and table names can be schema-qualified. Presently, REINDEX DATABASE and REINDEX SYSTEM can only reindex the current database. Their parameter is optional, and it must match the current database’s name.

  • CONCURRENTLY

    • When this option is used, PostgreSQL will rebuild the index without taking any locks that prevent concurrent inserts, updates, or deletes on the table; whereas a standard index rebuild locks out writes (but not reads) on the table until it’s done. There are several caveats to be aware of when using this option — see Rebuilding Indexes Concurrently below.

    • For temporary tables, REINDEX is always non-concurrent, as no other session can access them, and non-concurrent reindex is cheaper.

  • TABLESPACE

    • Specifies that indexes will be rebuilt on a new tablespace.

  • VERBOSE

    • Prints a progress report as each index is reindexed.

  • boolean

    • Specifies whether the selected option should be turned on or off. You can write TRUE, ON, or 1 to enable the option, and FALSE, OFF, or 0 to disable it. The boolean value can also be omitted, in which case TRUE is assumed.

  • new_tablespace

    • The tablespace where indexes will be rebuilt.
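As an illustration of the parenthesized option list described above, the statements below are a minimal sketch. The names my_table, my_index, and new_ts are placeholders rather than objects defined in this guide, and new_ts is assumed to be an existing tablespace on a server that accepts the synopsis shown earlier.

```sql
-- Print a progress report for each index rebuilt on my_table (placeholder name).
REINDEX (VERBOSE) TABLE my_table;

-- Combine options; the boolean may be written explicitly or omitted,
-- in which case TRUE is assumed.
REINDEX (VERBOSE TRUE, CONCURRENTLY) INDEX my_index;

-- Rebuild the index in another tablespace; new_ts is assumed to exist already.
REINDEX (TABLESPACE new_ts) INDEX my_index;
```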

Notes

If you suspect corruption of an index on a user table, you can simply rebuild that index, or all indexes on the table, using REINDEX INDEX or REINDEX TABLE.

Things are more difficult if you need to recover from corruption of an index on a system table. In this case it’s important for the system to not have used any of the suspect indexes itself. (Indeed, in this sort of scenario you might find that server processes are crashing immediately at start-up, due to reliance on the corrupted indexes.) To recover safely, the server must be started with the -P option, which prevents it from using indexes for system catalog lookups.

One way to do this is to shut down the server and start a single-user PostgreSQL server with the -P option included on its command line. Then, REINDEX DATABASE, REINDEX SYSTEM, REINDEX TABLE, or REINDEX INDEX can be issued, depending on how much you want to reconstruct. If in doubt, use REINDEX SYSTEM to select reconstruction of all system indexes in the database. Then quit the single-user server session and restart the regular server. See the postgres reference page for more information about how to interact with the single-user server interface.

Alternatively, a regular server session can be started with -P included in its command line options. The method for doing this varies across clients, but in all libpq-based clients, it is possible to set the PGOPTIONS environment variable to -P before starting the client. Note that while this method does not require locking out other clients, it might still be wise to prevent other users from connecting to the damaged database until repairs have been completed.

REINDEX is similar to a drop and recreate of the index in that the index contents are rebuilt from scratch. However, the locking considerations are rather different. REINDEX locks out writes but not reads of the index’s parent table. It also takes an ACCESS EXCLUSIVE lock on the specific index being processed, which will block reads that attempt to use that index. In particular, the query planner tries to take an ACCESS SHARE lock on every index of the table, regardless of the query, and so REINDEX blocks virtually any queries except for some prepared queries whose plan has been cached and which don’t use this very index. In contrast, DROP INDEX momentarily takes an ACCESS EXCLUSIVE lock on the parent table, blocking both writes and reads. The subsequent CREATE INDEX locks out writes but not reads; since the index is not there, no read will attempt to use it, meaning that there will be no blocking but reads might be forced into expensive sequential scans.
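To see the locks described above while a rebuild is running, a second session could query the pg_locks view. This is only a sketch: my_table and my_index are placeholder names, and the exact rows returned depend on what other sessions are doing at that moment.

```sql
-- Run in a second session while REINDEX is in progress.
-- my_table and my_index are placeholder names for the table and its index.
SELECT l.pid,
       c.relname,
       l.mode,
       l.granted
FROM pg_locks AS l
JOIN pg_class AS c ON c.oid = l.relation
WHERE l.locktype = 'relation'
  -- pg_locks covers the whole cluster, so restrict to the current database.
  AND l.database = (SELECT oid FROM pg_database
                    WHERE datname = current_database())
  AND c.relname IN ('my_table', 'my_index')
ORDER BY c.relname, l.pid;
```

Rows with granted = false belong to sessions waiting behind a conflicting lock.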

Reindexing a single index or table requires being the owner of that index or table. Reindexing a schema or database requires being the owner of that schema or database. Note specifically that it’s thus possible for non-superusers to rebuild indexes of tables owned by other users. However, as a special exception, when REINDEX DATABASE, REINDEX SCHEMA or REINDEX SYSTEM is issued by a non-superuser, indexes on shared catalogs will be skipped unless the user owns the catalog (which typically won’t be the case). Of course, superusers can always reindex anything.

Reindexing partitioned indexes or partitioned tables is supported with REINDEX INDEX or REINDEX TABLE, respectively. Each partition of the specified partitioned relation is reindexed in a separate transaction. Those commands cannot be used inside a transaction block when working on a partitioned table or index.
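A hedged sketch of the partitioned case follows. The table name my_partitioned_table is hypothetical, and the pg_partition_tree function is used here only to preview which leaf partitions exist; it is not part of REINDEX itself.

```sql
-- List the leaf partitions of a (hypothetical) partitioned table; each one
-- is reindexed in its own transaction.
SELECT relid, level
FROM pg_partition_tree('my_partitioned_table')
WHERE isleaf;

-- Must be issued outside a transaction block because the table is partitioned.
REINDEX TABLE my_partitioned_table;
```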

When using the TABLESPACE clause with REINDEX on a partitioned index or table, only the tablespace references of the leaf partitions are updated. As partitioned indexes are not updated, it is recommended to separately use ALTER TABLE ONLY on them so that any newly attached partitions inherit the new tablespace. If the command fails, some indexes may not have been moved to the new tablespace. Re-running the command will rebuild all the leaf partitions and move previously unprocessed indexes to the new tablespace.

If SCHEMA, DATABASE or SYSTEM is used with TABLESPACE, system relations are skipped and a single WARNING will be generated. Indexes on TOAST tables are rebuilt, but not moved to the new tablespace.

Rebuilding Indexes Concurrently

Rebuilding an index can interfere with regular operation of a database. Normally PostgreSQL locks the table whose index is rebuilt against writes and performs the entire index build with a single scan of the table. Other transactions can still read the table, but if they try to insert, update, or delete rows in the table they will block until the index rebuild is finished. This could have a severe effect if the system is a live production database. Very large tables can take many hours to be indexed, and even for smaller tables, an index rebuild can lock out writers for periods that are unacceptably long for a production system.

PostgreSQL supports rebuilding indexes with minimum locking of writes. This method is invoked by specifying the CONCURRENTLY option of REINDEX. When this option is used, PostgreSQL must perform two scans of the table for each index that needs to be rebuilt and wait for termination of all existing transactions that could potentially use the index. This method requires more total work than a standard index rebuild and takes significantly longer to complete as it needs to wait for unfinished transactions that might modify the index. However, since it allows normal operations to continue while the index is being rebuilt, this method is useful for rebuilding indexes in a production environment. Of course, the extra CPU, memory and I/O load imposed by the index rebuild may slow down other operations.

A concurrent reindex proceeds in several steps, each run in a separate transaction. If there are multiple indexes to be rebuilt, then each step loops through all the indexes before moving to the next step.

If a problem arises while rebuilding the indexes, such as a uniqueness violation in a unique index, the REINDEX command will fail but leave behind an “invalid” new index in addition to the pre-existing one. This index will be ignored for querying purposes because it might be incomplete; however it will still consume update overhead. The psql \d command will report such an index as INVALID:

postgres=# \d tab
       Table "public.tab"
 Column |  Type   | Modifiers
--------+---------+-----------
 col    | integer |
Indexes:
    "idx" btree (col)
    "idx_ccnew" btree (col) INVALID

If the index marked INVALID is suffixed ccnew, then it corresponds to the transient index created during the concurrent operation, and the recommended recovery method is to drop it using DROP INDEX, then attempt REINDEX CONCURRENTLY again. If the invalid index is instead suffixed ccold, it corresponds to the original index which could not be dropped; the recommended recovery method is to just drop said index, since the rebuild proper has been successful.
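The recovery procedure above could look roughly like the following. The catalog query is one possible way to find invalid indexes, not something this page prescribes. The names idx and idx_ccnew are taken from the \d output shown earlier; idx_ccold is a hypothetical name for the second case.

```sql
-- Find indexes currently marked invalid in this database.
SELECT indexrelid::regclass AS index_name,
       indrelid::regclass   AS table_name
FROM pg_index
WHERE NOT indisvalid;

-- Case 1: a leftover transient index (suffix ccnew).
-- Drop it, then retry the concurrent rebuild of the original index.
DROP INDEX idx_ccnew;
REINDEX INDEX CONCURRENTLY idx;

-- Case 2: a leftover original index (suffix ccold, hypothetical name).
-- The rebuild itself succeeded, so dropping the old index is enough:
-- DROP INDEX idx_ccold;
```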

Regular index builds permit other regular index builds on the same table to occur simultaneously, but only one concurrent index build can occur on a table at a time. In both cases, no other types of schema modification on the table are allowed meanwhile. Another difference is that a regular REINDEX TABLE or REINDEX INDEX command can be performed within a transaction block, but REINDEX CONCURRENTLY cannot.
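The transaction-block difference can be sketched as follows, assuming my_table is an ordinary (non-partitioned) table; the name is a placeholder.

```sql
-- A regular rebuild may run inside a transaction block.
BEGIN;
REINDEX TABLE my_table;
COMMIT;

-- A concurrent rebuild must be issued on its own, outside BEGIN ... COMMIT;
-- inside a transaction block the server rejects it with an error.
REINDEX TABLE CONCURRENTLY my_table;
```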

Like any long-running transaction, REINDEX on a table can affect which tuples can be removed by concurrent VACUUM on any other table.

REINDEX SYSTEM does not support CONCURRENTLY since system catalogs cannot be reindexed concurrently.

Furthermore, indexes for exclusion constraints cannot be reindexed concurrently. If such an index is named directly in this command, an error is raised. If a table or database with exclusion constraint indexes is reindexed concurrently, those indexes will be skipped. (It is possible to reindex such indexes without the CONCURRENTLY option.)
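To check in advance which indexes a concurrent rebuild would skip, one option is to look up exclusion constraints in the pg_constraint catalog. This is a sketch under that assumption; my_exclusion_idx is a hypothetical index name.

```sql
-- List exclusion constraints and the indexes that back them.
SELECT conrelid::regclass AS table_name,
       conname            AS constraint_name,
       conindid::regclass AS index_name
FROM pg_constraint
WHERE contype = 'x';   -- 'x' marks exclusion constraints

-- Such an index can still be rebuilt without CONCURRENTLY
-- (my_exclusion_idx is a hypothetical name).
REINDEX INDEX my_exclusion_idx;
```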

Each backend running REINDEX will report its progress in the pg_stat_progress_create_index view. See Section 28.4.4 for details.
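A minimal way to watch that view from another session might look like this; the column selection is just one reasonable choice, and the view returns no rows when no index build is running.

```sql
-- One row per backend that is currently building or rebuilding an index.
SELECT pid,
       relid::regclass       AS table_name,
       index_relid::regclass AS index_name,
       command,
       phase,
       blocks_done,
       blocks_total,
       tuples_done,
       tuples_total
FROM pg_stat_progress_create_index;
```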

Examples

Rebuild a single index:

REINDEX INDEX my_index;

Rebuild all the indexes on the table my_table:

REINDEX TABLE my_table;

Rebuild all indexes in a particular database, without trusting the system indexes to be valid already:

$ export PGOPTIONS="-P"
$ psql broken_db
...
broken_db=> REINDEX DATABASE broken_db;
broken_db=> \q

Rebuild indexes for a table, without blocking read and write operations on involved relations while reindexing is in progress:

REINDEX TABLE CONCURRENTLY my_broken_table;

Compatibility

There is no REINDEX command in the SQL standard.