Postgresql 中文操作指南

11.1. Introduction #

假设我们有一张类似以下内容的表:

Suppose we have a table similar to this:

CREATE TABLE test1 (
    id integer,
    content varchar
);

应用程序发出很多类似于以下形式的查询:

and the application issues many queries of the form:

SELECT content FROM test1 WHERE id = constant;

如果没有事先准备,系统将不得不逐行扫描整个 test1 表以查找所有匹配的条目。如果 test1 中有很多行,而此类查询只会返回几行(可能是零行或一行),这显然是一种低效的方法。但如果已指导系统对 id 列维护一个索引,则可以使用更有效的方法来查找匹配的行。例如,它可能只需深入搜索树几个层级。

With no advance preparation, the system would have to scan the entire test1 table, row by row, to find all matching entries. If there are many rows in test1 and only a few rows (perhaps zero or one) that would be returned by such a query, this is clearly an inefficient method. But if the system has been instructed to maintain an index on the id column, it can use a more efficient method for locating matching rows. For instance, it might only have to walk a few levels deep into a search tree.

大多数非小说类书籍都采用了类似的方法:读者经常查找的术语和概念被收集在书末的按字母顺序排列的索引中。感兴趣的读者可以相对较快地扫描索引并翻到相应的页面,而不是阅读整本书以找到感兴趣的内容。就像作者的任务是预见读者可能查找的项目一样,数据库程序员的任务是预见哪些索引有用。

A similar approach is used in most non-fiction books: terms and concepts that are frequently looked up by readers are collected in an alphabetic index at the end of the book. The interested reader can scan the index relatively quickly and flip to the appropriate page(s), rather than having to read the entire book to find the material of interest. Just as it is the task of the author to anticipate the items that readers are likely to look up, it is the task of the database programmer to foresee which indexes will be useful.

以下命令可用于创建 id 列上的索引,如上所述:

The following command can be used to create an index on the id column, as discussed:

CREATE INDEX test1_id_index ON test1 (id);

名称 test1_id_index 可以自由选择,但您应该选择一些内容使您能够在以后记住索引的用途。

The name test1_id_index can be chosen freely, but you should pick something that enables you to remember later what the index was for.

要删除索引,请使用 DROP INDEX 命令。可以随时向表添加或从表删除索引。

To remove an index, use the DROP INDEX command. Indexes can be added to and removed from tables at any time.

一旦创建了索引,则无需进一步干预:当修改表时,系统将更新索引,并且在查询中使用索引时,它认为这样做的效率高于顺序表扫描。但是,您可能需要定期运行 ANALYZE 命令来更新统计信息,以便查询计划程序可以做出明智的决策。有关如何确定是否使用索引以及计划程序何时何原因决定 not 使用索引的信息,请参阅 Chapter 14

Once an index is created, no further intervention is required: the system will update the index when the table is modified, and it will use the index in queries when it thinks doing so would be more efficient than a sequential table scan. But you might have to run the ANALYZE command regularly to update statistics to allow the query planner to make educated decisions. See Chapter 14 for information about how to find out whether an index is used and when and why the planner might choose not to use an index.

在具有搜索条件的 UPDATEDELETE 命令中,索引也可以受益。此外,可以在联接搜索中使用索引。因此,在联接条件中以部分形式定义的索引也可以显著加速具有联接的查询。

Indexes can also benefit UPDATE and DELETE commands with search conditions. Indexes can moreover be used in join searches. Thus, an index defined on a column that is part of a join condition can also significantly speed up queries with joins.

通常,PostgreSQL 索引可用于优化包含一个或多个以下形式的 WHEREJOIN 子句的查询

In general, PostgreSQL indexes can be used to optimize queries that contain one or more WHERE or JOIN clauses of the form

indexed-column indexable-operator comparison-value

此处,indexed-column 是索引所定义所在的列或表达式。indexable-operator 是运算符,是索引列的索引 operator class 的成员。(有关该运算符的详细信息如下所示。)而 comparison-value 可以是不稳定的任何表达式,且不引用索引表。

Here, the indexed-column is whatever column or expression the index has been defined on. The indexable-operator is an operator that is a member of the index’s operator class for the indexed column. (More details about that appear below.) And the comparison-value can be any expression that is not volatile and does not reference the index’s table.

在某些情况下,查询计划器可以从此类 SQL 结构中提取可索引子句。一个简单的示例是,如果原始子句为

In some cases the query planner can extract an indexable clause of this form from another SQL construct. A simple example is that if the original clause was

comparison-value operator indexed-column

那么,如果原始 operator 带有 commutator 运算符且它为索引运算符类成员,则可以将其翻转为可索引表单。

then it can be flipped around into indexable form if the original operator has a commutator operator that is a member of the index’s operator class.

在大型表上创建索引可能需要很长时间。默认情况下,PostgreSQL 允许在与索引创建并行的情况下对表进行读写(SELECT 语句),但在索引构建完成之前,写操作(INSERTUPDATEDELETE)被阻塞。在生产环境中,这种情况通常是不可接受的。可以让写入与索引创建并行进行,但需要注意几个注意事项 — 有关详细信息,请参阅 Building Indexes Concurrently

Creating an index on a large table can take a long time. By default, PostgreSQL allows reads (SELECT statements) to occur on the table in parallel with index creation, but writes (INSERT, UPDATE, DELETE) are blocked until the index build is finished. In production environments this is often unacceptable. It is possible to allow writes to occur in parallel with index creation, but there are several caveats to be aware of — for more information see Building Indexes Concurrently.

在创建索引后,系统必须使其与表保持同步。这会给数据处理操作增加开销。索引还可以防止创建 heap-only tuples。因此,在查询中很少或从未使用的索引应被删除。

After an index is created, the system has to keep it synchronized with the table. This adds overhead to data manipulation operations. Indexes can also prevent the creation of heap-only tuples. Therefore indexes that are seldom or never used in queries should be removed.