PostgreSQL Chinese Operation Guide
14.1. Using EXPLAIN
PostgreSQL devises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is absolutely critical for good performance, so the system includes a complex planner that tries to choose good plans. You can use the EXPLAIN command to see what query plan the planner creates for any query. Plan-reading is an art that requires some experience to master, but this section attempts to cover the basics.
Examples in this section are drawn from the regression test database after doing a VACUUM ANALYZE, using 9.3 development sources. You should be able to get similar results if you try the examples yourself, but your estimated costs and row counts might vary slightly because ANALYZE's statistics are random samples rather than exact, and because costs are inherently somewhat platform-dependent.
The examples use EXPLAIN's default “text” output format, which is compact and convenient for humans to read. If you want to feed EXPLAIN's output to a program for further analysis, you should use one of its machine-readable output formats (XML, JSON, or YAML) instead.
14.1.1. EXPLAIN Basics
The structure of a query plan is a tree of plan nodes. Nodes at the bottom level of the tree are scan nodes: they return raw rows from a table. There are different types of scan nodes for different table access methods: sequential scans, index scans, and bitmap index scans. There are also non-table row sources, such as VALUES clauses and set-returning functions in FROM, which have their own scan node types. If the query requires joining, aggregation, sorting, or other operations on the raw rows, then there will be additional nodes above the scan nodes to perform these operations. Again, there is usually more than one possible way to do these operations, so different node types can appear here too. The output of EXPLAIN has one line for each node in the plan tree, showing the basic node type plus the cost estimates that the planner made for the execution of that plan node. Additional lines might appear, indented from the node’s summary line, to show additional properties of the node. The very first line (the summary line for the topmost node) has the estimated total execution cost for the plan; it is this number that the planner seeks to minimize.
Here is a trivial example, just to show what the output looks like:
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
Since this query has no WHERE clause, it must scan all the rows of the table, so the planner has chosen to use a simple sequential scan plan. The numbers that are quoted in parentheses are (left to right):
The costs are measured in arbitrary units determined by the planner’s cost parameters (see Section 20.7.2). Traditional practice is to measure the costs in units of disk page fetches; that is, seq_page_cost is conventionally set to 1.0 and the other cost parameters are set relative to that. The examples in this section are run with the default cost parameters.
It’s important to understand that the cost of an upper-level node includes the cost of all its child nodes. It’s also important to realize that the cost only reflects things that the planner cares about. In particular, the cost does not consider the time spent transmitting result rows to the client, which could be an important factor in the real elapsed time; but the planner ignores it because it cannot change it by altering the plan. (Every correct plan will output the same row set, we trust.)
The rows value is a little tricky because it is not the number of rows processed or scanned by the plan node, but rather the number emitted by the node. This is often less than the number scanned, as a result of filtering by any WHERE-clause conditions that are being applied at the node. Ideally the top-level rows estimate will approximate the number of rows actually returned, updated, or deleted by the query.
Returning to our example:
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
These numbers are derived very straightforwardly. If you do:
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
you will find that tenk1 has 358 disk pages and 10000 rows. The estimated cost is computed as (disk pages read * seq_page_cost) + (rows scanned * cpu_tuple_cost). By default, seq_page_cost is 1.0 and cpu_tuple_cost is 0.01, so the estimated cost is (358 * 1.0) + (10000 * 0.01) = 458.
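The arithmetic described above can be checked directly. A minimal sketch, assuming the default cost parameters named in the text and the page/row counts reported by pg_class:

```python
# Reproduce the planner's sequential-scan cost estimate for tenk1.
seq_page_cost = 1.0     # default: cost of fetching one page sequentially
cpu_tuple_cost = 0.01   # default: CPU cost of processing one row

relpages = 358     # disk pages in tenk1 (from pg_class)
reltuples = 10000  # rows in tenk1 (from pg_class)

estimated_cost = relpages * seq_page_cost + reltuples * cpu_tuple_cost
print(estimated_cost)  # 458.0, matching cost=0.00..458.00 in the plan
```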
Now let’s modify the query to add a WHERE condition:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 7000;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..483.00 rows=7001 width=244)
Filter: (unique1 < 7000)
Notice that the EXPLAIN output shows the WHERE clause being applied as a “filter” condition attached to the Seq Scan plan node. This means that the plan node checks the condition for each row it scans, and outputs only the ones that pass the condition. The estimate of output rows has been reduced because of the WHERE clause. However, the scan will still have to visit all 10000 rows, so the cost hasn’t decreased; in fact it has gone up a bit (by 10000 * cpu_operator_cost, to be exact) to reflect the extra CPU time spent checking the WHERE condition.
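The 25-unit increase can be verified with the same arithmetic. A quick sketch; the cpu_operator_cost value of 0.0025 is the documented default, not something shown in the plan itself:

```python
# Cost of the filtered sequential scan: the base scan cost from the
# previous example, plus one operator evaluation per scanned row for
# the WHERE comparison.
seq_page_cost = 1.0
cpu_tuple_cost = 0.01
cpu_operator_cost = 0.0025  # default setting

relpages, reltuples = 358, 10000
base_cost = relpages * seq_page_cost + reltuples * cpu_tuple_cost  # 458.0
filtered_cost = base_cost + reltuples * cpu_operator_cost
print(filtered_cost)  # 483.0, matching cost=0.00..483.00 in the plan
```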
The actual number of rows this query would select is 7000, but the rows estimate is only approximate. If you try to duplicate this experiment, you will probably get a slightly different estimate; moreover, it can change after each ANALYZE command, because the statistics produced by ANALYZE are taken from a randomized sample of the table.
Now, let’s make the condition more restrictive:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100;
QUERY PLAN
------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=5.07..229.20 rows=101 width=244)
Recheck Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
Here the planner has decided to use a two-step plan: the child plan node visits an index to find the locations of rows matching the index condition, and then the upper plan node actually fetches those rows from the table itself. Fetching rows separately is much more expensive than reading them sequentially, but because not all the pages of the table have to be visited, this is still cheaper than a sequential scan. (The reason for using two plan levels is that the upper plan node sorts the row locations identified by the index into physical order before reading them, to minimize the cost of separate fetches. The “bitmap” mentioned in the node names is the mechanism that does the sorting.)
Now let’s add another condition to the WHERE clause:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND stringu1 = 'xxx';
QUERY PLAN
------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=5.04..229.43 rows=1 width=244)
Recheck Cond: (unique1 < 100)
Filter: (stringu1 = 'xxx'::name)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
The added condition stringu1 = 'xxx' reduces the output row count estimate, but not the cost because we still have to visit the same set of rows. Notice that the stringu1 clause cannot be applied as an index condition, since this index is only on the unique1 column. Instead it is applied as a filter on the rows retrieved by the index. Thus the cost has actually gone up slightly to reflect this extra checking.
In some cases the planner will prefer a “simple” index scan plan:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 = 42;
QUERY PLAN
-----------------------------------------------------------------------------
Index Scan using tenk1_unique1 on tenk1 (cost=0.29..8.30 rows=1 width=244)
Index Cond: (unique1 = 42)
In this type of plan the table rows are fetched in index order, which makes them even more expensive to read, but there are so few that the extra cost of sorting the row locations is not worth it. You’ll most often see this plan type for queries that fetch just a single row. It’s also often used for queries that have an ORDER BY condition that matches the index order, because then no extra sorting step is needed to satisfy the ORDER BY. In this example, adding ORDER BY unique1 would use the same plan because the index already implicitly provides the requested ordering.
The planner may implement an ORDER BY clause in several ways. The above example shows that such an ordering clause may be implemented implicitly. The planner may also add an explicit sort step:
EXPLAIN SELECT * FROM tenk1 ORDER BY unique1;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=1109.39..1134.39 rows=10000 width=244)
Sort Key: unique1
-> Seq Scan on tenk1 (cost=0.00..445.00 rows=10000 width=244)
If a part of the plan guarantees an ordering on a prefix of the required sort keys, then the planner may instead decide to use an incremental sort step:
EXPLAIN SELECT * FROM tenk1 ORDER BY four, ten LIMIT 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Limit (cost=521.06..538.05 rows=100 width=244)
-> Incremental Sort (cost=521.06..2220.95 rows=10000 width=244)
Sort Key: four, ten
Presorted Key: four
-> Index Scan using index_tenk1_on_four on tenk1 (cost=0.29..1510.08 rows=10000 width=244)
Compared to regular sorts, sorting incrementally allows returning tuples before the entire result set has been sorted, which particularly enables optimizations with LIMIT queries. It may also reduce memory usage and the likelihood of spilling sorts to disk, but it comes at the cost of the increased overhead of splitting the result set into multiple sorting batches.
If there are separate indexes on several of the columns referenced in WHERE, the planner might choose to use an AND or OR combination of the indexes:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=25.08..60.21 rows=10 width=244)
Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
-> BitmapAnd (cost=25.08..25.08 rows=10 width=0)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0)
Index Cond: (unique2 > 9000)
But this requires visiting both indexes, so it’s not necessarily a win compared to using just one index and treating the other condition as a filter. If you vary the ranges involved you’ll see the plan change accordingly.
Here is an example showing the effects of LIMIT:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.29..14.48 rows=2 width=244)
-> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..71.27 rows=10 width=244)
Index Cond: (unique2 > 9000)
Filter: (unique1 < 100)
This is the same query as above, but we added a LIMIT so that not all the rows need be retrieved, and the planner changed its mind about what to do. Notice that the total cost and row count of the Index Scan node are shown as if it were run to completion. However, the Limit node is expected to stop after retrieving only a fifth of those rows, so its total cost is only a fifth as much, and that’s the actual estimated cost of the query. This plan is preferred over adding a Limit node to the previous plan because the Limit could not avoid paying the startup cost of the bitmap scan, so the total cost would be something over 25 units with that approach.
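The "a fifth" scaling can be sketched numerically. One plausible reading, consistent with the displayed numbers, is that the Limit node charges the child's startup cost plus the fetched fraction of the child's run cost; the inputs are the rounded costs shown in the plan, so the result is only approximate:

```python
# Estimate the Limit node's total cost from the child index scan's costs.
startup_cost = 0.29  # index scan startup, as displayed
total_cost = 71.27   # index scan total, as displayed
est_rows = 10        # rows the index scan would return if run to completion
limit_rows = 2       # rows the Limit node actually needs

limit_cost = startup_cost + (limit_rows / est_rows) * (total_cost - startup_cost)
# about 14.49, close to the displayed 14.48 (the displayed inputs are rounded)
print(round(limit_cost, 2))
```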
Let’s try joining two tables, using the columns we have been discussing:
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
--------------------------------------------------------------------------------------
Nested Loop (cost=4.65..118.62 rows=10 width=488)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0)
Index Cond: (unique1 < 10)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244)
Index Cond: (unique2 = t1.unique2)
In this plan, we have a nested-loop join node with two table scans as inputs, or children. The indentation of the node summary lines reflects the plan tree structure. The join's first, or “outer”, child is a bitmap scan similar to those we saw before. Its cost and row count are the same as we'd get from SELECT … WHERE unique1 < 10 because we are applying the WHERE clause unique1 < 10 at that node. The t1.unique2 = t2.unique2 clause is not relevant yet, so it doesn't affect the row count of the outer scan. The nested-loop join node will run its second, or “inner”, child once for each row obtained from the outer child. Column values from the current outer row can be plugged into the inner scan; here, the t1.unique2 value from the outer row is available, so we get a plan and costs similar to what we saw above for a simple SELECT … WHERE t2.unique2 = constant case. (The estimated cost is actually a bit lower than what was seen above, as a result of caching that's expected to occur during the repeated index scans on t2.) The costs of the loop node are then set on the basis of the cost of the outer scan, plus one repetition of the inner scan for each outer row (10 * 7.91, here), plus a little CPU time for join processing.
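The cost bookkeeping just described can be sketched with the displayed (rounded) numbers; the small remainder is the join-processing CPU charge, which is not computed here:

```python
# Nested-loop cost: outer scan cost, plus one inner-scan repetition per
# outer row, plus a small CPU charge for join processing.
outer_startup, outer_total = 4.36, 39.47  # bitmap heap scan on tenk1
inner_startup, inner_total = 0.29, 7.91   # index scan on tenk2, per repetition
outer_rows = 10

startup = outer_startup + inner_startup             # 4.65, as displayed
run_estimate = outer_total + outer_rows * inner_total
# 4.65 and 118.57; the displayed total of 118.62 adds a little join CPU
print(round(startup, 2), round(run_estimate, 2))
```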
In this example the join’s output row count is the same as the product of the two scans' row counts, but that’s not true in all cases because there can be additional WHERE clauses that mention both tables and so can only be applied at the join point, not to either input scan. Here’s an example:
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t2.unique2 < 10 AND t1.hundred < t2.hundred;
QUERY PLAN
---------------------------------------------------------------------------------------------
Nested Loop (cost=4.65..49.46 rows=33 width=488)
Join Filter: (t1.hundred < t2.hundred)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0)
Index Cond: (unique1 < 10)
-> Materialize (cost=0.29..8.51 rows=10 width=244)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..8.46 rows=10 width=244)
Index Cond: (unique2 < 10)
The condition t1.hundred < t2.hundred can’t be tested in the tenk2_unique2 index, so it’s applied at the join node. This reduces the estimated output row count of the join node, but does not change either input scan.
Notice that here the planner has chosen to “materialize” the inner relation of the join, by putting a Materialize plan node atop it. This means that the t2 index scan will be done just once, even though the nested-loop join node needs to read that data ten times, once for each row from the outer relation. The Materialize node saves the data in memory as it’s read, and then returns the data from memory on each subsequent pass.
When dealing with outer joins, you might see join plan nodes with both “Join Filter” and plain “Filter” conditions attached. Join Filter conditions come from the outer join’s ON clause, so a row that fails the Join Filter condition could still get emitted as a null-extended row. But a plain Filter condition is applied after the outer-join rules and so acts to remove rows unconditionally. In an inner join there is no semantic difference between these types of filters.
If we change the query’s selectivity a bit, we might get a very different join plan:
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
Hash Join (cost=230.47..713.98 rows=101 width=488)
Hash Cond: (t2.unique2 = t1.unique2)
-> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244)
-> Hash (cost=229.20..229.20 rows=101 width=244)
-> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244)
Recheck Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
Here, the planner has chosen to use a hash join, in which rows of one table are entered into an in-memory hash table, after which the other table is scanned and the hash table is probed for matches to each row. Again note how the indentation reflects the plan structure: the bitmap scan on tenk1 is the input to the Hash node, which constructs the hash table. That’s then returned to the Hash Join node, which reads rows from its outer child plan and searches the hash table for each one.
Another possible type of join is a merge join, illustrated here:
EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
Merge Join (cost=198.11..268.19 rows=10 width=488)
Merge Cond: (t1.unique2 = t2.unique2)
-> Index Scan using tenk1_unique2 on tenk1 t1 (cost=0.29..656.28 rows=101 width=244)
Filter: (unique1 < 100)
-> Sort (cost=197.83..200.33 rows=1000 width=244)
Sort Key: t2.unique2
-> Seq Scan on onek t2 (cost=0.00..148.00 rows=1000 width=244)
Merge join requires its input data to be sorted on the join keys. In this plan the tenk1 data is sorted by using an index scan to visit the rows in the correct order, but a sequential scan and sort is preferred for onek, because there are many more rows to be visited in that table. (Sequential-scan-and-sort frequently beats an index scan for sorting many rows, because of the nonsequential disk access required by the index scan.)
One way to look at variant plans is to force the planner to disregard whatever strategy it thought was the cheapest, using the enable/disable flags described in Section 20.7.1. (This is a crude tool, but useful. See also Section 14.3.) For example, if we’re unconvinced that sequential-scan-and-sort is the best way to deal with table onek in the previous example, we could try
SET enable_sort = off;
EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
Merge Join (cost=0.56..292.65 rows=10 width=488)
Merge Cond: (t1.unique2 = t2.unique2)
-> Index Scan using tenk1_unique2 on tenk1 t1 (cost=0.29..656.28 rows=101 width=244)
Filter: (unique1 < 100)
-> Index Scan using onek_unique2 on onek t2 (cost=0.28..224.79 rows=1000 width=244)
which shows that the planner thinks that sorting onek by index-scanning is about 12% more expensive than sequential-scan-and-sort. Of course, the next question is whether it’s right about that. We can investigate that using EXPLAIN ANALYZE, as discussed below.
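The roughly-12% figure comes from comparing the two ways of producing onek in sorted order, using the total costs displayed in the two plans above:

```python
# Cost of sorting onek by index scan vs. sequential scan plus explicit sort.
index_scan_cost = 224.79  # Index Scan using onek_unique2, total cost
sort_path_cost = 200.33   # Sort atop Seq Scan on onek, total cost

increase = (index_scan_cost - sort_path_cost) / sort_path_cost
print(round(increase * 100, 1))  # about 12.2 (percent)
```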
14.1.2. EXPLAIN ANALYZE
It is possible to check the accuracy of the planner’s estimates by using EXPLAIN's ANALYZE option. With this option, EXPLAIN actually executes the query, and then displays the true row counts and true run time accumulated within each plan node, along with the same estimates that a plain EXPLAIN shows. For example, we might get a result like this:
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1)
Index Cond: (unique1 < 10)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
Planning time: 0.181 ms
Execution time: 0.501 ms
Note that the “actual time” values are in milliseconds of real time, whereas the cost estimates are expressed in arbitrary units; so they are unlikely to match up. The thing that’s usually most important to look for is whether the estimated row counts are reasonably close to reality. In this example the estimates were all dead-on, but that’s quite unusual in practice.
In some query plans, it is possible for a subplan node to be executed more than once. For example, the inner index scan will be executed once per outer row in the above nested-loop plan. In such cases, the loops value reports the total number of executions of the node, and the actual time and rows values shown are averages per-execution. This is done to make the numbers comparable with the way that the cost estimates are shown. Multiply by the loops value to get the total time actually spent in the node. In the above example, we spent a total of 0.220 milliseconds executing the index scans on tenk2.
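Undoing the per-execution averaging is a single multiplication; a minimal sketch using the numbers from the plan above:

```python
# "actual time" is reported per execution; multiply by loops for the total.
per_loop_total_ms = 0.022  # upper "actual time" bound for the inner index scan
loops = 10                 # the scan ran once per outer row

total_ms = per_loop_total_ms * loops
print(round(total_ms, 3))  # 0.22 milliseconds in total, as stated in the text
```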
In some cases EXPLAIN ANALYZE shows additional execution statistics beyond the plan node execution times and row counts. For example, Sort and Hash nodes provide extra information:
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=717.34..717.59 rows=101 width=488) (actual time=7.761..7.774 rows=100 loops=1)
Sort Key: t1.fivethous
Sort Method: quicksort Memory: 77kB
-> Hash Join (cost=230.47..713.98 rows=101 width=488) (actual time=0.711..7.427 rows=100 loops=1)
Hash Cond: (t2.unique2 = t1.unique2)
-> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.007..2.583 rows=10000 loops=1)
-> Hash (cost=229.20..229.20 rows=101 width=244) (actual time=0.659..0.659 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 28kB
-> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244) (actual time=0.080..0.526 rows=100 loops=1)
Recheck Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1)
Index Cond: (unique1 < 100)
Planning time: 0.194 ms
Execution time: 8.008 ms
The Sort node shows the sort method used (in particular, whether the sort was in-memory or on-disk) and the amount of memory or disk space needed. The Hash node shows the number of hash buckets and batches as well as the peak amount of memory used for the hash table. (If the number of batches exceeds one, there will also be disk space usage involved, but that is not shown.)
Another type of extra information is the number of rows removed by a filter condition:
EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE ten < 7;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..483.00 rows=7000 width=244) (actual time=0.016..5.107 rows=7000 loops=1)
Filter: (ten < 7)
Rows Removed by Filter: 3000
Planning time: 0.083 ms
Execution time: 5.905 ms
These counts can be particularly valuable for filter conditions applied at join nodes. The “Rows Removed” line only appears when at least one scanned row, or potential join pair in the case of a join node, is rejected by the filter condition.
A case similar to filter conditions occurs with “lossy” index scans. For example, consider this search for polygons containing a specific point:
EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
QUERY PLAN
------------------------------------------------------------------------------------------------------
Seq Scan on polygon_tbl (cost=0.00..1.05 rows=1 width=32) (actual time=0.044..0.044 rows=0 loops=1)
Filter: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Filter: 4
Planning time: 0.040 ms
Execution time: 0.083 ms
The planner thinks (quite correctly) that this sample table is too small to bother with an index scan, so we have a plain sequential scan in which all the rows got rejected by the filter condition. But if we force an index scan to be used, we see:
SET enable_seqscan TO off;
EXPLAIN ANALYZE SELECT * FROM polygon_tbl WHERE f1 @> polygon '(0.5,2.0)';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Index Scan using gpolygonind on polygon_tbl (cost=0.13..8.15 rows=1 width=32) (actual time=0.062..0.062 rows=0 loops=1)
Index Cond: (f1 @> '((0.5,2))'::polygon)
Rows Removed by Index Recheck: 1
Planning time: 0.034 ms
Execution time: 0.144 ms
Here we can see that the index returned one candidate row, which was then rejected by a recheck of the index condition. This happens because a GiST index is “lossy” for polygon containment tests: it actually returns the rows with polygons that overlap the target, and then we have to do the exact containment test on those rows.
EXPLAIN has a BUFFERS option that can be used with ANALYZE to get even more run time statistics:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=25.08..60.21 rows=10 width=244) (actual time=0.323..0.342 rows=10 loops=1)
Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
Buffers: shared hit=15
-> BitmapAnd (cost=25.08..25.08 rows=10 width=0) (actual time=0.309..0.309 rows=0 loops=1)
Buffers: shared hit=7
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.043..0.043 rows=100 loops=1)
Index Cond: (unique1 < 100)
Buffers: shared hit=2
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.227..0.227 rows=999 loops=1)
Index Cond: (unique2 > 9000)
Buffers: shared hit=5
Planning time: 0.088 ms
Execution time: 0.423 ms
The numbers provided by BUFFERS help to identify which parts of the query are the most I/O-intensive.
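例如(这是一个示意性的组合,输出略),BUFFERS 可以与前面提到的机器可读输出格式结合使用,便于程序化分析:
For example (an illustrative combination; output omitted), BUFFERS can be combined with one of the machine-readable output formats mentioned earlier, which is convenient for programmatic analysis:
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
    SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;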
请记住,由于 EXPLAIN ANALYZE 实际上运行了该查询,因此即使放弃任何查询可能输出的结果以打印 EXPLAIN 数据,任何副作用仍将像往常一样发生。如果您想分析数据修改查询而不更改您的表,可以在之后回滚该命令,例如:
Keep in mind that because EXPLAIN ANALYZE actually runs the query, any side-effects will happen as usual, even though whatever results the query might output are discarded in favor of printing the EXPLAIN data. If you want to analyze a data-modifying query without changing your tables, you can roll the command back afterwards, for example:
BEGIN;
EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Update on tenk1  (cost=5.08..230.08 rows=0 width=0) (actual time=3.791..3.792 rows=0 loops=1)
   ->  Bitmap Heap Scan on tenk1  (cost=5.08..230.08 rows=102 width=10) (actual time=0.069..0.513 rows=100 loops=1)
         Recheck Cond: (unique1 < 100)
         Heap Blocks: exact=90
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.05 rows=102 width=0) (actual time=0.036..0.037 rows=300 loops=1)
               Index Cond: (unique1 < 100)
 Planning Time: 0.113 ms
 Execution Time: 3.850 ms
ROLLBACK;
如该示例所示,当查询是 INSERT、UPDATE、DELETE 或 MERGE 命令时,应用表更改的实际工作是由顶层的 Insert、Update、Delete 或 Merge 计划节点完成的。该节点下方的计划节点负责查找旧行和/或计算新数据。因此在上面的例子中,我们看到了与之前相同的位图表扫描,其输出被送入一个存储已更新行的 Update 节点。值得注意的是,尽管数据修改节点可能占用相当多的运行时间(在这里,它消耗了大部分时间),但计划程序目前不会在成本估算中为这项工作增加任何成本。这是因为对每个正确的查询计划来说,要完成的这项工作都是相同的,所以它不会影响计划决策。
As seen in this example, when the query is an INSERT, UPDATE, DELETE, or MERGE command, the actual work of applying the table changes is done by a top-level Insert, Update, Delete, or Merge plan node. The plan nodes underneath this node perform the work of locating the old rows and/or computing the new data. So above, we see the same sort of bitmap table scan we’ve seen already, and its output is fed to an Update node that stores the updated rows. It’s worth noting that although the data-modifying node can take a considerable amount of run time (here, it’s consuming the lion’s share of the time), the planner does not currently add anything to the cost estimates to account for that work. That’s because the work to be done is the same for every correct query plan, so it doesn’t affect planning decisions.
当 UPDATE、DELETE 或 MERGE 命令影响继承层次时,输出可能如下所示:
When an UPDATE, DELETE, or MERGE command affects an inheritance hierarchy, the output might look like this:
EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101;
QUERY PLAN
------------------------------------------------------------------------------------------------------
 Update on parent  (cost=0.00..24.59 rows=0 width=0)
   Update on parent parent_1
   Update on child1 parent_2
   Update on child2 parent_3
   Update on child3 parent_4
   ->  Result  (cost=0.00..24.59 rows=4 width=14)
         ->  Append  (cost=0.00..24.54 rows=4 width=14)
               ->  Seq Scan on parent parent_1  (cost=0.00..0.00 rows=1 width=14)
                     Filter: (f1 = 101)
               ->  Index Scan using child1_pkey on child1 parent_2  (cost=0.15..8.17 rows=1 width=14)
                     Index Cond: (f1 = 101)
               ->  Index Scan using child2_pkey on child2 parent_3  (cost=0.15..8.17 rows=1 width=14)
                     Index Cond: (f1 = 101)
               ->  Index Scan using child3_pkey on child3 parent_4  (cost=0.15..8.17 rows=1 width=14)
                     Index Cond: (f1 = 101)
在此示例中,Update 节点除了最初提到的父表之外,还需要考虑三个子表。因此有四个输入扫描子计划,每个表一个。为清楚起见,Update 节点被加上了注释,以显示将被更新的具体目标表,其顺序与相应的子计划相同。
In this example the Update node needs to consider three child tables as well as the originally-mentioned parent table. So there are four input scanning subplans, one per table. For clarity, the Update node is annotated to show the specific target tables that will be updated, in the same order as the corresponding subplans.
EXPLAIN ANALYZE 所显示的 Planning time 是从解析后的查询生成查询计划并对其进行优化所花费的时间。它不包括解析或重写的时间。
The Planning time shown by EXPLAIN ANALYZE is the time it took to generate the query plan from the parsed query and optimize it. It does not include parsing or rewriting.
EXPLAIN ANALYZE 所显示的 Execution time 包括执行器的启动和关闭时间,以及运行任何被触发的触发器的时间,但不包括解析、重写或规划的时间。执行 BEFORE 触发器(如果有的话)所花费的时间包含在相关 Insert、Update 或 Delete 节点的时间中;但执行 AFTER 触发器所花费的时间不会计入其中,因为 AFTER 触发器是在整个计划完成之后才触发的。每个触发器(无论 BEFORE 还是 AFTER)所花费的总时间也会单独显示。请注意,延迟约束触发器在事务结束之前不会执行,因此 EXPLAIN ANALYZE 根本不会考虑它们。
The Execution time shown by EXPLAIN ANALYZE includes executor start-up and shut-down time, as well as the time to run any triggers that are fired, but it does not include parsing, rewriting, or planning time. Time spent executing BEFORE triggers, if any, is included in the time for the related Insert, Update, or Delete node; but time spent executing AFTER triggers is not counted there because AFTER triggers are fired after completion of the whole plan. The total time spent in each trigger (either BEFORE or AFTER) is also shown separately. Note that deferred constraint triggers will not be executed until end of transaction and are thus not considered at all by EXPLAIN ANALYZE.
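作为一个示意性草图(log_hundred 和 before_update 是此处假设的名称;输出略),可以在一个事务中创建一个 BEFORE 触发器,观察其时间被计入 Update 节点,然后回滚,使表保持不变:
As an illustrative sketch (log_hundred and before_update are hypothetical names used here; output omitted), you can create a BEFORE trigger inside a transaction, observe its time being folded into the Update node, and then roll back so the tables are left unchanged:
BEGIN;
CREATE FUNCTION log_hundred() RETURNS trigger AS $$
BEGIN
  RETURN NEW;  -- a trivial trigger body; a real trigger would do more work
END $$ LANGUAGE plpgsql;
CREATE TRIGGER before_update BEFORE UPDATE ON tenk1
  FOR EACH ROW EXECUTE FUNCTION log_hundred();
EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100;
ROLLBACK;
An AFTER trigger, by contrast, would be reported separately in the output on a line of the form Trigger <name>: time=... calls=....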
14.1.3. Caveats #
通过 EXPLAIN ANALYZE 测量的运行时间可能在两个重要方面偏离同一查询的正常执行。首先,由于不会向客户端传递任何输出行,网络传输成本和 I/O 转换成本没有被包括在内。其次,EXPLAIN ANALYZE 增加的测量开销可能相当可观,尤其是在 gettimeofday() 操作系统调用较慢的机器上。您可以使用 pg_test_timing 工具来测量您系统上计时的开销。
There are two significant ways in which run times measured by EXPLAIN ANALYZE can deviate from normal execution of the same query. First, since no output rows are delivered to the client, network transmission costs and I/O conversion costs are not included. Second, the measurement overhead added by EXPLAIN ANALYZE can be significant, especially on machines with slow gettimeofday() operating-system calls. You can use the pg_test_timing tool to measure the overhead of timing on your system.
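例如,假设 PostgreSQL 的二进制程序位于您的 PATH 中,可以这样运行一次简短的计时测量(输出略):
For example, assuming the PostgreSQL binaries are on your PATH, a short timing measurement could be run as (output omitted):
pg_test_timing --duration=3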
EXPLAIN 的结果不应被外推到与您实际测试的情况差别很大的场景;例如,不能假设在玩具大小的表上得到的结果也适用于大表。计划程序的成本估算不是线性的,因此它可能会为更大或更小的表选择不同的计划。一个极端的例子是,对于只占一个磁盘页的表,无论有没有索引,您几乎总会得到顺序扫描计划。计划程序意识到,无论如何都需要读取一个磁盘页来处理该表,因此花费额外的页读取去查看索引毫无价值。(我们在上面的 polygon_tbl 示例中已经看到了这种情况。)
EXPLAIN results should not be extrapolated to situations much different from the one you are actually testing; for example, results on a toy-sized table cannot be assumed to apply to large tables. The planner’s cost estimates are not linear and so it might choose a different plan for a larger or smaller table. An extreme example is that on a table that only occupies one disk page, you’ll nearly always get a sequential scan plan whether indexes are available or not. The planner realizes that it’s going to take one disk page read to process the table in any case, so there’s no value in expending additional page reads to look at an index. (We saw this happening in the polygon_tbl example above.)
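作为一个示意性草图(tiny 是此处假设的表名),可以自己验证这一点:对于一个只占一页的表,即使存在索引,通常也会看到顺序扫描:
As an illustrative sketch (tiny is a hypothetical table name used here), you can verify this yourself: for a table occupying a single page, you will typically see a sequential scan even though an index exists:
CREATE TEMP TABLE tiny (id int PRIMARY KEY, val text);
INSERT INTO tiny SELECT g, 'row ' || g FROM generate_series(1, 10) g;
ANALYZE tiny;
EXPLAIN SELECT * FROM tiny WHERE id = 5;  -- typically a Seq Scan on tiny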
在某些情况下,实际值和估计值不会很好地匹配,但实际上并没有什么问题。其中一种情况发生在计划节点的执行被 LIMIT 或类似效果提前中止时。例如,在我们之前使用过的 LIMIT 查询中,
There are cases in which the actual and estimated values won’t match up well, but nothing is really wrong. One such case occurs when plan node execution is stopped short by a LIMIT or similar effect. For example, in the LIMIT query we used before,
EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..14.71 rows=2 width=244) (actual time=0.177..0.249 rows=2 loops=1)
   ->  Index Scan using tenk1_unique2 on tenk1  (cost=0.29..72.42 rows=10 width=244) (actual time=0.174..0.244 rows=2 loops=1)
         Index Cond: (unique2 > 9000)
         Filter: (unique1 < 100)
         Rows Removed by Filter: 287
 Planning time: 0.096 ms
 Execution time: 0.336 ms
索引扫描节点的估计成本和行计数是按照其运行至完成来显示的。但实际上,Limit 节点在获得两行之后就停止请求行了,因此实际行计数只有 2,运行时间也比成本估算所暗示的要短。这并不是估计错误,只是估计值和真实值在显示方式上的差异。
the estimated cost and row count for the Index Scan node are shown as though it were run to completion. But in reality the Limit node stopped requesting rows after it got two, so the actual row count is only 2 and the run time is less than the cost estimate would suggest. This is not an estimation error, only a discrepancy in the way the estimates and true values are displayed.
合并连接也有一些可能让粗心者困惑的测量伪象。如果合并连接已经耗尽了一个输入,并且另一个输入中的下一个键值大于已耗尽输入的最后一个键值,它就会停止读取另一个输入;在这种情况下不可能再有匹配,因此无需扫描第一个输入的其余部分。这会导致没有读完某个子节点的全部内容,其结果与前面针对 LIMIT 提到的情况类似。另外,如果外层(第一个)子节点包含键值重复的行,内层(第二个)子节点会被回退并重新扫描,以再次读取与该键值匹配的那部分行。EXPLAIN ANALYZE 会把这些重复发出的相同内层行当作真实的额外行来计数。当外层存在许多重复值时,内层子计划节点报告的实际行计数可能会明显大于内层关系中实际存在的行数。
Merge joins also have measurement artifacts that can confuse the unwary. A merge join will stop reading one input if it’s exhausted the other input and the next key value in the one input is greater than the last key value of the other input; in such a case there can be no more matches and so no need to scan the rest of the first input. This results in not reading all of one child, with results like those mentioned for LIMIT. Also, if the outer (first) child contains rows with duplicate key values, the inner (second) child is backed up and rescanned for the portion of its rows matching that key value. EXPLAIN ANALYZE counts these repeated emissions of the same inner rows as if they were real additional rows. When there are many outer duplicates, the reported actual row count for the inner child plan node can be significantly larger than the number of rows that are actually in the inner relation.
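作为一个示意性草图(输出略),可以通过临时禁用其他连接方法来强制使用合并连接,从而观察这种效果;tenk1 的 hundred 列和 thousand 列都包含大量重复的键值:
As an illustrative sketch (output omitted), you can observe this effect by temporarily disabling the other join methods to force a merge join; the hundred and thousand columns of tenk1 both contain many duplicate key values:
SET enable_hashjoin = off;
SET enable_nestloop = off;
EXPLAIN ANALYZE
    SELECT * FROM tenk1 t1 JOIN tenk1 t2 ON t1.hundred = t2.thousand;
RESET enable_hashjoin;
RESET enable_nestloop;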
由于实现上的限制,BitmapAnd 和 BitmapOr 节点总是将它们的实际行计数报告为零。
BitmapAnd and BitmapOr nodes always report their actual row counts as zero, due to implementation limitations.
通常,EXPLAIN 会显示计划程序创建的每一个计划节点。但是,在某些情况下,执行器可以根据规划时尚不可用的参数值,确定某些节点不必执行,因为它们不可能产生任何行。(目前,这只会发生在扫描分区表的 Append 或 MergeAppend 节点的子节点上。)当发生这种情况时,这些计划节点会从 EXPLAIN 输出中省略,并以一行 Subplans Removed: N 注释取而代之。
Normally, EXPLAIN will display every plan node created by the planner. However, there are cases where the executor can determine that certain nodes need not be executed because they cannot produce any rows, based on parameter values that were not available at planning time. (Currently this can only happen for child nodes of an Append or MergeAppend node that is scanning a partitioned table.) When this happens, those plan nodes are omitted from the EXPLAIN output and a Subplans Removed: N annotation appears instead.
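作为一个示意性草图(parted 及其分区是此处假设的名称;输出略),可以通过强制使用通用计划的预备语句来观察这种运行时剪枝:
As an illustrative sketch (parted and its partitions are hypothetical names used here; output omitted), you can observe this run-time pruning with a prepared statement that is forced to use a generic plan:
CREATE TABLE parted (id int) PARTITION BY LIST (id);
CREATE TABLE parted_1 PARTITION OF parted FOR VALUES IN (1);
CREATE TABLE parted_2 PARTITION OF parted FOR VALUES IN (2);
SET plan_cache_mode = force_generic_plan;
PREPARE q(int) AS SELECT * FROM parted WHERE id = $1;
EXPLAIN (COSTS OFF) EXECUTE q(1);  -- may show "Subplans Removed: 1"
RESET plan_cache_mode;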