Postgresql 中文操作指南
41.7. Rules Versus Triggers #
可以使用触发器完成的很多事情也可以使用 PostgreSQL 规则系统实现。规则无法实现的一些内容是某些类型的约束,尤其是外键。可以放置一条限定规则,如果某一列的值未出现在另一表中,则将一条命令重写为 NOTHING。但之后数据将被静默地丢弃,这是一个不恰当的想法。如果要求对有效值进行检查,并且在无效值的情况下应生成一条错误消息,则必须通过触发器来执行此检查。
Many things that can be done using triggers can also be implemented using the PostgreSQL rule system. One of the things that cannot be implemented by rules are some kinds of constraints, especially foreign keys. It is possible to place a qualified rule that rewrites a command to NOTHING if the value of a column does not appear in another table. But then the data is silently thrown away and that’s not a good idea. If checks for valid values are required, and in the case of an invalid value an error message should be generated, it must be done by a trigger.
在本章中,我们重点关注了使用规则来更新视图。本章中的所有更新规则示例也可以使用 INSTEAD OF 触发器在视图中实现。编写此类触发器通常比编写规则更容易,尤其是在执行更新所需的复杂逻辑时。
In this chapter, we focused on using rules to update views. All of the update rule examples in this chapter can also be implemented using INSTEAD OF triggers on the views. Writing such triggers is often easier than writing rules, particularly if complex logic is required to perform the update.
对于可通过两者实现的事项,哪种方式更好取决于数据库的用法。触发器将针对每行受影响行触发一次。规则将修改查询或生成附加查询。因此,如果多行受一条语句的影响,则发出一个额外命令的规则可能比为每行调用且必须多次重新确定要执行的动作的触发器更快。但是,触发器方法在概念上远比规则方法更简单,并且对于新手来说更容易正确掌握。
For the things that can be implemented by both, which is best depends on the usage of the database. A trigger is fired once for each affected row. A rule modifies the query or generates an additional query. So if many rows are affected in one statement, a rule issuing one extra command is likely to be faster than a trigger that is called for every single row and must re-determine what to do many times. However, the trigger approach is conceptually far simpler than the rule approach, and is easier for novices to get right.
本文展示了一个有关在某种情况下如何进行规则与触发器之间选择的示例。有两个表:
Here we show an example of how the choice of rules versus triggers plays out in one situation. There are two tables:
CREATE TABLE computer (
hostname text, -- indexed
manufacturer text -- indexed
);
CREATE TABLE software (
software text, -- indexed
hostname text -- indexed
);
这两个表都有很多千行,而 hostname 上的索引是唯一的。规则或触发器应实现一条约束,删除引用已删除计算机的 software 中的行。触发器将使用此命令:
Both tables have many thousands of rows and the indexes on hostname are unique. The rule or trigger should implement a constraint that deletes rows from software that reference a deleted computer. The trigger would use this command:
DELETE FROM software WHERE hostname = $1;
由于每从 computer 中删除单独的行都会调用触发器,所以它可以准备并保存此命令的计划,并在参数中传递 hostname 值。规则可以这样编写:
Since the trigger is called for each individual row deleted from computer, it can prepare and save the plan for this command and pass the hostname value in the parameter. The rule would be written as:
CREATE RULE computer_del AS ON DELETE TO computer
DO DELETE FROM software WHERE hostname = OLD.hostname;
现在我们来看不同类型的删除。在以下情况下:
Now we look at different types of deletes. In the case of a:
DELETE FROM computer WHERE hostname = 'mypc.local.net';
表 computer 将通过索引(快速)扫描,而触发器发出的命令也将使用索引扫描(也很快)。规则添加的额外命令是:
the table computer is scanned by index (fast), and the command issued by the trigger would also use an index scan (also fast). The extra command from the rule would be:
DELETE FROM software WHERE computer.hostname = 'mypc.local.net'
AND software.hostname = computer.hostname;
由于已设置适当的索引,所以规划器将创建一个
Since there are appropriate indexes set up, the planner will create a plan of
Nestloop
-> Index Scan using comp_hostidx on computer
-> Index Scan using soft_hostidx on software
因此,在触发器和规则实现之间的速度差异不会太大。
So there would be not that much difference in speed between the trigger and the rule implementation.
在下一个删除操作中,我们希望摆脱 hostname 以 old 开头的所有 2000 台计算机。有两个可能的命令可以执行此操作。一个是:
With the next delete we want to get rid of all the 2000 computers where the hostname starts with old. There are two possible commands to do that. One is:
DELETE FROM computer WHERE hostname >= 'old'
AND hostname < 'ole'
规则添加的命令将是:
The command added by the rule will be:
DELETE FROM software WHERE computer.hostname >= 'old' AND computer.hostname < 'ole'
AND software.hostname = computer.hostname;
计划是
with the plan
Hash Join
-> Seq Scan on software
-> Hash
-> Index Scan using comp_hostidx on computer
另一个可能的命令是:
The other possible command is:
DELETE FROM computer WHERE hostname ~ '^old';
它导致对规则添加的命令执行以下计划:
which results in the following executing plan for the command added by the rule:
Nestloop
-> Index Scan using comp_hostidx on computer
-> Index Scan using soft_hostidx on software
这表明,当存在多个限定表达式与 AND 结合使用时,规划器没有意识到,在 computer 中对 hostname 的限定也可用于在 software 上执行索引扫描,而这是规划器在命令的正则表达式版本中所执行的操作。对于必须删除的 2000 台旧计算机,将为每台计算机调用一次触发器,并且这将导致在 computer 上进行一次索引扫描,在 software 上进行 2000 次索引扫描。规则实现将使用两个使用索引的命令来执行此操作。并且规则在顺序扫描情况下是否仍然更快取决于表 software 的整体大小。尽管所有索引块都将很快就在缓存中,但通过 SPI 管理器从触发器执行 2000 条命令需要一些时间。
This shows, that the planner does not realize that the qualification for hostname in computer could also be used for an index scan on software when there are multiple qualification expressions combined with AND, which is what it does in the regular-expression version of the command. The trigger will get invoked once for each of the 2000 old computers that have to be deleted, and that will result in one index scan over computer and 2000 index scans over software. The rule implementation will do it with two commands that use indexes. And it depends on the overall size of the table software whether the rule will still be faster in the sequential scan situation. 2000 command executions from the trigger over the SPI manager take some time, even if all the index blocks will soon be in the cache.
我们查看的最后一个命令是:
The last command we look at is:
DELETE FROM computer WHERE manufacturer = 'bim';
同样,这可能会导致从 computer 中删除多行。因此,触发器将再次通过执行器运行多条命令。规则生成的命令将是:
Again this could result in many rows to be deleted from computer. So the trigger will again run many commands through the executor. The command generated by the rule will be:
DELETE FROM software WHERE computer.manufacturer = 'bim'
AND software.hostname = computer.hostname;
对于该命令的计划将再次是两个索引扫描上的嵌套循环,仅在 computer 上使用不同的索引:
The plan for that command will again be the nested loop over two index scans, only using a different index on computer:
Nestloop
-> Index Scan using comp_manufidx on computer
-> Index Scan using soft_hostidx on software
在任何这些情况下,规则系统中的额外命令将多少有些独立于命令中受影响的行数。
In any of these cases, the extra commands from the rule system will be more or less independent from the number of affected rows in a command.
总结是,只有当规则的操作导致大且查询条件不当的联接时,规则才会明显比触发器慢,即规划程序失败的情况。
The summary is, rules will only be significantly slower than triggers if their actions result in large and badly qualified joins, a situation where the planner fails.