Postgresql 中文操作指南

Chapter 58. Writing a Procedural Language Handler

对并非使用编译语言当前“版本 1”接口(包括用户自定义过程语言中的函数和用 SQL 编写的函数)编写的函数的所有调用都会转到对特定语言的 call handler 函数。调用处理程序负责以有意义的方式执行函数,例如,解释供应的源文本。此章节概述如何编写新的过程语言的调用处理程序。

All calls to functions that are written in a language other than the current “version 1” interface for compiled languages (this includes functions in user-defined procedural languages and functions written in SQL) go through a call handler function for the specific language. It is the responsibility of the call handler to execute the function in a meaningful way, such as by interpreting the supplied source text. This chapter outlines how a new procedural language’s call handler can be written.

过程语言的调用处理程序是一个“正常”函数,必须使用经过编译的语言(例如 C)和版本 1 界面编写,并将其注册到 PostgreSQL 中,作为不接受参数且返回类型 language_handler 的函数。此特殊伪类型将该函数标识为调用处理程序,并防止在 SQL 命令中直接对其进行调用。有关 C 语言调用约定和动态加载的更多详细信息,请参阅 Section 38.10

The call handler for a procedural language is a “normal” function that must be written in a compiled language such as C, using the version-1 interface, and registered with PostgreSQL as taking no arguments and returning the type language_handler. This special pseudo-type identifies the function as a call handler and prevents it from being called directly in SQL commands. For more details on C language calling conventions and dynamic loading, see Section 38.10.

调用调用处理程序的方式与调用任何其他函数的方式相同:它接收一个指向 FunctionCallInfoBaseData struct 的指针,其中包含参数值和有关被调用函数的信息,并且期望它返回一个 Datum 结果(以及可能设置 isnull 字段 FunctionCallInfoBaseData 结构,如果它希望返回一个 SQL null 结果)。调用处理程序和普通被调用函数之间的区别在于, FunctionCallInfoBaseData 结构的 flinfo→fn_oid 字段将包含要调用的实际函数的 OID,而不是调用处理程序本身的 OID。调用处理程序必须使用此字段来确定要执行哪个函数。另外,已根据目标函数而不是调用处理程序的声明设置了传递的参数列表。

The call handler is called in the same way as any other function: It receives a pointer to a FunctionCallInfoBaseData struct containing argument values and information about the called function, and it is expected to return a Datum result (and possibly set the isnull field of the FunctionCallInfoBaseData structure, if it wishes to return an SQL null result). The difference between a call handler and an ordinary callee function is that the flinfo→fn_oid field of the FunctionCallInfoBaseData structure will contain the OID of the actual function to be called, not of the call handler itself. The call handler must use this field to determine which function to execute. Also, the passed argument list has been set up according to the declaration of the target function, not of the call handler.

调用处理程序有责任从 pg_proc 系统目录中获取函数的条目并分析所调用函数的参数和返回类型。该函数的 CREATE FUNCTION 命令中的 AS 子句可以在 pg_proc 行的 prosrc 列中找到。这通常是过程语言中的源文本,但理论上它可以是其他内容,例如文件路径名或任何其他详细告诉调用处理程序要做什么的内容。

It’s up to the call handler to fetch the entry of the function from the pg_proc system catalog and to analyze the argument and return types of the called function. The AS clause from the CREATE FUNCTION command for the function will be found in the prosrc column of the pg_proc row. This is commonly source text in the procedural language, but in theory it could be something else, such as a path name to a file, or anything else that tells the call handler what to do in detail.

通常,在每个 SQL 语句中调用同一函数的次数很多。通过使用 flinfo→fn_extra 字段,调用处理程序可以避免重复查找有关所调用函数的信息。这最初为 NULL ,但可以由调用处理程序设置为指向有关所调用函数的信息。在后续调用中,如果 flinfo→fn_extra 已经为非 NULL ,则可以使用它并跳过信息查找步骤。调用处理程序必须确保 flinfo→fn_extra 指向至少存在到当前查询结束的内存,因为 FmgrInfo 数据结构可以保持很长时间。执行此操作的一种方法是在 flinfo→fn_mcxt 指定的内存上下文中分配额外数据;此类数据通常具有与 FmgrInfo 本身相同的寿命。但处理程序还可以选择使用寿命更长的内存上下文,以便跨查询缓存函数定义信息。

Often, the same function is called many times per SQL statement. A call handler can avoid repeated lookups of information about the called function by using the flinfo→fn_extra field. This will initially be NULL, but can be set by the call handler to point at information about the called function. On subsequent calls, if flinfo→fn_extra is already non-NULL then it can be used and the information lookup step skipped. The call handler must make sure that flinfo→fn_extra is made to point at memory that will live at least until the end of the current query, since an FmgrInfo data structure could be kept that long. One way to do this is to allocate the extra data in the memory context specified by flinfo→fn_mcxt; such data will normally have the same lifespan as the FmgrInfo itself. But the handler could also choose to use a longer-lived memory context so that it can cache function definition information across queries.

当过程语言函数被调用为触发器时,不会以通常的方式传递任何参数,但是 FunctionCallInfoBaseDatacontext 字段指向一个 TriggerData 结构,而不是 NULL ,如同在普通函数调用中一样。语言处理程序应该为过程语言函数提供获取触发器信息的方法。

When a procedural-language function is invoked as a trigger, no arguments are passed in the usual way, but the FunctionCallInfoBaseData's context field points at a TriggerData structure, rather than being NULL as it is in a plain function call. A language handler should provide mechanisms for procedural-language functions to get at the trigger information.

src/test/modules/plsample 中提供了一个用 C 扩展编写的过程语言处理程序模板。这是一个工作示例,演示了创建过程语言处理程序、处理参数和返回值的一种方法。

A template for a procedural-language handler written as a C extension is provided in src/test/modules/plsample. This is a working sample demonstrating one way to create a procedural-language handler, process parameters, and return a value.

尽管提供调用处理程序足以创建最小的过程语言,但还有两个其他函数可以根据需要提供以使该语言更易于使用。它们是 validatorinline handler 。可以提供一个验证器,以便在 CREATE FUNCTION 期间进行特定于语言的检查。可以提供一个内联处理程序,允许语言支持通过 DO 命令执行的匿名代码块。

Although providing a call handler is sufficient to create a minimal procedural language, there are two other functions that can optionally be provided to make the language more convenient to use. These are a validator and an inline handler. A validator can be provided to allow language-specific checking to be done during CREATE FUNCTION. An inline handler can be provided to allow the language to support anonymous code blocks executed via the DO command.

如果过程语言提供验证器,则必须将其声明为一个函数,该函数接收一个类型为 oid 的单个参数。验证器的结果被忽略,所以通常会声明它返回 void 。当 CREATE FUNCTION 命令创建或更新了用过程语言编写的函数时,将调用验证器。传递的 OID 是其过程语言的函数的 pg_proc 行。验证器必须以通常的方式获取此行,并进行任何适当的检查。首先,调用 CheckFunctionValidatorAccess() 以诊断用户无法通过 CREATE FUNCTION 实现对验证器的显式调用。典型的检查包括验证函数的参数和结果类型是否受语言支持,以及函数的主体在语言中是否在语法上正确。如果验证器发现函数没有问题,它应直接返回。如果它找到错误,它应该通过 ereport() 常规错误报告机制报告该错误。抛出错误将强制进行事务回滚,从而防止提交不正确的函数定义。

If a validator is provided by a procedural language, it must be declared as a function taking a single parameter of type oid. The validator’s result is ignored, so it is customarily declared to return void. The validator will be called at the end of a CREATE FUNCTION command that has created or updated a function written in the procedural language. The passed-in OID is the OID of the function’s pg_proc row. The validator must fetch this row in the usual way, and do whatever checking is appropriate. First, call CheckFunctionValidatorAccess() to diagnose explicit calls to the validator that the user could not achieve through CREATE FUNCTION. Typical checks then include verifying that the function’s argument and result types are supported by the language, and that the function’s body is syntactically correct in the language. If the validator finds the function to be okay, it should just return. If it finds an error, it should report that via the normal ereport() error reporting mechanism. Throwing an error will force a transaction rollback and thus prevent the incorrect function definition from being committed.

验证器函数通常应该遵守 check_function_bodies 参数:如果它关闭,则应该跳过任何昂贵或上下文相关的检查。如果该语言在编译时提供代码执行,则验证器必须禁止会引发此类执行的检查。特别是,pg_dump 会关闭此参数,以便它可以在不担心副作用或函数主体对其他数据库对象的依赖性的情况下加载过程语言函数。(由于此要求,调用处理程序应该避免假设验证器已完全检查函数。拥有验证器的意义不在于让调用处理程序省略检查,而是在 CREATE FUNCTION 命令中存在明显错误时立即通知用户。)尽管大多数只检查内容的选择都留给验证器函数自行决定,但请注意,核心的 CREATE FUNCTION 代码仅当 check_function_bodies 开启时执行附加到函数的 SET 子句。因此,当 check_function_bodies 关闭时,可能会受 GUC 参数影响的检查绝对应该跳过,以避免在还原转储时发生错误失败情况。

Validator functions should typically honor the check_function_bodies parameter: if it is turned off then any expensive or context-sensitive checking should be skipped. If the language provides for code execution at compilation time, the validator must suppress checks that would induce such execution. In particular, this parameter is turned off by pg_dump so that it can load procedural language functions without worrying about side effects or dependencies of the function bodies on other database objects. (Because of this requirement, the call handler should avoid assuming that the validator has fully checked the function. The point of having a validator is not to let the call handler omit checks, but to notify the user immediately if there are obvious errors in a CREATE FUNCTION command.) While the choice of exactly what to check is mostly left to the discretion of the validator function, note that the core CREATE FUNCTION code only executes SET clauses attached to a function when check_function_bodies is on. Therefore, checks whose results might be affected by GUC parameters definitely should be skipped when check_function_bodies is off, to avoid false failures when restoring a dump.

如果过程语言提供内联处理程序,则必须将其声明为一个函数,该函数接收一个类型为 internal 的单个参数。内联处理程序的结果被忽略,所以通常会声明它返回 void 。将 DO 语句指定过程语言时,将调用内联处理程序。实际传递的参数是一个指向 InlineCodeBlock 结构的指针,其中包含有关 DO 语句的参数的信息,特别是待执行的匿名代码块的文本。内联处理程序应该执行此代码并返回。

If an inline handler is provided by a procedural language, it must be declared as a function taking a single parameter of type internal. The inline handler’s result is ignored, so it is customarily declared to return void. The inline handler will be called when a DO statement is executed specifying the procedural language. The parameter actually passed is a pointer to an InlineCodeBlock struct, which contains information about the DO statement’s parameters, in particular the text of the anonymous code block to be executed. The inline handler should execute this code and return.

建议您将所有这些函数声明以及 CREATE LANGUAGE 命令本身包含在 extension 中,以便使用一个简单的 CREATE EXTENSION 命令便足以安装语言。有关编写扩展的信息,请参阅 Section 38.17

It’s recommended that you wrap all these function declarations, as well as the CREATE LANGUAGE command itself, into an extension so that a simple CREATE EXTENSION command is sufficient to install the language. See Section 38.17 for information about writing extensions.

标准发行版中包含的程序语言在尝试编写自己的语言处理程序时是很好的参考。查看源代码树的 src/pl 子目录。 CREATE LANGUAGE 参考页还提供了一些有用的详细信息。

The procedural languages included in the standard distribution are good references when trying to write your own language handler. Look into the src/pl subdirectory of the source tree. The CREATE LANGUAGE reference page also has some useful details.