PostgreSQL Chinese Operation Guide

52.3. The Parser Stage

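The parser stage consists of two parts: the parser proper, which is built from scan.l and gram.y using flex and bison, and the transformation process, which performs the semantic interpretation of the tree the parser hands back.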

52.3.1. Parser

The parser has to check the query string (which arrives as plain text) for valid syntax. If the syntax is correct, a parse tree is built and handed back; otherwise an error is returned. The parser and lexer are implemented using the well-known Unix tools bison and flex.

The lexer is defined in the file scan.l and is responsible for recognizing identifiers, SQL key words, and so on. For every key word or identifier that is found, a token is generated and handed to the parser.
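The lexer's job can be illustrated with a small sketch. This is not the flex-generated scanner from scan.l; it is a toy tokenizer written with Python regular expressions, and the key word list, token type names, and the `tokenize` function are all assumptions made for the example.

```python
import re

# Hypothetical key word list for this sketch; the real list is far longer.
KEYWORDS = {"SELECT", "FROM", "WHERE", "BEGIN", "ROLLBACK"}

# Each alternative is a named group; the group name becomes the token type.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<OP>[*,;=<>()])
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(sql):
    """Return a list of (token_type, text) pairs for the query string."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {sql[pos]!r} at position {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "WS":
            continue  # whitespace separates tokens but produces none
        if kind == "IDENT" and text.upper() in KEYWORDS:
            # A word that matches the key word list becomes a KEYWORD token.
            kind, text = "KEYWORD", text.upper()
        tokens.append((kind, text))
    return tokens
```

Calling `tokenize("select a from t")` yields two KEYWORD tokens and two IDENT tokens, which is the kind of stream a parser would then consume.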

The parser is defined in the file gram.y and consists of a set of grammar rules and actions that are executed whenever a rule is fired. The code of the actions (which is actually C code) is used to build up the parse tree.
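To make the idea of rules with attached actions concrete, here is a minimal sketch of a parser for a tiny, made-up subset of SELECT. It is a hand-written recursive-descent parser in Python, not the LALR parser that bison generates from gram.y, and the grammar, the `Parser` class, and the dict-shaped nodes are assumptions for illustration. Each method corresponds to one grammar rule, and the code that builds a node plays the role of that rule's action.

```python
KEYWORDS = {"SELECT", "FROM"}


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect_keyword(self, word):
        tok = self.peek()
        if tok is None or tok.upper() != word:
            raise SyntaxError(f"expected {word}, found {tok!r}")
        self.pos += 1

    def identifier(self):
        tok = self.peek()
        if tok is None or tok.upper() in KEYWORDS or not tok.isidentifier():
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.pos += 1
        return tok

    # Rule: select_stmt -> SELECT target_list FROM identifier
    def select_stmt(self):
        self.expect_keyword("SELECT")
        targets = self.target_list()
        self.expect_keyword("FROM")
        table = self.identifier()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        # Action: build the node for the whole statement.
        return {"node": "SelectStmt", "targets": targets, "from": table}

    # Rule: target_list -> identifier { ',' identifier }
    def target_list(self):
        # Action: build one leaf node per column reference.
        targets = [{"node": "ColumnRef", "name": self.identifier()}]
        while self.peek() == ",":
            self.pos += 1
            targets.append({"node": "ColumnRef", "name": self.identifier()})
        return targets


def parse(sql):
    # Crude tokenization for the sketch: pad commas, then split on whitespace.
    return Parser(sql.replace(",", " , ").split()).select_stmt()
```

A syntactically valid string such as `"SELECT a, b FROM t"` produces a tree; a string such as `"SELECT FROM t"` raises `SyntaxError`, mirroring the "tree or error" behavior described above.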

The file scan.l is transformed to the C source file scan.c using the program flex, and gram.y is transformed to gram.c using bison. After these transformations have taken place, a normal C compiler can be used to create the parser. Never make any changes to the generated C files, as they will be overwritten the next time flex or bison is called.

Note

The transformations and compilations mentioned above are normally done automatically using the makefiles shipped with the PostgreSQL source distribution.

A detailed description of bison or the grammar rules given in gram.y would be beyond the scope of this manual. There are many books and documents dealing with flex and bison. You should be familiar with bison before you start to study the grammar given in gram.y; otherwise you won’t understand what happens there.

52.3.2. Transformation Process

The parser stage creates a parse tree using only fixed rules about the syntactic structure of SQL. It does not make any lookups in the system catalogs, so it cannot understand the detailed semantics of the requested operations. After the parser completes, the transformation process takes the tree handed back by the parser as input and does the semantic interpretation needed to understand which tables, functions, and operators are referenced by the query. The data structure that is built to represent this information is called the query tree.
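The difference between the two steps can be sketched as follows. The example assumes a raw parse tree represented as a plain dict and a hypothetical in-memory `CATALOG` standing in for the system catalogs; neither matches PostgreSQL's actual C data structures. The point is only that the transformation step, unlike the parser, looks names up and can therefore reject references that are syntactically fine but do not exist.

```python
# Hypothetical stand-in for the system catalogs: table name -> column types.
CATALOG = {"accounts": {"id": "integer", "owner": "text"}}


def transform(parse_tree, catalog=CATALOG):
    """Turn a raw parse tree (names only) into a query tree (resolved names)."""
    table = parse_tree["from"]
    if table not in catalog:
        # The parser could not have caught this: the syntax was valid.
        raise LookupError(f'relation "{table}" does not exist')
    columns = catalog[table]
    target_list = []
    for name in parse_tree["targets"]:
        if name not in columns:
            raise LookupError(f'column "{name}" does not exist')
        # Semantic information from the catalog is attached to each entry.
        target_list.append({"column": name, "type": columns[name]})
    return {"node": "Query", "relation": table, "target_list": target_list}


raw = {"node": "SelectStmt", "targets": ["id", "owner"], "from": "accounts"}
query_tree = transform(raw)
```

Here `query_tree` carries the column types that the raw tree lacked, while a raw tree naming an unknown table fails only at this stage.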

The reason for separating raw parsing from semantic analysis is that system catalog lookups can only be done within a transaction, and we do not wish to start a transaction immediately upon receiving a query string. The raw parsing stage is sufficient to identify the transaction control commands (BEGIN, ROLLBACK, etc.), and these can then be correctly executed without any further analysis. Once we know that we are dealing with an actual query (such as SELECT or UPDATE), it is okay to start a transaction if we’re not already in one. Only then can the transformation process be invoked.
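The ordering described above can be sketched as a small dispatch routine. The `Session` class, its log messages, and the first-word "raw parse" are assumptions made for this illustration and do not reflect PostgreSQL's actual control flow; in particular, ending the implicit transaction right after the statement is a simplification of this sketch.

```python
TRANSACTION_COMMANDS = {"BEGIN", "COMMIT", "ROLLBACK"}


class Session:
    def __init__(self):
        self.in_transaction = False
        self.log = []

    def handle(self, query_string):
        words = query_string.strip().rstrip(";").split()
        if not words:
            return
        # "Raw parsing" here is just reading the first word; no catalog access.
        command = words[0].upper()
        if command in TRANSACTION_COMMANDS:
            # Transaction control is executed without semantic analysis.
            self.in_transaction = (command == "BEGIN")
            self.log.append(f"executed {command} without analysis")
            return
        # An actual query: make sure a transaction exists before analysis.
        implicit = not self.in_transaction
        if implicit:
            self.in_transaction = True
            self.log.append("started implicit transaction")
        self.log.append(f"analyzed and ran {command}")
        if implicit:
            self.in_transaction = False
            self.log.append("committed implicit transaction")
```

A lone `SELECT` triggers an implicit transaction before analysis, whereas a statement issued after `BEGIN` reuses the transaction that is already open.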

The query tree created by the transformation process is structurally similar to the raw parse tree in most places, but it has many differences in detail. For example, a FuncCall node in the parse tree represents something that looks syntactically like a function call. This might be transformed to either a FuncExpr or Aggref node depending on whether the referenced name turns out to be an ordinary function or an aggregate function. Also, information about the actual data types of columns and expression results is added to the query tree.
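The FuncCall example can be sketched in a few lines. The node names FuncCall, FuncExpr, and Aggref come from the text above; the dict representation, the `FUNCTIONS` lookup table, and the `transform_func_call` helper are hypothetical and only illustrate the decision being made.

```python
# Hypothetical catalog of function names: name -> (kind, result type).
FUNCTIONS = {
    "lower": ("function", "text"),
    "count": ("aggregate", "bigint"),
}


def transform_func_call(node, functions=FUNCTIONS):
    """Resolve a syntactic FuncCall into a FuncExpr or an Aggref."""
    if node["node"] != "FuncCall":
        raise ValueError("expected a FuncCall node")
    name = node["name"]
    if name not in functions:
        raise LookupError(f"function {name} does not exist")
    kind, result_type = functions[name]
    # The node type depends on what the name turns out to refer to.
    new_kind = "Aggref" if kind == "aggregate" else "FuncExpr"
    # The result data type is information the raw parse tree did not have.
    return {
        "node": new_kind,
        "name": name,
        "args": node["args"],
        "result_type": result_type,
    }
```

Both `lower(owner)` and `count(id)` look identical to the parser; only the lookup distinguishes them and supplies the result type.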