Postgresql 中文操作指南

55.1. Overview #

该协议将启动过程和正常工作状态分开处理。在启动阶段,前端会打开与服务器的连接,然后根据服务器的要求对自身进行认证。(该过程可能只需要一条消息,或多条消息,具体取决于所使用的认证方法。)若一切顺利,服务器会将状态信息发送至前端,最后进入正常工作状态。除了最初的启动请求消息之外,协议的这一部分由服务器驱动。

The protocol has separate phases for startup and normal operation. In the startup phase, the frontend opens a connection to the server and authenticates itself to the satisfaction of the server. (This might involve a single message, or multiple messages depending on the authentication method being used.) If all goes well, the server then sends status information to the frontend, and finally enters normal operation. Except for the initial startup-request message, this part of the protocol is driven by the server.

在正常工作期间,前端会向后端发送查询和其他命令,而后端会发送查询结果和其他响应。在极少数情况下(例如 NOTIFY),后端将主动发送消息,但是在会话的状态下,这部分流程基本上是由前端请求驱动的。

During normal operation, the frontend sends queries and other commands to the backend, and the backend sends back query results and other responses. There are a few cases (such as NOTIFY) wherein the backend will send unsolicited messages, but for the most part this portion of a session is driven by frontend requests.

会议中断通常是前端选择的,但在一定情况下也可能因后端中止。在任何情况下,当后端关闭连接时,后端会在退出前回滚所有打开的(不完整)交易。

Termination of the session is normally by frontend choice, but can be forced by the backend in certain cases. In any case, when the backend closes the connection, it will roll back any open (incomplete) transaction before exiting.

在正常工作期间,可以通过两个子协议执行 SQL 命令。在 “简单查询” 协议中,前端只需要发送文本查询字符串,然后后端对其进行解析并立即执行。在 “扩展查询” 协议中,查询的处理被分为多个步骤:解析、参数值的绑定和执行。这样做可以提升灵活性并提高性能,但是会增加复杂性。

Within normal operation, SQL commands can be executed through either of two sub-protocols. In the “simple query” protocol, the frontend just sends a textual query string, which is parsed and immediately executed by the backend. In the “extended query” protocol, processing of queries is separated into multiple steps: parsing, binding of parameter values, and execution. This offers flexibility and performance benefits, at the cost of extra complexity.

正常工作期间还有额外用于特殊操作的子协议,例如 COPY

Normal operation has additional sub-protocols for special operations such as COPY.

55.1.1. Messaging Overview #

所有的通信都是通过消息流进行的。消息的第一个字节标识消息类型,接下来的四个字节表示消息其余部分的长度(长度计数包含本身,但不包含消息类型字节)。消息的剩余内容由消息类型决定。出于历史原因,客户端发送的第一条消息(启动消息)没有初始消息类型字节。

All communication is through a stream of messages. The first byte of a message identifies the message type, and the next four bytes specify the length of the rest of the message (this length count includes itself, but not the message-type byte). The remaining contents of the message are determined by the message type. For historical reasons, the very first message sent by the client (the startup message) has no initial message-type byte.

为了避免与消息流失去同步,服务器和客户端通常会将整个消息读入缓冲区(使用字节计数),然后再尝试处理其内容。这样,如果在处理内容时检测到错误,可以轻松地进行恢复。在极端情况下(例如没有足够内存来缓冲消息),接收方可以使用字节计数来确定在继续读取消息之前需要跳过多少输入。

To avoid losing synchronization with the message stream, both servers and clients typically read an entire message into a buffer (using the byte count) before attempting to process its contents. This allows easy recovery if an error is detected while processing the contents. In extreme situations (such as not having enough memory to buffer the message), the receiver can use the byte count to determine how much input to skip before it resumes reading messages.

相反,服务器和客户端都必须注意不要发送不完整的消息。通常可以通过在开始发送消息之前将整个消息编组到缓冲区中来完成此操作。如果在发送或接收消息的过程中发生通信故障,唯一合理的响应就是放弃连接,因为几乎没有办法恢复消息边界同步。

Conversely, both servers and clients must take care never to send an incomplete message. This is commonly done by marshaling the entire message in a buffer before beginning to send it. If a communications failure occurs partway through sending or receiving a message, the only sensible response is to abandon the connection, since there is little hope of recovering message-boundary synchronization.

55.1.2. Extended Query Overview #

在扩展查询协议中,SQL 命令的执行分为多个步骤。步骤之间保留的状态由两种类型的对象表示:prepared statementsportals。预处理语句表示解析和语义分析文本查询字符串的结果。预处理语句本身并不准备执行,因为它可能缺少 parameters 的具体值。门户表示一个准备执行或已经部分执行的语句,其填充了所有缺失的参数值。(对于 SELECT 语句,门户等同于打开的光标,但我们选择使用不同的术语,因为光标不处理非 SELECT 语句。)

In the extended-query protocol, execution of SQL commands is divided into multiple steps. The state retained between steps is represented by two types of objects: prepared statements and portals. A prepared statement represents the result of parsing and semantic analysis of a textual query string. A prepared statement is not in itself ready to execute, because it might lack specific values for parameters. A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in. (For SELECT statements, a portal is equivalent to an open cursor, but we choose to use a different term since cursors don’t handle non-SELECT statements.)

整个执行周期包含 parse 步骤,它从文本查询字符串创建预处理语句; bind 步骤,它使用预处理语句以及所需参数的值创建门户;还有一个 execute 步骤,它运行门户的查询。对于返回行的查询(SELECTSHOW 等),执行步骤可以设置仅获取有限数量的行,因此可能需要多个执行步骤才能完成操作。

The overall execution cycle consists of a parse step, which creates a prepared statement from a textual query string; a bind step, which creates a portal given a prepared statement and values for any needed parameters; and an execute step that runs a portal’s query. In the case of a query that returns rows (SELECT, SHOW, etc.), the execute step can be told to fetch only a limited number of rows, so that multiple execute steps might be needed to complete the operation.

后端可以跟踪多个预处理语句和门户(但请注意,这些语句和门户仅存在于会话中,并且绝不会在会话之间共享)。现有预处理语句和门户由在创建时分配的名称引用。此外,还存在一个 “未命名” 预处理语句和门户。尽管它们的行为在很大程度上与命名对象相同,但为了只执行一次查询然后丢弃查询的情况,它们的操作进行了优化,而对命名对象的优化基于多次使用的预期。

The backend can keep track of multiple prepared statements and portals (but note that these exist only within a session, and are never shared across sessions). Existing prepared statements and portals are referenced by names assigned when they were created. In addition, an “unnamed” prepared statement and portal exist. Although these behave largely the same as named objects, operations on them are optimized for the case of executing a query only once and then discarding it, whereas operations on named objects are optimized on the expectation of multiple uses.

55.1.3. Formats and Format Codes #

特定数据类型的数据可以以若干不同的 formats 传输。从 PostgreSQL 7.4 开始,唯一支持的格式是 “文本” 和 “二进制”,但协议为未来的扩展提供了保障。任何值的所需格式由 format code 指定。客户端可以为每个传输的参数值和查询结果的每一列指定格式代码。文本具有格式代码 0,二进制具有格式代码 1,所有其他格式代码都保留用于将来的定义。

Data of a particular data type might be transmitted in any of several different formats. As of PostgreSQL 7.4 the only supported formats are “text” and “binary”, but the protocol makes provision for future extensions. The desired format for any value is specified by a format code. Clients can specify a format code for each transmitted parameter value and for each column of a query result. Text has format code zero, binary has format code one, and all other format codes are reserved for future definition.

值的文本表示形式是特定数据类型输入/输出转换函数生成并接受的任何字符串。在传输表示形式中,没有尾随空字符;前端必须向接收到的值添加一个空字符,如果它希望将其作为 C 字符串处理。(顺便说一下,文本格式不允许嵌入空字符。)

The text representation of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type. In the transmitted representation, there is no trailing null character; the frontend must add one to received values if it wants to process them as C strings. (The text format does not allow embedded nulls, by the way.)

用于整数的二进制表示形式使用网络字节顺序(最高有效字节在前)。有关其他数据类型的详细信息,请查阅文档或源代码以了解二进制表示形式。请记住,复杂数据类型的二进制表示形式可能会在不同的服务器版本之间发生变化;文本格式通常是更便携的选择。

Binary representations for integers use network byte order (most significant byte first). For other data types consult the documentation or source code to learn about the binary representation. Keep in mind that binary representations for complex data types might change across server versions; the text format is usually the more portable choice.