Postgresql 中文操作指南
75.2. System Catalog Initial Data #
任何手动创建的初始数据(有些没有)的目录都对应一个 .dat 文件,该文件以可编辑的格式包含其初始数据。
Each catalog that has any manually-created initial data (some do not) has a corresponding .dat file that contains its initial data in an editable format.
75.2.1. Data File Format #
每个 .dat 文件都包含 Perl 数据结构文字,这些文字可以被简单地求值以产生一个由哈希引用的数组组成内存中的数据结构,每个数组对应一个目录行。从 pg_database.dat 稍作修改的摘录将演示关键功能:
Each .dat file contains Perl data structure literals that are simply eval’d to produce an in-memory data structure consisting of an array of hash references, one per catalog row. A slightly modified excerpt from pg_database.dat will demonstrate the key features:
[
# A comment could appear here.
{ oid => '1', oid_symbol => 'Template1DbOid',
descr => 'database\'s default template',
datname => 'template1', encoding => 'ENCODING',
datlocprovider => 'LOCALE_PROVIDER', datistemplate => 't',
datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
datctype => 'LC_CTYPE', daticulocale => 'ICU_LOCALE', datacl => '_null_' },
]
需要指出的是:
Points to note:
75.2.2. OID Assignment #
可以通过编写 oid ⇒ _nnnn _ metadata field. Furthermore, if an OID is assigned, a C macro for that OID can be created by writing an _oid_symbol ⇒ _name 元数据字段来给数据中出现的编目行赋予一个手动指定的 OID。
A catalog row appearing in the initial data can be given a manually-assigned OID by writing an _oid ⇒ _nnnn metadata field. Furthermore, if an OID is assigned, a C macro for that OID can be created by writing an _oid_symbol ⇒ _name metadata field.
如果预加载的目录行有其他预加载的行对其引用的 OID,那么它必须有预分配的 OID。如果行的 OID 必须由 C 代码引用,那么还需要预分配的 OID。如果以上情况均不适用,则 oid 元数据字段可以被省略,这种情况下,引导代码会自动分配一个 OID。实际上,即使给定的目录中只有部分行实际上存在交叉引用,我们通常会对所有或没有预加载行的 OID 进行预分配。
Pre-loaded catalog rows must have preassigned OIDs if there are OID references to them in other pre-loaded rows. A preassigned OID is also needed if the row’s OID must be referenced from C code. If neither case applies, the oid metadata field can be omitted, in which case the bootstrap code assigns an OID automatically. In practice we usually preassign OIDs for all or none of the pre-loaded rows in a given catalog, even if only some of them are actually cross-referenced.
在 C 代码中编写任何 OID 的实际数值被认为是非常不好的行为;相反,请始终使用宏。对 pg_proc OID 的直接引用非常普遍,以至于有一种特殊机制可以自动创建必要的宏;请参阅 src/backend/utils/Gen_fmgrtab.pl。同样地,但出于历史原因,采用非相同的方式,有一种自动方法可以创建 pg_type OID 的宏。因此,在上述两个目录中,没有 oid_symbol 项。同样,对于系统目录和索引的 pg_class OID 的宏会自动设置。对于所有其他系统目录,您必须通过 oid_symbol 项手动指定需要的任何宏。
Writing the actual numeric value of any OID in C code is considered very bad form; always use a macro, instead. Direct references to pg_proc OIDs are common enough that there’s a special mechanism to create the necessary macros automatically; see src/backend/utils/Gen_fmgrtab.pl. Similarly — but, for historical reasons, not done the same way — there’s an automatic method for creating macros for pg_type OIDs. oid_symbol entries are therefore not necessary in those two catalogs. Likewise, macros for the pg_class OIDs of system catalogs and indexes are set up automatically. For all other system catalogs, you have to manually specify any macros you need via oid_symbol entries.
要查找可用于新预加载行的可用 OID,请运行 src/include/catalog/unused_oids 脚本。它会打印未使用的 OID 的包含范围(例如,输出行 45-900 表示尚未分配 OID 45 到 900)。目前,OID 1–9999 保留手工分配;unused_oids 脚本只需浏览目录标头和 .dat 文件,即可查看哪些未出现。您还可以使用 duplicate_oids 脚本来检查错误。(genbki.pl 将为任何未手动分配 OID 的行分配 OID,它还将检测编译时重复的 OID。)
To find an available OID for a new pre-loaded row, run the script src/include/catalog/unused_oids. It prints inclusive ranges of unused OIDs (e.g., the output line 45-900 means OIDs 45 through 900 have not been allocated yet). Currently, OIDs 1–9999 are reserved for manual assignment; the unused_oids script simply looks through the catalog headers and .dat files to see which ones do not appear. You can also use the duplicate_oids script to check for mistakes. (genbki.pl will assign OIDs for any rows that didn’t get one hand-assigned to them, and it will also detect duplicate OIDs at compile time.)
当选择不太可能立即提交的补丁的 OID 时,最佳做法是使用一组或多或少连续的 OID,从范围内随机选择的 8000—9999 开始。这会尽可能降低与同时开发的其他补丁产生 OID 冲突的风险。为了保持 8000—9999 范围可用于开发目的,在某个补丁提交到主 git 存储库之后,应将其 OID 重新编号到该范围内的可用空间。通常,这会在每个开发周期接近结束时进行,同时移动该周期中提交的补丁所消耗的所有 OID。可以将脚本 renumber_oids.pl 用于此目的。如果发现未提交补丁与最近提交的某些补丁存在 OID 冲突,renumber_oids.pl 也可以用于恢复该情况。
When choosing OIDs for a patch that is not expected to be committed immediately, best practice is to use a group of more-or-less consecutive OIDs starting with some random choice in the range 8000—9999. This minimizes the risk of OID collisions with other patches being developed concurrently. To keep the 8000—9999 range free for development purposes, after a patch has been committed to the master git repository its OIDs should be renumbered into available space below that range. Typically, this will be done near the end of each development cycle, moving all OIDs consumed by patches committed in that cycle at the same time. The script renumber_oids.pl can be used for this purpose. If an uncommitted patch is found to have OID conflicts with some recently-committed patch, renumber_oids.pl may also be useful for recovering from that situation.
由于有可能重新编号补丁分配的 OID 的惯例,补丁分配的 OID 在该补丁包含在正式发布版本中之前不应被视为稳定。然而,一旦发布,我们不会更改手动分配的对象 OID,因为那会产生各种兼容性问题。
Because of this convention of possibly renumbering OIDs assigned by patches, the OIDs assigned by a patch should not be considered stable until the patch has been included in an official release. We do not change manually-assigned object OIDs once released, however, as that would create assorted compatibility problems.
如果 genbki.pl 需要为没有手动分配的 OID 的目录条目分配一个 OID,它将使用 10000—11999 范围内的值。服务器的 OID 计数器在引导运行开始时设置为 10000,因此在引导处理过程中动态创建的任何对象也会在此范围内接收 OID。(通常的 OID 分配机制负责防止任何冲突。)
If genbki.pl needs to assign an OID to a catalog entry that does not have a manually-assigned OID, it will use a value in the range 10000—11999. The server’s OID counter is set to 10000 at the start of a bootstrap run, so that any objects created on-the-fly during bootstrap processing also receive OIDs in this range. (The usual OID assignment mechanism takes care of preventing any conflicts.)
OID 低于 FirstUnpinnedObjectId(12000)的对象被认为是“固定的”,防止它们被删除。(有少量的例外情况,它们是硬编码到 IsPinnedObject() 中。)initdb 强制 OID 计数器上升到 FirstUnpinnedObjectId,一旦它准备创建未固定的对象。因此,在 initdb 后期创建的对象(例如,在运行 information_schema.sql 脚本时创建的对象)不会被固定,而 genbki.pl 中已知的所有对象都会被固定。
Objects with OIDs below FirstUnpinnedObjectId (12000) are considered “pinned”, preventing them from being deleted. (There are a small number of exceptions, which are hard-wired into IsPinnedObject().) initdb forces the OID counter up to FirstUnpinnedObjectId as soon as it’s ready to create unpinned objects. Thus objects created during the later phases of initdb, such as objects created while running the information_schema.sql script, will not be pinned, while all objects known to genbki.pl will be.
正常数据库操作过程中分配的 OID 限制为 16384 或更高。这确保了 10000—16383 范围可用于自动分配的 OID genbki.pl 或在 initdb 期间分配的 OID。这些自动分配的 OID 不被认为稳定,并且可能因安装而异。
OIDs assigned during normal database operation are constrained to be 16384 or higher. This ensures that the range 10000—16383 is free for OIDs assigned automatically by genbki.pl or during initdb. These automatically-assigned OIDs are not considered stable, and may change from one installation to another.
75.2.3. OID Reference Lookup #
原则上,从一个初始目录行到另一个初始目录行的交叉引用可以通过在引用字段中写入被引用行的预分配 OID 来写。但是,这违背了项目的策略,因为它容易出错、难以阅读,并且如果重新编号了新分配的 OID,则容易损坏。因此 genbki.pl 提供机制来编写符号引用。规则如下:
In principle, cross-references from one initial catalog row to another could be written just by writing the preassigned OID of the referenced row in the referencing field. However, that is against project policy, because it is error-prone, hard to read, and subject to breakage if a newly-assigned OID is renumbered. Therefore genbki.pl provides mechanisms to write symbolic references instead. The rules are as follows:
genbki.pl 在运行时解析所有符号引用,并将简单的数字 OID 放入发出的 BKI 文件中。因此,引导后端无需处理符号引用。
genbki.pl resolves all symbolic references while it runs, and puts simple numeric OIDs into the emitted BKI file. There is therefore no need for the bootstrap backend to deal with symbolic references.
即使目录没有需要查找的初始数据,使用 BKI_LOOKUP 或 BKI_LOOKUP_OPT 标记 OID 引用列也是很有必要的。这允许 genbki.pl 记录存在于系统目录中的外键关系。该信息用于回归测试中,以检查不正确的条目。另请参阅宏 DECLARE_FOREIGN_KEY、DECLARE_FOREIGN_KEY_OPT、DECLARE_ARRAY_FOREIGN_KEY 和 DECLARE_ARRAY_FOREIGN_KEY_OPT,它们用于声明对 BKI_LOOKUP 来说过于复杂的复杂外键关系(通常是多列外键关系)。
It’s desirable to mark OID reference columns with BKI_LOOKUP or BKI_LOOKUP_OPT even if the catalog has no initial data that requires lookup. This allows genbki.pl to record the foreign key relationships that exist in the system catalogs. That information is used in the regression tests to check for incorrect entries. See also the macros DECLARE_FOREIGN_KEY, DECLARE_FOREIGN_KEY_OPT, DECLARE_ARRAY_FOREIGN_KEY, and DECLARE_ARRAY_FOREIGN_KEY_OPT, which are used to declare foreign key relationships that are too complex for BKI_LOOKUP (typically, multi-column foreign keys).
75.2.4. Automatic Creation of Array Types #
大部分标量数据类型应该有相应的数组类型(即,标准元素类型是标量类型的可变长度数组类型,并且由标量类型的 pg_type 条目的 typarray 字段引用)。genbki.pl 在大多数情况下能够自动生成数组类型的 pg_type 条目。
Most scalar data types should have a corresponding array type (that is, a standard varlena array type whose element type is the scalar type, and which is referenced by the typarray field of the scalar type’s pg_type entry). genbki.pl is able to generate the pg_type entry for the array type automatically in most cases.
要使用此设施,只需在标量类型的 pg_type 中填写 array_type_oid ⇒ _nnnn_ 元数据字段,并指定用于数组类型 的 OID 即可。您随后可以忽略 typarray 字段,因为它会自动使用该 OID 进行填充。
To use this facility, just write an array_type_oid ⇒ _nnnn_ metadata field in the scalar type’s pg_type entry, specifying the OID to use for the array type. You may then omit the typarray field, since it will be filled automatically with that OID.
生成的数组类型的名称是标量类型的名称,前面加上下划线。数组条目的其他字段由 pg_type.h 中的 BKI_ARRAY_DEFAULT(_value) 注释填充,如果没有注释,则从标量类型复制。(对于 typalign 还有一个特殊情况。)然后将两个条目的 typelem 和 typarray 字段设置为交叉引用彼此。
The generated array type’s name is the scalar type’s name with an underscore prepended. The array entry’s other fields are filled from BKI_ARRAY_DEFAULT(_value)_ annotations in pg_type.h, or if there isn’t one, copied from the scalar type. (There’s also a special case for typalign.) Then the typelem and typarray fields of the two entries are set to cross-reference each other.
75.2.5. Recipes for Editing Data Files #
以下是一些关于在更新目录数据文件时执行常见任务的最简单方法的建议。
Here are some suggestions about the easiest ways to perform common tasks when updating catalog data files.
*Add a new column with a default to a catalog: * 使用 BKI_DEFAULT(_value) 注释将列添加到标头文件。数据文件只需要通过在现有行中添加需要非默认值来调整字段即可。
*Add a new column with a default to a catalog: * Add the column to the header file with a BKI_DEFAULT(_value)_ annotation. The data file need only be adjusted by adding the field in existing rows where a non-default value is needed.
*Add a default value to an existing column that doesn’t have one: * 向标头文件添加 BKI_DEFAULT 注释,然后运行 make reformat-dat-files 以删除现在冗余的字段条目。
*Add a default value to an existing column that doesn’t have one: * Add a BKI_DEFAULT annotation to the header file, then run make reformat-dat-files to remove now-redundant field entries.
*Remove a column, whether it has a default or not: * 从标头中删除列,然后运行 make reformat-dat-files 以删除现在无用的字段条目。
*Remove a column, whether it has a default or not: * Remove the column from the header, then run make reformat-dat-files to remove now-useless field entries.
*Change or remove an existing default value: * 您不能简单地更改标头文件,因为那样会导致错误解释当前数据。首先运行 make expand-dat-files 用明确插入的所有默认值重写数据文件,然后更改或删除 BKI_DEFAULT 注释,然后再次运行 make reformat-dat-files 以删除多余的字段。
*Change or remove an existing default value: * You cannot simply change the header file, since that will cause the current data to be interpreted incorrectly. First run make expand-dat-files to rewrite the data files with all default values inserted explicitly, then change or remove the BKI_DEFAULT annotation, then run make reformat-dat-files to remove superfluous fields again.
*Ad-hoc bulk editing: * reformat_dat_file.pl 可以适应执行多种批量更改。查找显示可以在其中插入一次性代码的块注释。在以下示例中,我们将 pg_proc 中的两个布尔字段合并到一个字符字段中:
*Ad-hoc bulk editing: * reformat_dat_file.pl can be adapted to perform many kinds of bulk changes. Look for its block comments showing where one-off code can be inserted. In the following example, we are going to consolidate two Boolean fields in pg_proc into a char field:
有关用于批量编辑的脚本的更多示例,请参阅附加到此邮件的 convert_oid2name.pl 和 remove_pg_type_oid_symbols.pl : link:https://www.postgresql.org/message-id/CAJVSVGVX8gXnPm+Xa= DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com[ link:https://www.postgresql.org/message-id/CAJVSVGVX8gXnPm+Xa= DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com]
For further examples of scripts used for bulk editing, see convert_oid2name.pl and remove_pg_type_oid_symbols.pl attached to this message: DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com">DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com