PostgreSQL Chinese Operation Guide

70.1. Introduction

GIN stands for Generalized Inverted Index. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items. For example, the items could be documents, and the queries could be searches for documents containing specific words.

We use the word item to refer to a composite value that is to be indexed, and the word key to refer to an element value. GIN always stores and searches for keys, not item values per se.

A GIN index stores a set of (key, posting list) pairs, where a posting list is a set of row IDs in which the key occurs. The same row ID can appear in multiple posting lists, since an item can contain more than one key. Each key value is stored only once, so a GIN index is very compact for cases where the same key appears many times.
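
The (key, posting list) layout can be illustrated with a small Python sketch. This is only a conceptual model, not how PostgreSQL stores GIN data on disk: the function name `build_inverted_index`, the sample rows, and the choice of whitespace-separated words as keys are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each key to the sorted list of row IDs whose item contains it.

    `rows` maps a row ID to an item (here: a string of words).
    Splitting on whitespace to obtain keys is an illustrative assumption.
    """
    postings = defaultdict(set)
    for row_id, item in rows.items():
        # A key that repeats inside one item contributes the row ID only once.
        for key in set(item.split()):
            postings[key].add(row_id)
    # Each key is stored once, alongside every row ID in which it occurs.
    return {key: sorted(ids) for key, ids in postings.items()}


# Hypothetical sample data: three items, each containing several keys.
rows = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick quick dog",
}

index = build_inverted_index(rows)
print(index["quick"])  # [1, 3]
print(index["dog"])    # [2, 3]
```

Row 3 appears in the posting lists of both `quick` and `dog`, which mirrors the point that one row ID can occur in several posting lists, while each key is held only once no matter how many rows contain it.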

GIN is generalized in the sense that the GIN access method code does not need to know the specific operations that it accelerates. Instead, it uses custom strategies defined for particular data types. The strategy defines how keys are extracted from indexed items and query conditions, and how to determine whether a row that contains some of the key values in a query actually satisfies the query.
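
The separation between generic index code and a type-specific strategy can be sketched as follows. This is a simplified model under stated assumptions, not the PostgreSQL API: the classes `WordStrategy` and `GenericInvertedIndex` and their method names are hypothetical, and the query semantics ("the row must contain every word of the query") is just one example of what a strategy might define.

```python
from collections import defaultdict


class WordStrategy:
    """Hypothetical strategy for text items: keys are lowercase words."""

    def extract_keys(self, item):
        # How keys are extracted from an indexed item.
        return set(item.lower().split())

    def extract_query_keys(self, query):
        # How keys are extracted from a query condition.
        return set(query.lower().split())

    def consistent(self, present_keys, query_keys):
        # Decide whether a row holding `present_keys` (the subset of query
        # keys found for it) actually satisfies the query. Assumed semantics:
        # every query key must be present.
        return query_keys <= present_keys


class GenericInvertedIndex:
    """Index code that knows nothing about the data type it indexes."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.postings = defaultdict(set)  # key -> set of row IDs

    def insert(self, row_id, item):
        for key in self.strategy.extract_keys(item):
            self.postings[key].add(row_id)

    def search(self, query):
        query_keys = self.strategy.extract_query_keys(query)
        found = defaultdict(set)  # row ID -> query keys found in that row
        for key in query_keys:
            for row_id in self.postings.get(key, ()):
                found[row_id].add(key)
        # Rows containing some query keys are only candidates; the strategy
        # makes the final decision.
        return sorted(
            row_id
            for row_id, present in found.items()
            if self.strategy.consistent(present, query_keys)
        )


idx = GenericInvertedIndex(WordStrategy())
idx.insert(1, "GIN stores keys")
idx.insert(2, "GiST stores bounding boxes")
idx.insert(3, "keys and posting lists")

print(idx.search("stores keys"))  # [1]
print(idx.search("keys"))         # [1, 3]
```

Supporting a different data type in this sketch would mean supplying another strategy object with the same three methods; `GenericInvertedIndex` itself would not change.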

One advantage of GIN is that it allows the development of custom data types with the appropriate access methods, by an expert in the domain of the data type, rather than a database expert. This is much the same advantage as using GiST.

The GIN implementation in PostgreSQL is primarily maintained by Teodor Sigaev and Oleg Bartunov. There is more information about GIN on their website.