PostgreSQL Chinese Operation Guide

53.51. pg_statistic

The catalog pg_statistic stores statistical data about the contents of the database. Entries are created by ANALYZE and subsequently used by the query planner. Note that all the statistical data is inherently approximate, even assuming that it is up-to-date.

Normally there is one entry, with stainherit = false, for each table column that has been analyzed. If the table has inheritance children or partitions, a second entry with stainherit = true is also created. This row represents the column’s statistics over the inheritance tree, i.e., statistics for the data you’d see with SELECT column FROM table*, whereas the stainherit = false row represents the results of SELECT column FROM ONLY table.

pg_statistic also stores statistical data about the values of index expressions. These are described as if they were actual data columns; in particular, starelid references the index. No entry is made for an ordinary non-expression index column, however, since it would be redundant with the entry for the underlying table column. Currently, entries for index expressions always have stainherit = false.

Since different kinds of statistics might be appropriate for different kinds of data, pg_statistic is designed not to assume very much about what sort of statistics it stores. Only extremely general statistics (such as nullness) are given dedicated columns in pg_statistic. Everything else is stored in “slots”, which are groups of associated columns whose content is identified by a code number in one of the slot’s columns. For more information see src/include/catalog/pg_statistic.h.
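The slot layout can be pictured with a small sketch. It assumes the kind codes match the STATISTIC_KIND_* constants in src/include/catalog/pg_statistic.h and that a row has five slots; both should be checked against the header for your server version. The row itself is a hypothetical, hand-written dictionary rather than real catalog output, and find_slot is an illustrative helper, not a PostgreSQL function.

```python
# Slot kind codes, assumed to mirror the STATISTIC_KIND_* constants in
# src/include/catalog/pg_statistic.h (verify against your version's header).
STATISTIC_KIND_MCV = 1          # most common values
STATISTIC_KIND_HISTOGRAM = 2    # histogram bounds
STATISTIC_KIND_CORRELATION = 3  # physical vs. logical ordering correlation

NUM_SLOTS = 5  # assumed number of slots per row (slots 1..5)


def find_slot(row, kind):
    """Return the slot's columns as a dict, or None if no slot has this kind."""
    for n in range(1, NUM_SLOTS + 1):
        # A missing or zero stakindN is treated as an unused slot.
        if row.get(f"stakind{n}", 0) == kind:
            return {
                "slot": n,
                "op": row.get(f"staop{n}", 0),
                "coll": row.get(f"stacoll{n}", 0),
                "numbers": row.get(f"stanumbers{n}"),
                "values": row.get(f"stavalues{n}"),
            }
    return None


# Hypothetical row: an MCV list in slot 1 and a histogram in slot 2.
example_row = {
    "stanullfrac": 0.0,
    "stakind1": STATISTIC_KIND_MCV,
    "stanumbers1": [0.5, 0.25],
    "stavalues1": ["red", "blue"],
    "stakind2": STATISTIC_KIND_HISTOGRAM,
    "stavalues2": ["amber", "green", "violet"],
}

mcv = find_slot(example_row, STATISTIC_KIND_MCV)
print(mcv["slot"], mcv["values"], mcv["numbers"])
```

The point of the design is that a reader never assumes which slot holds which statistic: it scans the stakindN codes and reads the matching staopN, stacollN, stanumbersN, and stavaluesN columns.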

pg_statistic should not be readable by the public, since even statistical information about a table’s contents might be considered sensitive. (Example: minimum and maximum values of a salary column might be quite interesting.) pg_stats is a publicly readable view on pg_statistic that only exposes information about those tables that are readable by the current user.

Table 53.51. pg_statistic Columns

Column Type

Description

starelid oid (references pg_class.oid)

The table or index that the described column belongs to

staattnum int2 (references pg_attribute.attnum)

The number of the described column

stainherit bool

If true, the stats include values from child tables, not just the values in the specified relation

stanullfrac float4

The fraction of the column’s entries that are null

stawidth int4

The average stored width, in bytes, of nonnull entries

stadistinct float4

The number of distinct nonnull data values in the column. A value greater than zero is the actual number of distinct values. A value less than zero is the negative of a multiplier for the number of rows in the table; for example, a column in which about 80% of the values are nonnull and each nonnull value appears about twice on average could be represented by stadistinct = -0.4. A zero value means the number of distinct values is unknown.

stakindN int2

A code number indicating the kind of statistics stored in the Nth “slot” of the pg_statistic row.

staopN oid (references pg_operator.oid)

An operator used to derive the statistics stored in the Nth “slot”. For example, a histogram slot would show the < operator that defines the sort order of the data. Zero if the statistics kind does not require an operator.

stacollN oid (references pg_collation.oid)

The collation used to derive the statistics stored in the Nth “slot”. For example, a histogram slot for a collatable column would show the collation that defines the sort order of the data. Zero for noncollatable data.

stanumbersN float4[]

Numerical statistics of the appropriate kind for the Nth “slot”, or null if the slot kind does not involve numerical values

stavaluesN anyarray

Column data values of the appropriate kind for the Nth “slot”, or null if the slot kind does not store any data values. Each array’s element values are actually of the specific column’s data type, or a related type such as an array’s element type, so there is no way to define these columns' type more specifically than anyarray.
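
The sign convention for stadistinct is easy to misread, so the following minimal sketch shows how a consumer of this catalog might turn the stored value into an estimated distinct-value count. The function name and the row_count argument are illustrative assumptions and not part of PostgreSQL; the planner's own handling covers more cases than this.

```python
def estimated_distinct(stadistinct, row_count):
    """Interpret stadistinct as described in the column table.

    > 0: the value is the number of distinct values itself
    < 0: the value is the negated multiplier of the table's row count
    = 0: the number of distinct values is unknown, returned as None
    """
    if stadistinct > 0:
        return stadistinct
    if stadistinct < 0:
        return -stadistinct * row_count
    return None


# The example from the table: about 80% of the values are nonnull and each
# nonnull value appears about twice, so the multiplier is 0.8 / 2 = 0.4 and
# the stored value would be -0.4.
multiplier = 0.8 / 2
print(estimated_distinct(-multiplier, 1000))  # scaled by a hypothetical row count
print(estimated_distinct(25, 1000))           # positive value is used as-is
print(estimated_distinct(0, 1000))            # zero means unknown
```

Storing a negative multiplier rather than a fixed count lets the estimate scale with the table as it grows, without requiring the entry to be rewritten.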