Postgresql 中文操作指南

F.46. tsm_system_rows — the SYSTEM_ROWS sampling method for TABLESAMPLE #

tsm_system_rows 模块提供表采样方法 SYSTEM_ROWS ,它可以在 SELECT 命令的 TABLESAMPLE 子句中使用。

The tsm_system_rows module provides the table sampling method SYSTEM_ROWS, which can be used in the TABLESAMPLE clause of a SELECT command.

此表采样方法接受一个单个整数参数,该参数是要读取的最大行数。除非表未包含足够行(在这种情况下将选择整个表),否则最终样本将总是包含恰好那么多行。

This table sampling method accepts a single integer argument that is the maximum number of rows to read. The resulting sample will always contain exactly that many rows, unless the table does not contain enough rows, in which case the whole table is selected.

与内置 SYSTEM 采样方法类似,SYSTEM_ROWS 执行块级采样,因此样本不是完全随机的,但可能受群集效应影响,尤其是当仅请求少数行时。

Like the built-in SYSTEM sampling method, SYSTEM_ROWS performs block-level sampling, so that the sample is not completely random but may be subject to clustering effects, especially if only a small number of rows are requested.

SYSTEM_ROWS 不支持 REPEATABLE 子句。

SYSTEM_ROWS does not support the REPEATABLE clause.

此模块被认为是“受信任的”,也就是说,它可以由在当前数据库上具有 CREATE 权限的非超级用户安装。

This module is considered “trusted”, that is, it can be installed by non-superusers who have CREATE privilege on the current database.

F.46.1. Examples #

以下是如何使用 SYSTEM_ROWS 选择表的样本的示例。首先安装该扩展:

Here is an example of selecting a sample of a table with SYSTEM_ROWS. First install the extension:

CREATE EXTENSION tsm_system_rows;

然后,你可以在 SELECT 命令中使用它,例如:

Then you can use it in a SELECT command, for instance:

SELECT * FROM my_table TABLESAMPLE SYSTEM_ROWS(100);

此命令将从表 my_table 中返回 100 行样本(除非该表没有 100 行可见行,在这种情况下将返回其所有行)。

This command will return a sample of 100 rows from the table my_table (unless the table does not have 100 visible rows, in which case all its rows are returned).