Teradata 简明教程
Teradata - Performance Tuning
本章讨论 Teradata 中的性能调优过程。
This chapter discusses the procedure of performance tuning in Teradata.
Explain
性能调优的第一步是针对查询使用 EXPLAIN。EXPLAIN 计划提供优化器执行查询的详细信息。在 EXPLAIN 计划中,检查关键字(如置信级别、已使用的联接策略、假脱机文件大小、重新分配等)。
The first step in performance tuning is the use of EXPLAIN on your query. EXPLAIN plan gives the details of how optimizer will execute your query. In the Explain plan, check for the keywords like confidence level, join strategy used, spool file size, redistribution, etc.
Collect Statistics
优化器使用数据人口统计数据提出有效的执行策略。COLLECT STATISTICS 命令用于收集表的数据人口统计数据。确保收集的列上的统计数据是最新的。
Optimizer uses Data demographics to come up with effective execution strategy. COLLECT STATISTICS command is used to collect data demographics of the table. Make sure that the statistics collected on the columns are up to date.
-
Collect statistics on the columns that are used in WHERE clause and on the columns used in the joining condition.
-
Collect statistics on the Unique Primary Index columns.
-
Collect statistics on Non Unique Secondary Index columns. Optimizer will decide if it can use NUSI or Full Table Scan.
-
Collect statistics on the Join Index though the statistics on base table is collected.
-
Collect statistics on the partitioning columns.
Data Types
确保使用适当的数据类型。这将避免使用超出所需的大量存储空间。
Make sure that proper data types are used. This will avoid the use of excessive storage than required.
Conversion
确保联接条件所用列的数据类型兼容,以避免显式数据转换。
Make sure that the data types of the columns used in join condition are compatible to avoid explicit data conversions.
Spool Space Issue
如果查询超过用户为每个 AMP 分配的卷轴空间限制,则会生成卷轴空间错误。验证执行计划并确定占用更多卷轴空间的步骤。可以将这些中间查询拆分并分别放置以构建临时表。
Spool space error is generated if the query exceeds per AMP spool space limit for that user. Verify the explain plan and identify the step that consumes more spool space. These intermediate queries can be split and put as separately to build temporary tables.
Primary Index
确保表的主索引正确定义。主索引列应均匀分布数据,并且应经常用于访问数据。
Make sure that the Primary Index is correctly defined for the table. The primary index column should evenly distribute the data and should be frequently used to access the data.
SET Table
如果您定义 SET 表,则优化器会检查对于所插入的每条记录,是否存在重复记录。若要删除重复检查条件,您可以为表定义唯一二级索引。
If you define a SET table, then the optimizer will check if the record is duplicate for each and every record inserted. To remove the duplicate check condition, you can define Unique Secondary Index for the table.
UPDATE on Large Table
更新大型表会很耗时。您可以删除记录并插入包含修改后的行的新记录,而不是更新表。
Updating the large table will be time consuming. Instead of updating the table, you can delete the records and insert the records with modified rows.