Obiee 简明教程

OBIEE – Dimensional Modeling

维度建模提供了一组方法和概念,用于 DW 设计。根据 DW 顾问 Ralph Kimball 的说法,维度建模是一种数据库设计技术,旨在支持数据仓库中的最终用户查询。它以可理解性和性能为导向。他认为,虽然面向事务的 ER 对于事务捕获非常有用,但在最终用户交付中应避免这种情况。

Dimensional modeling provides set of methods and concepts that are used in DW design. According to DW consultant, Ralph Kimball, dimensional modeling is a design technique for databases intended to support end-user queries in a data warehouse. It is oriented around understandability and performance. According to him, although transaction-oriented ER is very useful for the transaction capture, it should be avoided for end-user delivery.

维度建模始终使用事实表和维度表。事实是可以在事实值上汇总和分析的数值。维度定义层次结构和对事实值的描述。

Dimensional modeling always uses facts and dimension tables. Facts are numerical values which can be aggregated and analyzed on the fact values. Dimensions define hierarchies and description on fact values.

Dimension Table

维度表存储描述事实表中对象的属性。维度表具有唯一标识每个维度行的主键。此键用于将维度表与事实表关联。

Dimension table stores the attributes that describe objects in a Fact table. A Dimension table has a primary key that uniquely identifies each dimension row. This key is used to associate the Dimension table to a Fact table.

维度表通常是非规范化的,因为它们不是为了执行事务而创建的,而仅仅用于对数据进行详细分析。

Dimension tables are normally de-normalized as they are not created to execute transactions and only used to analyze data in detail.

Example

在以下维度表中,客户维度通常包括客户姓名、地址、客户 ID、性别、收入组、教育水平等信息。

In the following dimension table, the customer dimension normally includes the name of customers, address, customer id, gender, income group, education levels, etc.

Customer ID

Name

Gender

Income

Education

Religion

1

Brian Edge

M

2

3

4

2

Fred Smith

M

3

5

1

3

Sally Jones

F

1

7

3

Fact Tables

事实表包含被称为度量的数值。事实表具有两种类型的列 - 事实和维度表的外国键。

Fact table contains numeric values that are known as measurements. A Fact table has two types of columns − facts and foreign key to dimension tables.

事实表中的度量分为三种类型:

Measures in Fact table are of three types −

  1. Additive − Measures that can be added across any dimension.

  2. Non-Additive − Measures that cannot be added across any dimension.

  3. Semi-Additive − Measures that can be added across some dimensions.

Example

Time ID

Product ID

Customer ID

Unit Sold

4

17

2

1

8

21

3

2

8

4

1

1

此事实表包含时间维度、产品维度、客户维度和度量值单位售出的外键。

This fact tables contains foreign keys for time dimension, product dimension, customer dimension and measurement value unit sold.

假设一家公司向客户销售产品。每次销售都是在公司内发生的事实,而事实表用于记录这些事实。

Suppose a company sells products to customers. Every sale is a fact that happens within the company, and the fact table is used to record these facts.

一般事实是对外销售的数量、利润、销售收入等。维度表列出一些因素(如客户、时间、产品等),按这些因素可对数据进行分析。

Common facts are − number of unit sold, margin, sales revenue, etc. The dimension table list factors like customer, time, product, etc. by which we want to analyze the data.

现在,如果我们考虑上述事实表和客户维度,那么也会有产品和时间维度。给定了此事实表和这三个维度表,我们可以询问诸如:2010 年有多少块手表卖给了男性客户?

Now if we consider the above Fact table and Customer dimension then there will also be a Product and time dimension. Given this fact table and these three dimension tables, we can ask questions like: How many watches were sold to male customers in 2010?

Difference between Dimension and Fact Table

维度表和事实表的功能区别在于,事实表保存我们想要分析的数据,而维度表保存允许我们对其进行查询所需的信息。

The functional difference between dimension tables and fact tables is that fact tables hold the data we want to analyze and dimension tables hold the information required to allow us to query it.

Aggregate Table

聚合表包含可通过使用不同的聚合函数计算的聚合数据。

Aggregate table contains aggregated data which can be calculated by using different aggregate functions.

aggregate function 是一个函数,在该函数中,多行值按特定条件合并为一个输入,以形成更重要意义或测量值的单个值。

An aggregate function is a function where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning or measurement.

常见的聚合函数包括 −

Common aggregate functions include −

  1. Average()

  2. Count()

  3. Maximum()

  4. Median()

  5. Minimum()

  6. Mode()

  7. Sum()

这些聚合表用于性能优化,以在数据仓库中运行复杂查询。

These aggregate tables are used for performance optimization to run complex queries in a data warehouse.

Example

您将聚合数据(如年(1 行)、季度(4 行)、月(12 行))保存到表中,现在您必须对数据进行比较,例如,每年只处理 1 行。然而,在未聚合表格中,所有行都将被处理。

You save tables with aggregated data like yearly (1 row), quarterly (4 rows), monthly (12 rows) and now you have to do comparison of data, like Yearly only 1 row will be processed. However in an un-aggregated table, all the rows will be processed.

MIN

Returns the smallest value in a given column

MAX

Returns the largest value in a given column

SUM

Returns the sum of the numeric values in a given column

AVG

Returns the average value of a given column

COUNT

Returns the total number of values in a given column

COUNT (*)

Returns the number of rows in a table

从 where title = ‘developer’ 的 employee 中选择 Avg (salary)。此语句将返回所有职称为“Developer”的员工的平均工资。

Select Avg (salary) from employee where title = ‘developer’. This statement will return the average salary for all employees whose title is equal to 'Developer'.

可在数据库级别应用聚合。您可以在数据库中创建聚合并将它们保存在聚合表中,也可以在报告级别动态地应用聚合。

Aggregations can be applied at database level. You can create aggregates and save them in aggregate tables in the database or you can apply aggregate on the fly at the report level.

Note − 如果在数据库级别保存聚合,它将节省时间并提供性能优化。

Note − If you save aggregates at the database level it saves time and provides performance optimization.