DWH Concise Tutorial
Data Warehousing - System Managers
System management is essential to the successful implementation of a data warehouse. The most important system managers are −
- System configuration manager
- System scheduling manager
- System event manager
- System database manager
- System backup recovery manager
System Configuration Manager
- The system configuration manager is responsible for managing the setup and configuration of the data warehouse.
- The structure of the configuration manager varies from one operating system to another.
- On Unix, the structure of the configuration manager varies from vendor to vendor.
- Configuration managers have a single user interface.
- The interface of the configuration manager allows us to control all aspects of the system.
Note − The most important configuration tool is the I/O manager.
System Scheduling Manager
The system scheduling manager is responsible for the successful implementation of the data warehouse. Its purpose is to schedule ad hoc queries. Every operating system has its own scheduler with some form of batch control mechanism. A system scheduling manager must have the following features −
- Work across cluster or MPP boundaries
- Deal with international time differences
- Handle job failure
- Handle multiple queries
- Support job priorities
- Restart or re-queue failed jobs
- Notify the user or a process when a job is completed
- Maintain the job schedules across system outages
- Re-queue jobs to other queues
- Support the stopping and starting of queues
- Log queued jobs
- Deal with inter-queue processing
Note − The list above can be used as a set of criteria for evaluating a scheduler.
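A few of the features listed above (job priorities, failure handling, re-queuing failed jobs, logging queued jobs, and completion notices) can be illustrated with a small in-memory queue. This is only a sketch under stated assumptions, not how any particular scheduler works: the names `Job` and `JobQueue`, the "lower number runs first" priority convention, and the retry limit are all hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Job:
    name: str
    action: Callable[[], None]
    priority: int = 5        # assumed convention: lower number runs first
    max_attempts: int = 3    # assumed retry limit before the job is reported as failed
    attempts: int = 0


class JobQueue:
    """Toy queue: priorities, failure handling, re-queuing, logging, notification."""

    def __init__(self, notify: Callable[[str, str], None]) -> None:
        self._heap: List[Tuple[int, int, Job]] = []
        self._counter = itertools.count()  # tie-breaker: FIFO order within one priority
        self._notify = notify
        self.log: List[str] = []           # record of every job that was queued

    def submit(self, job: Job) -> None:
        heapq.heappush(self._heap, (job.priority, next(self._counter), job))
        self.log.append(f"queued {job.name}")

    def run_all(self) -> None:
        while self._heap:
            _, _, job = heapq.heappop(self._heap)
            job.attempts += 1
            try:
                job.action()
            except Exception as exc:
                if job.attempts < job.max_attempts:
                    self.submit(job)       # re-queue the failed job
                else:
                    self._notify(job.name, f"failed: {exc}")
            else:
                self._notify(job.name, "completed")


# Usage: a data load that fails once, then succeeds, runs ahead of a backup.
messages = []
queue = JobQueue(notify=lambda name, status: messages.append((name, status)))
load_calls = {"n": 0}


def flaky_load() -> None:
    load_calls["n"] += 1
    if load_calls["n"] < 2:
        raise RuntimeError("source unavailable")


queue.submit(Job("backup", lambda: None, priority=9))
queue.submit(Job("data_load", flaky_load, priority=1))
queue.run_all()
print(messages)  # [('data_load', 'completed'), ('backup', 'completed')]
```

A real scheduling manager would also persist the queue so that schedules survive system outages, which this in-memory sketch does not attempt.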
Some important jobs that a scheduler must be able to handle are as follows −
- Daily and ad hoc query scheduling
- Execution of regular report requirements
- Data load
- Data processing
- Index creation
- Backup
- Aggregation creation
- Data transformation
Note − If the data warehouse is running on a cluster or MPP architecture, then the system scheduling manager must be capable of running across the architecture.
System Event Manager
The event manager is a piece of software that manages the events defined on the data warehouse system. A data warehouse cannot be managed manually because its structure is very complex. We therefore need a tool that handles all events automatically, without user intervention.
Note − The event manager monitors event occurrences and deals with them. It also tracks the myriad things that can go wrong on this complex data warehouse system.
Events
Events are actions generated by the user or by the system itself. An event is a measurable, observable occurrence of a defined action.
The following common events need to be tracked −
- Hardware failure
- Running out of space on certain key disks
- A process dying
- A process returning an error
- CPU usage exceeding an 80% threshold
- Internal contention on database serialization points
- Buffer cache hit ratios exceeding or falling below threshold
- A table reaching its maximum size
- Excessive memory swapping
- A table failing to extend due to lack of space
- Disk exhibiting I/O bottlenecks
- Usage of temporary or sort area reaching certain thresholds
- Any other database shared memory usage
The most important thing about events is that they should be capable of executing on their own. Event packages define the procedures for the predefined events. The code associated with each event is known as the event handler. This code is executed whenever the event occurs.
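The relationship between events and event handlers can be sketched as a small registry that runs the handler code whenever an event is raised. Everything named here is hypothetical: `Event`, `EventManager`, `check_metrics`, the event names, and the default thresholds are assumptions chosen for illustration, not part of any specific product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    name: str     # hypothetical event name, e.g. "cpu_high"
    detail: str   # human-readable description of what was observed


class EventManager:
    """Maps event names to handlers and runs them without user intervention."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def on(self, name: str, handler: Callable[[Event], None]) -> None:
        """Register a handler for a predefined event."""
        self._handlers.setdefault(name, []).append(handler)

    def raise_event(self, event: Event) -> int:
        """Run every handler registered for the event; return how many ran."""
        handlers = self._handlers.get(event.name, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


def check_metrics(
    manager: EventManager,
    cpu_percent: float,
    disk_free_fraction: float,
    cpu_threshold: float = 80.0,  # assumed default threshold
    min_free: float = 0.10,       # assumed: alert below 10% free space
) -> None:
    """Turn raw measurements into events when a threshold is crossed."""
    if cpu_percent > cpu_threshold:
        manager.raise_event(Event("cpu_high", f"CPU at {cpu_percent:.0f}%"))
    if disk_free_fraction < min_free:
        manager.raise_event(Event("disk_low", f"{disk_free_fraction:.0%} free"))


# Usage: only the CPU threshold is crossed, so only that handler runs.
alerts = []
manager = EventManager()
manager.on("cpu_high", lambda e: alerts.append(f"{e.name}: {e.detail}"))
manager.on("disk_low", lambda e: alerts.append(f"{e.name}: {e.detail}"))
check_metrics(manager, cpu_percent=93.0, disk_free_fraction=0.25)
print(alerts)  # ['cpu_high: CPU at 93%']
```

In practice the measurements would come from operating system and database monitoring, and a handler would typically page an operator or start a corrective job instead of appending to a list.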
System and Database Manager
The system manager and the database manager may be two separate pieces of software, but they do the same job. The objective of these tools is to automate certain processes and to simplify the execution of others. When choosing a system and database manager, check that it can do the following −
- increase a user's quota
- assign and de-assign roles to the users
- assign and de-assign profiles to the users
- perform database space management
  - monitor and report on space usage
  - tidy up fragmented and unused space
  - add and expand the space
- add and remove users
- manage user passwords
- manage summary or temporary tables
  - assign or de-assign temporary space to and from the user
  - reclaim the space from old or out-of-date temporary tables
- manage error and trace logs
  - browse log and trace files
  - redirect error or trace information
  - switch error and trace logging on and off
- perform system space management
  - monitor and report on space usage
  - clean up old and unused file directories
  - add or expand space
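Two of the system space management tasks above, reporting on space usage and finding old files to clean up, can be sketched with the Python standard library. This is a minimal illustration only: the function names `space_report` and `stale_files` and the age cut-off are assumptions, and the sketch deliberately lists clean-up candidates without deleting anything.

```python
import os
import shutil
import time
from pathlib import Path
from typing import List, Optional


def space_report(path: str) -> dict:
    """Report usage of the file system that holds `path` (values in bytes)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    return {
        "path": path,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "used_fraction": usage.used / usage.total if usage.total else 0.0,
    }


def stale_files(
    directory: str, max_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """List files under `directory` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p
        for p in Path(directory).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

A call such as `stale_files("/var/log/dwh", max_age_days=30)` (the path is a made-up example) would return the trace or log files that a clean-up job might then archive or remove.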
System Backup Recovery Manager
The backup and recovery tool makes it easy for operations and management staff to back up the data. Note that the system backup manager must be integrated with the schedule manager software being used. The important features required for managing backups are as follows −
- Scheduling
- Backup data tracking
- Database awareness
Backups are taken only to protect against data loss. The following points are important to remember −
- The backup software keeps some form of database recording where and when each piece of data was backed up.
- The backup recovery manager must have a good front end to that database.
- The backup recovery software should be database aware.
- Being database aware, the software can be addressed in database terms and will not perform backups that would not be viable.
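The "database of where and when the piece of data was backed up" can be pictured as a small catalog table. The sketch below uses SQLite purely for illustration; the class name `BackupCatalog`, the table layout, and the example locations are assumptions, not the format of any real backup product.

```python
import sqlite3
from typing import Optional, Tuple


class BackupCatalog:
    """Tracks where and when each database object was backed up."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS backups ("
            " object_name TEXT NOT NULL,"   # e.g. a table or tablespace name
            " location TEXT NOT NULL,"      # e.g. a tape label or file path
            " taken_at TEXT NOT NULL)"      # ISO 8601 UTC timestamp
        )

    def record(self, object_name: str, location: str, taken_at: str) -> None:
        """Store one backup entry; the `with` block commits the insert."""
        with self._conn:
            self._conn.execute(
                "INSERT INTO backups VALUES (?, ?, ?)",
                (object_name, location, taken_at),
            )

    def latest(self, object_name: str) -> Optional[Tuple[str, str]]:
        """Return (location, taken_at) of the newest backup, or None."""
        return self._conn.execute(
            "SELECT location, taken_at FROM backups"
            " WHERE object_name = ? ORDER BY taken_at DESC LIMIT 1",
            (object_name,),
        ).fetchone()


# Usage: the front end asks in database terms ("where is the sales table?").
catalog = BackupCatalog()
catalog.record("sales_fact", "tape-0007", "2024-01-01T02:00:00Z")
catalog.record("sales_fact", "tape-0012", "2024-01-08T02:00:00Z")
print(catalog.latest("sales_fact"))  # ('tape-0012', '2024-01-08T02:00:00Z')
```

Ordering by the timestamp text works here only because every entry uses the same ISO 8601 format, which sorts chronologically as a string.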