Data Mining: A Concise Tutorial
Data Mining - Systems
A wide variety of data mining systems is available. Data mining systems may integrate techniques from the following fields −
- Spatial Data Analysis
- Information Retrieval
- Pattern Recognition
- Image Analysis
- Signal Processing
- Computer Graphics
- Web Technology
- Business
- Bioinformatics
Data Mining System Classification
A data mining system can be classified according to the disciplines it draws on −
- Database Technology
- Statistics
- Machine Learning
- Information Science
- Visualization
- Other Disciplines

Apart from these, a data mining system can also be classified by (a) the kind of databases mined, (b) the kind of knowledge mined, (c) the techniques utilized, and (d) the applications it is adapted to.
Classification Based on the Databases Mined
We can classify a data mining system according to the kind of databases it mines. Database systems can themselves be classified by criteria such as data model or type of data, and data mining systems can be classified accordingly.
For example, if we classify databases by data model, we may have a relational, transactional, object-relational, or data warehouse mining system.
Classification Based on the Kind of Knowledge Mined
We can also classify a data mining system according to the kind of knowledge it mines, that is, on the basis of functionalities such as −
- Characterization
- Discrimination
- Association and Correlation Analysis
- Classification
- Prediction
- Outlier Analysis
- Evolution Analysis
Classification Based on the Techniques Utilized
We can classify a data mining system according to the kind of techniques it uses. These techniques can be described by the degree of user interaction involved or by the methods of analysis employed.
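The classification criteria described in this chapter can be combined into a single record per system. The following is a minimal sketch, not a standard API: the class names `DataModel`, `Knowledge`, and `MiningSystemProfile`, the interaction labels, and the sample system are all hypothetical and chosen only for illustration. The enum members mirror the data models and functionalities listed in the text.

```python
from dataclasses import dataclass
from enum import Enum


class DataModel(Enum):
    """Data models named in the text (classification by databases mined)."""
    RELATIONAL = "relational"
    TRANSACTIONAL = "transactional"
    OBJECT_RELATIONAL = "object-relational"
    DATA_WAREHOUSE = "data warehouse"


class Knowledge(Enum):
    """Functionalities named in the text (classification by knowledge mined)."""
    CHARACTERIZATION = "characterization"
    DISCRIMINATION = "discrimination"
    ASSOCIATION_CORRELATION = "association and correlation analysis"
    CLASSIFICATION = "classification"
    PREDICTION = "prediction"
    OUTLIER_ANALYSIS = "outlier analysis"
    EVOLUTION_ANALYSIS = "evolution analysis"


@dataclass(frozen=True)
class MiningSystemProfile:
    """Hypothetical record placing one system on each classification axis."""
    name: str
    data_model: DataModel
    knowledge: frozenset   # set of Knowledge members
    interaction: str       # illustrative label for the degree of user interaction
    application: str       # illustrative label for the target application

    def describe(self) -> str:
        kinds = ", ".join(sorted(k.value for k in self.knowledge))
        return (
            f"{self.name}: {self.data_model.value} mining system "
            f"({kinds}; {self.interaction}; {self.application})"
        )


# A made-up system used only to exercise the record.
profile = MiningSystemProfile(
    name="demo",
    data_model=DataModel.RELATIONAL,
    knowledge=frozenset({Knowledge.CLASSIFICATION, Knowledge.OUTLIER_ANALYSIS}),
    interaction="interactive",
    application="retail",
)
print(profile.describe())
# prints: demo: relational mining system (classification, outlier analysis; interactive; retail)
```

A system that mines several kinds of knowledge simply carries several `Knowledge` members, which is why that field is a set rather than a single value.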
Integrating a Data Mining System with a DB/DW System
When a data mining system is not integrated with a database or data warehouse system, it has no such system to communicate with. This arrangement is known as the no coupling scheme. Here the main focus is on the design of the data mining system itself and on developing efficient and effective algorithms for mining the available data sets.
The integration schemes are as follows −
- No Coupling − In this scheme, the data mining system does not use any database or data warehouse functions. It fetches data from a particular source (such as a file), processes it with data mining algorithms, and stores the result in another file.
- Loose Coupling − In this scheme, the data mining system may use some functions of the database or data warehouse system. It fetches data from the data repository managed by these systems, performs data mining on that data, and then stores the result either in a file or in a designated place in the database or data warehouse.
- Semi-tight Coupling − In this scheme, the data mining system is linked with a database or data warehouse system and, in addition, efficient implementations of a few data mining primitives can be provided in the database.
- Tight Coupling − In this scheme, the data mining system is smoothly integrated into the database or data warehouse system. The data mining subsystem is treated as one functional component of an information system.
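The difference between the first three schemes can be sketched with Python's standard `sqlite3` and `csv` modules. This is an illustration under stated assumptions, not a reference implementation: the toy data, the `purchases` and `frequent_items` tables, the `MIN_SUPPORT` threshold, and the function names are all made up here, and support counting stands in for a real mining algorithm. In the semi-tight variant, SQL aggregation plays the role of a mining primitive evaluated inside the database.

```python
import csv
import io
import sqlite3
from collections import Counter

# Toy (transaction_id, item) pairs; purely illustrative data.
ROWS = [(1, "milk"), (1, "bread"), (2, "milk"), (2, "eggs"), (3, "milk"), (3, "bread")]
MIN_SUPPORT = 2  # hypothetical threshold: keep items bought at least twice


def frequent(counts, min_support):
    """Keep items whose count reaches min_support, as a sorted list of pairs."""
    return sorted((item, n) for item, n in counts.items() if n >= min_support)


def no_coupling(source_file, result_file):
    """Read a flat file, mine in application code, write the result to another file."""
    counts = Counter(row["item"] for row in csv.DictReader(source_file))
    result = frequent(counts, MIN_SUPPORT)
    csv.writer(result_file).writerows(result)
    return result


def loose_coupling(conn):
    """Fetch rows through the database, mine in application code, store the result in a table."""
    counts = Counter(item for (item,) in conn.execute("SELECT item FROM purchases"))
    result = frequent(counts, MIN_SUPPORT)
    conn.execute("CREATE TABLE IF NOT EXISTS frequent_items (item TEXT, support INTEGER)")
    conn.executemany("INSERT INTO frequent_items VALUES (?, ?)", result)
    return result


def semi_tight_coupling(conn):
    """Let the database evaluate a primitive (support counting via aggregation)."""
    query = (
        "SELECT item, COUNT(*) AS support FROM purchases "
        "GROUP BY item HAVING COUNT(*) >= ? ORDER BY item"
    )
    return conn.execute(query, (MIN_SUPPORT,)).fetchall()


# Build the flat-file source and the in-memory database from the same toy rows.
flat = io.StringIO()
writer = csv.writer(flat)
writer.writerow(["tid", "item"])
writer.writerows(ROWS)
flat.seek(0)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (tid INTEGER, item TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?)", ROWS)

out_file = io.StringIO()
results = {
    "no coupling": no_coupling(flat, out_file),
    "loose coupling": loose_coupling(conn),
    "semi-tight coupling": semi_tight_coupling(conn),
}
for scheme, found in results.items():
    print(scheme, found)
```

All three variants return the same frequent items; what changes is where the work happens and where the result is stored. Tight coupling is not shown, since it would require the mining logic to live inside the database system itself.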