Excel Data Analysis: A Concise Tutorial
Data Analysis - Overview
Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
Types of Data Analysis
Data analysis techniques span many domains, such as business, science, and social science, and go by a variety of names. The major data analysis approaches are −
- Data Mining
- Business Intelligence
- Statistical Analysis
- Predictive Analytics
- Text Analytics
Data Mining
Data mining is the analysis of large quantities of data to extract previously unknown, interesting patterns, unusual records, and dependencies. Note that the goal is to extract patterns and knowledge from large amounts of data, not to extract the data itself.
Data mining draws on computer science methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.
The patterns obtained from data mining can be regarded as a summary of the input data. They can be used in further analysis, or fed into a decision support system to obtain more accurate predictions.
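As a minimal sketch of what spotting "unusual data" can look like, the Python snippet below flags values that lie far from the mean of a small series. The z-score rule, the threshold of 2.0, the function name `flag_unusual`, and the sample figures are all assumptions chosen for illustration; real data mining tools use far richer methods.

```python
from statistics import mean, stdev

def flag_unusual(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - centre) / spread > threshold]

# Hypothetical daily order counts, invented for this example.
daily_orders = [10, 12, 11, 13, 12, 11, 95]
unusual = flag_unusual(daily_orders)
print(unusual)
```

The flagged values are a tiny summary of the input: they say where to look next, not what the underlying cause is.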
Business Intelligence
Business intelligence techniques and tools acquire and transform large amounts of unstructured business data to help identify, develop, and create new strategic business opportunities.
The goal of business intelligence is to make large volumes of data easy to interpret so that new opportunities can be identified. It helps in implementing an effective strategy based on insights that can give a business a competitive market advantage and long-term stability.
Statistical Analysis
Statistics is the study of the collection, analysis, interpretation, presentation, and organization of data.
In data analysis, two main statistical methodologies are used −
- Descriptive statistics − Data from the entire population or from a sample is summarized with numerical descriptors such as −
  - Mean and standard deviation for continuous data
  - Frequency and percentage for categorical data
- Inferential statistics − Patterns in the sample data are used to draw inferences about the population the sample represents, while accounting for randomness. These inferences can be −
  - Answering yes/no questions about the data (hypothesis testing)
  - Estimating numerical characteristics of the data (estimation)
  - Describing associations within the data (correlation)
  - Modeling relationships within the data (e.g. regression analysis)
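The descriptors and the correlation measure named in the list above can be sketched in a few lines of Python. The sample data and the function names (`describe_continuous`, `describe_categorical`, `pearson_correlation`) are hypothetical and exist only to make the definitions concrete.

```python
from collections import Counter
from statistics import mean, stdev

def describe_continuous(values):
    """Mean and sample standard deviation of a numeric list."""
    return {"mean": mean(values), "stdev": stdev(values)}

def describe_categorical(labels):
    """Map each category to its (frequency, percentage)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical sample data, invented for this example.
order_values = [120.0, 135.0, 150.0, 110.0, 160.0]  # continuous
regions = ["East", "West", "East", "North", "East"]  # categorical

print(describe_continuous(order_values))
print(describe_categorical(regions))
print(pearson_correlation([1, 2, 3], [2, 4, 6]))
```

The first two functions only summarize the data in hand. The correlation coefficient describes an association, and whether it reflects the wider population is a question for hypothesis testing or estimation.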
Predictive Analytics
Predictive analytics uses statistical models to analyze current and historical data in order to forecast future or otherwise unknown events. In business, it is used to identify risks and opportunities, which aids decision-making.
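One of the simplest statistical models for forecasting is a straight line fitted to historical data by least squares. The Python sketch below assumes a linear trend, and the monthly sales figures and function names (`fit_line`, `forecast`) are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(xs, ys, next_x):
    """Predict y at next_x by extending the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * next_x

# Hypothetical historical data: month number and sales, invented for this example.
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

print(forecast(months, sales, 6))
```

A straight line is only reasonable when the history really does trend linearly. Seasonal or irregular data calls for a different model.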
Text Analytics
Text analytics, also referred to as text mining or text data mining, is the process of deriving high-quality information from text. Text mining usually involves structuring the input text, deriving patterns within the structured data by means such as statistical pattern learning, and finally evaluating and interpreting the output.
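The three stages described above (structure the text, derive patterns, interpret the output) can be illustrated with a toy term-frequency count in Python. The stop-word list, the sample reviews, and the function names are assumptions made for this sketch. Practical text mining relies on much more capable tokenizers and models.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "and", "of", "to", "was"}

def tokenize(text):
    """Structure raw text into a list of lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_frequencies(documents):
    """Count how often each non-stop-word occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts

# Hypothetical customer reviews, invented for this example.
reviews = [
    "The delivery was late and the box was damaged.",
    "Late delivery, but the product is great.",
    "Great product and great price.",
]

frequencies = term_frequencies(reviews)
print(frequencies.most_common(3))
```

Interpreting the output is the final stage: frequent terms such as "late" and "delivery" hint at a recurring theme that an analyst would then investigate.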
Data Analysis Process
In 1961, the statistician John Tukey defined data analysis as "Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."
Thus, data analysis is a process for obtaining large, unstructured data from various sources and converting it into information that is useful for −
- Answering questions
- Testing hypotheses
- Decision-making
- Disproving theories
Data Analysis with Excel
Microsoft Excel provides several ways to analyze and interpret data. The data can come from various sources, and it can be converted and formatted in several ways. It can then be analyzed with the relevant Excel commands, functions, and tools, including Conditional Formatting, Ranges, Tables, Text functions, Date functions, Time functions, Financial functions, Subtotals, Quick Analysis, Formula Auditing, the Inquire tool, What-If Analysis, Solver, the Data Model, Power Pivot, Power View, and Power Map.
You will learn these data analysis techniques with Excel in two parts −
- Data Analysis with Excel
- Advanced Data Analysis with Excel