Big Data Analytics: A Concise Tutorial

Big Data Adoption and Planning Considerations

Adopting Big Data comes with its own set of challenges and considerations, but with careful planning, organizations can maximize its benefits. Big Data initiatives should be strategic and business-driven. Adopting Big Data can facilitate organizational change: its use can be transformative, but it is more often innovative. Transformation activities are typically low-risk and aim to improve efficiency and effectiveness.

The nature of Big Data and its analytic power raise issues and challenges that need to be planned for at the outset. For example, adopting a new technology raises the question of how to secure it in a way that conforms to existing corporate standards. Tracking the provenance of a dataset from its procurement to its utilization is often a new requirement for organizations. It is also necessary to plan for managing the privacy of constituents whose data is being processed or whose identity is revealed by analytical processes.

All of the aforementioned factors require that an organization recognize and implement a set of distinct governance processes and decision frameworks to ensure that all parties involved understand the nature, consequences, and management requirements of Big Data. The approach to performing business analysis is also changing with the adoption of Big Data, and the Big Data analytics lifecycle is an effective way to structure that work. Several factors need to be considered when implementing Big Data.

The following image depicts Big Data adoption and planning considerations:

[Image: Big Data Adoption and Planning Considerations]

The primary Big Data adoption and planning considerations are as follows:

Organization Prerequisites

Big Data frameworks are not turnkey solutions. Enterprises need data management and Big Data governance frameworks for data analysis and analytics to be useful. Effective processes are also required for implementing, customizing, populating, and using Big Data solutions.

Define Objectives

Outline your aims and objectives for implementing Big Data. Whether the goal is improving the customer experience, optimizing processes, or improving decision-making, clearly defined objectives give decision-makers a direction in which to frame strategy.

Data Procurement

The acquisition of Big Data solutions can be cost-effective, due to the availability of open-source platforms and tools, as well as the potential to leverage commodity hardware. A substantial budget may still be required to obtain external data. Most commercially relevant data will have to be purchased, which may necessitate continuing subscription expenses to ensure the delivery of updates to obtained datasets.

Infrastructure

Evaluate your current infrastructure to see if it can handle big data processing and analytics. Consider whether you need to invest in new hardware, software, or cloud-based solutions to manage the volume, velocity, and variety of data.

Data Strategy

Create a comprehensive data strategy that is aligned with your business objectives. This includes determining what sorts of data are required, where to obtain them, how to store and manage them, and how to ensure their quality and security.

Data Privacy and Security

Analytics on datasets may reveal confidential data about organizations or individuals. Even datasets that contain only benign data can reveal private information when they are analyzed together. Addressing these privacy concerns requires an understanding of the nature of the data being collected, the relevant data privacy regulations, and specific techniques for data tagging and anonymization. Telemetry data, such as a car’s GPS log or smart meter readings, accumulated over a long period, can expose an individual’s location and behavior.
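As an illustration of the anonymization techniques mentioned above, the following Python sketch replaces a direct identifier with a keyed hash and coarsens GPS coordinates before a telemetry record is analyzed. The field names (driver_id, lat, lon, speed_kmh), the key, and the rounding precision are assumptions made for this example and do not come from any particular system. Keyed hashing is pseudonymization rather than full anonymization, so it reduces risk but does not remove it.

```python
import hashlib
import hmac
from typing import Any, Dict, Tuple

# Hypothetical secret; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def coarsen_location(lat: float, lon: float, decimals: int = 2) -> Tuple[float, float]:
    """Round coordinates so that exact addresses are harder to pinpoint."""
    return (round(lat, decimals), round(lon, decimals))


def anonymize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a telemetry record with less identifying detail."""
    lat, lon = coarsen_location(record["lat"], record["lon"])
    return {
        "driver": pseudonymize(record["driver_id"]),
        "lat": lat,
        "lon": lon,
        "speed_kmh": record["speed_kmh"],
    }


# Hypothetical GPS reading from a vehicle.
reading = {"driver_id": "D-1001", "lat": 48.858370, "lon": 2.294481, "speed_kmh": 42.0}
safe_reading = anonymize_record(reading)
```

Rounding to two decimal places keeps locations to roughly a kilometer in latitude. Whether that is coarse enough depends on the dataset and on the applicable privacy rules.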


Security: Securing data networks and repositories with authentication and authorization mechanisms is an essential element of protecting Big Data.


Provenance

Provenance refers to information about the data’s origins and processing. Provenance information is used to determine the validity and quality of data and can also be used for auditing. Maintaining provenance can be difficult because large volumes of data are collected, integrated, and processed across different phases.
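One way to picture provenance tracking is an append-only log that records, for each processing phase, what was done, how many rows remained, and a content hash that an auditor can recompute later. The Python sketch below is a minimal illustration under that assumption. The class name ProvenanceLog, the source label, and the sample rows are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


def fingerprint(rows: List[Dict[str, Any]]) -> str:
    """Hash the dataset content so later audits can detect changes."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ProvenanceLog:
    """Append-only list of what happened to a dataset and when."""
    source: str
    entries: List[Dict[str, Any]] = field(default_factory=list)

    def record(self, step: str, rows: List[Dict[str, Any]]) -> None:
        self.entries.append({
            "step": step,
            "row_count": len(rows),
            "sha256": fingerprint(rows),
            "at": datetime.now(timezone.utc).isoformat(),
        })


# Hypothetical pipeline: collect, then drop incomplete rows.
raw = [{"id": 1, "kwh": 3.2}, {"id": 2, "kwh": None}]
log = ProvenanceLog(source="vendor-feed-example")
log.record("collected", raw)
cleaned = [row for row in raw if row["kwh"] is not None]
log.record("dropped rows with missing kwh", cleaned)
```

Each entry ties a named step to a fingerprint of the data at that point, so a reviewer can check both the order of the phases and whether the data was altered between them.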

Limited Realtime Support

Dashboards and other applications that rely on streaming data and alerts often require real-time or near-real-time data transmission. Many open-source Big Data solutions and tools are batch-oriented; however, a newer generation of real-time-capable open-source technologies supports streaming data processing.
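The difference between batch-oriented and streaming processing can be sketched with a simple average. In the Python example below, the batch function needs the complete dataset before it can answer, while the streaming function yields an updated result after every reading, which is what a live dashboard needs. The function names and sample readings are assumptions made for illustration and are not tied to any specific Big Data framework.

```python
from typing import Iterable, Iterator, List


def batch_mean(readings: List[float]) -> float:
    """Batch style: needs the whole dataset before producing one answer."""
    return sum(readings) / len(readings)


def streaming_mean(readings: Iterable[float]) -> Iterator[float]:
    """Streaming style: emits an updated mean after every new reading."""
    count = 0
    mean = 0.0
    for value in readings:
        count += 1
        mean += (value - mean) / count  # incremental update, no history kept
        yield mean


# Hypothetical sensor readings arriving one at a time.
readings = [10.0, 20.0, 30.0, 40.0]
running = list(streaming_mean(readings))
```

The streaming version keeps only a count and the current mean, so its memory use does not grow with the number of readings, and its final value matches the batch result.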

Distinct Performance Challenges

With the large amounts of data that Big Data solutions must handle, performance is frequently an issue. For example, massive datasets combined with advanced search algorithms can lead to long query times.

Distinct Governance Requirements

Big Data solutions access and generate data, which becomes a corporate asset. A governance structure is essential to ensure that both the data and the solution environment are regulated, standardized, and evolved in a controlled way. Establish strong data governance policies to ensure data quality, integrity, privacy, and compliance with regulations such as GDPR and CCPA. Define data management roles and responsibilities, as well as data access, usage, and security processes.
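Defining roles and data access rules can start as a simple mapping from roles to permitted actions. The Python sketch below is one possible shape for such a check. The role names and permissions are hypothetical, and a real policy would be derived from the organization’s governance framework rather than hard-coded.

```python
from typing import Dict, Set

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "data_steward": {"read", "write", "approve_schema_change"},
    "analyst": {"read"},
    "pipeline_service": {"read", "write"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role is known and grants the action."""
    return action in ROLE_PERMISSIONS.get(role, set())


# Unknown roles are denied by default.
analyst_can_write = is_allowed("analyst", "write")
```

Denying unknown roles by default keeps the check conservative: access has to be granted explicitly, which fits a controlled governance process.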

Distinct Methodology

A mechanism will be necessary to govern the flow of data into and out of Big Data systems.


Organizations will also need to explore how to build feedback loops so that processed data can be refined repeatedly.

Continuous Improvement

Big Data initiatives are iterative and require ongoing development over time. Monitor performance indicators, gather feedback, and fine-tune your strategy to ensure that you are getting the most out of your data investments.

By carefully examining and planning for these factors, organizations can successfully adopt and exploit Big Data to drive innovation, enhance efficiency, and gain a competitive advantage in today’s data-driven world.