Hadoop Tutorial

Hadoop is an open-source framework for storing and processing big data in a distributed environment, across clusters of computers, using simple programming models. It is designed to scale from a single server to thousands of machines, each offering local computation and storage.

This brief tutorial provides a quick introduction to Big Data, the MapReduce algorithm, and the Hadoop Distributed File System (HDFS).

Audience

This tutorial is intended for professionals who want to learn the basics of Big Data analytics with the Hadoop framework and become Hadoop developers. Software professionals, analytics professionals, and ETL developers are the key beneficiaries of this tutorial.

Prerequisites

Before you start this tutorial, we assume that you have prior exposure to Core Java, database concepts, and at least one flavor of the Linux operating system.

Frequently Asked Questions about Hadoop

This section briefly answers some frequently asked questions (FAQ) about Hadoop.