Apache Flume Tutorial

Flume is a standard, simple, robust, flexible, and extensible tool for ingesting data from various data producers, such as web servers, into Hadoop. This tutorial uses simple, illustrative examples to explain the basics of Apache Flume and how to use it in practice.

Audience

This tutorial is intended for professionals who want to learn how to transfer log and streaming data from web servers to HDFS or HBase using Apache Flume.

Prerequisites

To make the most of this tutorial, you should have a good understanding of Hadoop basics and HDFS commands.