MapReduce: A Concise Tutorial
MapReduce - Hadoop Administration
This chapter explains Hadoop administration, which covers both HDFS and MapReduce administration.
- HDFS administration includes monitoring the HDFS file structure, file locations, and updated files.
- MapReduce administration includes monitoring the list of applications, the configuration of nodes, application status, and so on.
HDFS Monitoring
HDFS (Hadoop Distributed File System) contains the user directories, input files, and output files. Use the HDFS file system shell commands put and get (for example, “hadoop fs -put” and “hadoop fs -get”) to store and retrieve files.
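The put and get commands can also be driven from a script. The following is a minimal sketch, assuming the “hadoop” executable is on the PATH and the daemons are running; the helper names and the file paths in the comments are hypothetical and chosen only for illustration.

```python
import subprocess


def hdfs_put_cmd(local_path, hdfs_path):
    """Build the argument list that copies a local file into HDFS."""
    return ["hadoop", "fs", "-put", local_path, hdfs_path]


def hdfs_get_cmd(hdfs_path, local_path):
    """Build the argument list that copies a file from HDFS to local disk."""
    return ["hadoop", "fs", "-get", hdfs_path, local_path]


def run(cmd):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


# Example usage (hypothetical paths; requires a running Hadoop installation):
#   run(hdfs_put_cmd("input.txt", "/user/hadoop/input.txt"))
#   run(hdfs_get_cmd("/user/hadoop/output/part-00000", "result.txt"))
```

Keeping the command construction separate from its execution makes it possible to inspect or log the exact command before anything touches the cluster.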
After starting the Hadoop daemons by running “start-all.sh” from “$HADOOP_HOME/sbin”, open the URL “http://localhost:50070” in your browser (on Hadoop 3.x, the NameNode web UI defaults to port 9870 instead). You should see the following screen.
The following screenshot shows how to browse HDFS.

The following screenshot shows the file structure of HDFS, listing the files in the “/user/hadoop” directory.
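The same directory listing can be retrieved without a browser through the WebHDFS REST API, which the NameNode serves on its web UI port. The sketch below assumes WebHDFS is enabled and the cluster is unsecured; the function names are hypothetical, and the default address simply mirrors the URL used in this chapter.

```python
import json
from urllib.request import urlopen


def liststatus_url(path, namenode="http://localhost:50070"):
    """Build the WebHDFS LISTSTATUS URL for an absolute HDFS path."""
    return f"{namenode}/webhdfs/v1{path}?op=LISTSTATUS"


def parse_liststatus(payload):
    """Return (name, type, length) tuples from a LISTSTATUS response."""
    statuses = payload["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


def list_directory(path, namenode="http://localhost:50070"):
    """Fetch and parse the listing of one HDFS directory."""
    with urlopen(liststatus_url(path, namenode)) as resp:
        return parse_liststatus(json.load(resp))


# Example usage (requires a running NameNode):
#   for name, kind, size in list_directory("/user/hadoop"):
#       print(kind, size, name)
```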

The following screenshot shows the DataNode information for a cluster. Here you can see one node along with its configuration and capacity.

MapReduce Job Monitoring
A MapReduce application is a collection of jobs (Map job, Combiner, Partitioner, and Reduce job). It is essential to monitor and maintain the following −
- The configuration of the datanodes on which the application can run.
- The number of datanodes and the resources used per application.
Monitoring all of this requires a user interface. After starting the Hadoop framework by running “start-all.sh” from “$HADOOP_HOME/sbin”, open the URL “http://localhost:8080” in your browser (the YARN ResourceManager web UI listens on port 8088 by default, so use that port if your configuration does not change it). You should see the following screen.

In the above screenshot, the hand pointer is on the application ID. Click it to open the following screen, which shows −
- The user who is running the current application
- The application name
- The application type
- The current status and the final status
- The application start time and elapsed time (the completion time, if the application has finished at the time of monitoring)
- The history of the application, i.e., its log information
- The node information, i.e., the nodes that participated in running the application
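The fields listed above are also exposed by the ResourceManager REST API at “/ws/v1/cluster/apps”. The following is a rough sketch of reading them from a script; the function names and output keys are hypothetical, and “rm_url” should be the same host and port you use for the ResourceManager web UI.

```python
import json
from urllib.request import urlopen


def summarize_apps(payload):
    """Extract the fields shown on the application page from an apps response."""
    # The ResourceManager returns {"apps": null} when no applications exist.
    apps = (payload.get("apps") or {}).get("app", [])
    return [
        {
            "id": app["id"],
            "user": app["user"],
            "name": app["name"],
            "type": app["applicationType"],
            "state": app["state"],
            "final_status": app["finalStatus"],
            "started_ms": app["startedTime"],
            "elapsed_ms": app["elapsedTime"],
        }
        for app in apps
    ]


def fetch_apps(rm_url):
    """Query the ResourceManager at rm_url and summarize its applications."""
    with urlopen(f"{rm_url}/ws/v1/cluster/apps") as resp:
        return summarize_apps(json.load(resp))
```

The start and elapsed times are reported in milliseconds, so divide by 1000 before passing them to date or duration helpers.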
The following screenshot shows the details of a particular application −

The following screenshot shows information about the currently running nodes. Here, there is only one node, and the hand pointer is on the localhost address of that running node.
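Node information is likewise available from the ResourceManager REST API at “/ws/v1/cluster/nodes”. As a small sketch under the same assumptions as the web UI (an unsecured, locally reachable ResourceManager), the hypothetical helpers below return the host name and HTTP address of each node reported as running.

```python
import json
from urllib.request import urlopen


def running_nodes(payload):
    """Return (host name, HTTP address) pairs for nodes in the RUNNING state."""
    # The "nodes" value can be null when no node has registered yet.
    nodes = (payload.get("nodes") or {}).get("node", [])
    return [
        (node["nodeHostName"], node["nodeHTTPAddress"])
        for node in nodes
        if node["state"] == "RUNNING"
    ]


def fetch_running_nodes(rm_url):
    """Query the ResourceManager at rm_url for its running nodes."""
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes") as resp:
        return running_nodes(json.load(resp))
```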
