Hadoop Concise Tutorial

Hadoop - HDFS Operations

Starting HDFS

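Before first use, you must format the configured HDFS file system. On Hadoop 2.x and later the equivalent hdfs namenode -format is preferred, and the hadoop namenode form shown here still works but prints a deprecation warning. Log in to the namenode (the HDFS server) and execute the following command.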

$ hadoop namenode -format

After formatting HDFS, start the distributed file system. The following command starts the namenode together with the datanodes, which then operate as a cluster.

$ start-dfs.sh
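
One way to confirm that the daemons came up is to look for them in the output of jps, the JDK tool that lists running Java processes. The sketch below assumes jps is on the PATH and that the namenode and at least one datanode run on the local machine, as in a single-node setup. The function name hdfs_daemons_up is made up for this example.

```shell
# Hypothetical helper: succeed only if both a NameNode and a DataNode
# process appear in the jps listing on this machine.
hdfs_daemons_up() {
    # jps prints one "<pid> <MainClass>" pair per line; compare the second
    # field exactly so that SecondaryNameNode does not count as NameNode.
    jps | awk '
        $2 == "NameNode" { n = 1 }
        $2 == "DataNode" { d = 1 }
        END { exit !(n && d) }
    '
}

# Usage (after start-dfs.sh):
#   hdfs_daemons_up && echo "HDFS is running" || echo "HDFS is not running"
```

On a multi-node cluster the datanodes run on other hosts, so this local check would only be meaningful for the namenode there.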

Listing Files in HDFS

Once data has been loaded into HDFS, you can use ‘ls’ to list the files in a directory or to check the status of a single file. The syntax of ls is given below; pass a directory or a file name as the argument.

$ $HADOOP_HOME/bin/hadoop fs -ls <args>
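
A few common forms of the argument, as a sketch. It assumes Hadoop 2.x or later for the -R, -h and -d options. The HADOOP_BIN variable and the three function names are conveniences introduced here, not part of Hadoop, and the paths in the usage lines are only examples.

```shell
# Path of the hadoop launcher; HADOOP_BIN is a name made up for this sketch.
HADOOP_BIN="${HADOOP_BIN:-$HADOOP_HOME/bin/hadoop}"

# List a directory and everything beneath it (defaults to the HDFS root).
hdfs_ls_tree() {
    "$HADOOP_BIN" fs -ls -R "${1:-/}"
}

# List a directory with file sizes in human-readable units.
hdfs_ls_human() {
    "$HADOOP_BIN" fs -ls -h "${1:-/}"
}

# Show the status line of a directory itself rather than its contents.
hdfs_ls_dir() {
    "$HADOOP_BIN" fs -ls -d "$1"
}

# Usage:
#   hdfs_ls_tree /user
#   hdfs_ls_human /user/input
#   hdfs_ls_dir /user/input
```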

Inserting Data into HDFS

Assume the local file system contains a file named file.txt whose data should be stored in HDFS. Follow the steps given below to copy the file into the Hadoop file system.

Step 1

Create an input directory in HDFS. On Hadoop 2.x and later, add the -p option to mkdir if the parent directory /user does not exist yet.

$ $HADOOP_HOME/bin/hadoop fs -mkdir /user/input

Step 2

Use the put command to copy the data file from the local file system into the Hadoop file system.

$ $HADOOP_HOME/bin/hadoop fs -put /home/file.txt /user/input

Step 3

Verify that the file arrived using the ls command.

$ $HADOOP_HOME/bin/hadoop fs -ls /user/input
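
The three steps above can be wrapped in one function, sketched below. It assumes Hadoop 2.x or later, where mkdir accepts -p. HADOOP_BIN and hdfs_upload are names invented for this example.

```shell
# Path of the hadoop launcher; HADOOP_BIN is a name made up for this sketch.
HADOOP_BIN="${HADOOP_BIN:-$HADOOP_HOME/bin/hadoop}"

# Hypothetical helper: copy one local file into an HDFS directory.
#   $1 = local file, $2 = HDFS target directory
hdfs_upload() {
    # Fail early if the local file is missing.
    if [ ! -f "$1" ]; then
        echo "local file not found: $1" >&2
        return 1
    fi
    # Step 1: create the target directory, and its parents, if needed.
    "$HADOOP_BIN" fs -mkdir -p "$2" || return 1
    # Step 2: copy the file; put refuses to overwrite an existing file.
    "$HADOOP_BIN" fs -put "$1" "$2" || return 1
    # Step 3: list the directory to confirm the file arrived.
    "$HADOOP_BIN" fs -ls "$2"
}

# Usage:
#   hdfs_upload /home/file.txt /user/input
```

Stopping at the first failed step keeps the final listing from suggesting success when the copy did not happen.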

Retrieving Data from HDFS

Assume HDFS contains a file named outfile. The steps given below show how to retrieve it from the Hadoop file system.

Step 1

First, view the file's contents in HDFS using the cat command.

$ $HADOOP_HOME/bin/hadoop fs -cat /user/output/outfile

Step 2

Copy the data from HDFS to the local file system using the get command. The command below copies the whole /user/output directory, and outfile with it, to the local path /home/hadoop_tp/.

$ $HADOOP_HOME/bin/hadoop fs -get /user/output/ /home/hadoop_tp/
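
To fetch a single path and fail cleanly when it does not exist, the get step can be guarded with fs -test, as in the sketch below. HADOOP_BIN and hdfs_fetch are names invented for this example.

```shell
# Path of the hadoop launcher; HADOOP_BIN is a name made up for this sketch.
HADOOP_BIN="${HADOOP_BIN:-$HADOOP_HOME/bin/hadoop}"

# Hypothetical helper: copy one HDFS path into a local directory.
#   $1 = HDFS path, $2 = local directory
hdfs_fetch() {
    # fs -test -e exits with status 0 only if the HDFS path exists.
    if ! "$HADOOP_BIN" fs -test -e "$1"; then
        echo "not found in HDFS: $1" >&2
        return 1
    fi
    # Make sure the local destination directory exists before copying.
    mkdir -p "$2" || return 1
    "$HADOOP_BIN" fs -get "$1" "$2"
}

# Usage:
#   hdfs_fetch /user/output/outfile /home/hadoop_tp
```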

Shutting Down the HDFS

You can shut down HDFS using the following command.

$ stop-dfs.sh