HCatalog Concise Tutorial
HCatalog - Create Table
This chapter explains how to create a table and how to insert data into it. The conventions for creating a table in HCatalog are quite similar to those for creating a table in Hive.
Create Table Statement
CREATE TABLE is the statement used to create a table in the Hive metastore through HCatalog. Its syntax and an example are as follows:
Syntax
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format]
Example
Let us assume you need to create a table named employee using the CREATE TABLE statement. The following table lists the fields of the employee table and their data types:
Sr.No | Field Name | Data Type
1 | Eid | int
2 | Name | String
3 | Salary | Float
4 | Designation | String
The following clauses define the table comment, the row format (field terminator and line terminator), and the stored file type.
COMMENT 'Employee details'
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
The following query creates a table named employee using the above data.
./hcat -e "CREATE TABLE IF NOT EXISTS employee ( eid int, name String,
salary Float, designation String) \
COMMENT 'Employee details' \
ROW FORMAT DELIMITED \
FIELDS TERMINATED BY '\t' \
LINES TERMINATED BY '\n' \
STORED AS TEXTFILE;"
If you add the option IF NOT EXISTS, HCatalog ignores the statement when the table already exists.
When the table is created successfully, you see the following response:
OK
Time taken: 5.905 seconds
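The same statement can also be kept in a script file and passed to hcat with its -f option, which sidesteps shell quoting of the '\t' and '\n' delimiters. The following is a minimal sketch, assuming hcat is on the PATH; the HCAT and DDL_FILE variables and the file name employee.hql are illustrative names, not part of the original example.

```shell
#!/bin/sh
# Sketch: store the CREATE TABLE statement in a file and run it with hcat -f.
# HCAT and DDL_FILE are illustrative names, not part of HCatalog itself.
HCAT="${HCAT:-hcat}"
DDL_FILE="${DDL_FILE:-employee.hql}"

# The quoted heredoc delimiter ('EOF') keeps '\t' and '\n' literal,
# so Hive receives the escape sequences unchanged.
cat > "$DDL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS employee (
  eid INT,
  name STRING,
  salary FLOAT,
  designation STRING)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
EOF

# Run the script only when the hcat executable is actually available.
if command -v "$HCAT" >/dev/null 2>&1; then
  "$HCAT" -f "$DDL_FILE"
else
  echo "hcat not found; wrote $DDL_FILE without executing it" >&2
fi
```

Keeping the DDL in a file also makes it easy to rerun; because of IF NOT EXISTS, a second run leaves an existing table untouched.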
Load Data Statement
Generally, after creating a table in SQL, we insert data using the INSERT statement. In HCatalog, however, we insert data using the LOAD DATA statement.
When inserting data into HCatalog, LOAD DATA is the better choice for storing bulk records. There are two ways to load data: from the local file system or from the Hadoop file system.
Syntax
The syntax for LOAD DATA is as follows:
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
[PARTITION (partcol1=val1, partcol2=val2 ...)]
- LOCAL is the keyword that specifies a path on the local file system. It is optional.
- OVERWRITE is optional; it replaces the data already in the table.
- PARTITION is optional.
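To make the optional clauses concrete, the following sketch assembles a LOAD DATA statement with and without LOCAL and OVERWRITE. The function build_load_stmt is a hypothetical helper written for this illustration, not an HCatalog command, and the path /user/hive/sample.txt is an assumed location in the Hadoop file system. The script only prints the statements; it does not run them.

```shell
#!/bin/sh
# Sketch: assemble a LOAD DATA statement from the optional clauses.
# build_load_stmt is a hypothetical helper, not an HCatalog command.
# Usage: build_load_stmt <filepath> <table> [local] [overwrite]
build_load_stmt() {
  filepath="$1"
  table="$2"
  shift 2
  local_kw=""
  overwrite_kw=""
  for opt in "$@"; do
    case "$opt" in
      local)     local_kw="LOCAL " ;;
      overwrite) overwrite_kw="OVERWRITE " ;;
    esac
  done
  printf "LOAD DATA %sINPATH '%s' %sINTO TABLE %s;\n" \
    "$local_kw" "$filepath" "$overwrite_kw" "$table"
}

# File on the local file system, replacing existing table data.
build_load_stmt /home/user/sample.txt employee local overwrite
# File already in the Hadoop file system (assumed path), added to existing data.
build_load_stmt /user/hive/sample.txt employee
```

Without OVERWRITE, the loaded file is added to whatever the table already holds, so running the same load twice would duplicate the records.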
Example
We will insert the following data into the table. It is a text file named sample.txt in the /home/user directory. The fields on each line are separated by tabs, matching the field terminator declared for the table.
1201 Gopal 45000 Technical manager
1202 Manisha 45000 Proof reader
1203 Masthanvali 40000 Technical writer
1204 Kiran 40000 Hr Admin
1205 Kranthi 30000 Op Admin
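Because the employee table declares a tab as its field terminator, the fields in sample.txt need real tab characters between them; spaces would leave each line as a single field. One way to produce such a file is sketched below. SAMPLE_FILE is an illustrative variable that defaults to the current directory; set it to /home/user/sample.txt to match the example.

```shell
#!/bin/sh
# Sketch: write the sample records with real tab separators.
# SAMPLE_FILE is an illustrative variable; the tutorial uses /home/user/sample.txt.
SAMPLE_FILE="${SAMPLE_FILE:-sample.txt}"

# printf reuses the format for each group of four arguments,
# emitting one tab-separated line per record.
printf '%s\t%s\t%s\t%s\n' \
  1201 Gopal       45000 'Technical manager' \
  1202 Manisha     45000 'Proof reader' \
  1203 Masthanvali 40000 'Technical writer' \
  1204 Kiran       40000 'Hr Admin' \
  1205 Kranthi     30000 'Op Admin' > "$SAMPLE_FILE"

# Check that every line has exactly four tab-separated fields.
awk -F '\t' 'NF != 4 { bad = 1 } END { exit bad }' "$SAMPLE_FILE" \
  && echo "all records have 4 fields"
```

Quoting the two-word designations keeps each one in a single field, since the space inside them is not a separator.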
The following query loads the given text file into the table.
./hcat -e "LOAD DATA LOCAL INPATH '/home/user/sample.txt'
OVERWRITE INTO TABLE employee;"
When the load succeeds, you see the following response:
OK
Time taken: 15.905 seconds