Apache Presto: A Concise Tutorial

Apache Presto - HIVE Connector

The Hive connector allows querying data stored in a Hive data warehouse.

Prerequisites

  1. Hadoop

  2. Hive

This chapter assumes that Hadoop and Hive are already installed on your machine. Start all of their services one by one in a new terminal. Then start the Hive metastore using the following command:

hive --service metastore

Presto uses the Hive metastore service to get the details of Hive tables.

Configuration Settings

Create a file named “hive.properties” under the “etc/catalog” directory of your Presto installation using the following commands, then add the two configuration lines shown after them.

$ cd etc
$ cd catalog
$ vi hive.properties

connector.name = hive-cdh4
hive.metastore.uri = thrift://localhost:9083

After making the changes, save the file and exit the editor.
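
The same catalog file can also be written without opening an editor. The following is a minimal sketch, assuming a hypothetical PRESTO_HOME variable that points at the Presto installation directory (it falls back to the current directory here). The connector name “hive-cdh4” is taken from the settings above; other Presto releases or Hadoop distributions may expect a different connector name.

```shell
# Assumption: PRESTO_HOME is a variable introduced for this sketch.
# It falls back to the current directory when it is not set.
PRESTO_HOME="${PRESTO_HOME:-.}"

# Create the catalog directory if it does not exist yet.
mkdir -p "$PRESTO_HOME/etc/catalog"

# Write the two settings from this section into hive.properties.
cat > "$PRESTO_HOME/etc/catalog/hive.properties" <<'EOF'
connector.name=hive-cdh4
hive.metastore.uri=thrift://localhost:9083
EOF

# Print the file to confirm its contents.
cat "$PRESTO_HOME/etc/catalog/hive.properties"
```

Presto reads catalog files at startup, so restart the server after creating or changing this file.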

Create Database

Create a database in Hive using the following query:

Query

hive> CREATE SCHEMA tutorials;

After the database is created, you can verify it using the “show databases” command.

Create Table

CREATE TABLE is the statement used to create a table in Hive. For example, use the following queries to switch to the “tutorials” database and create the “author” table in it.

hive> use tutorials;
hive> create table author(auth_id int, auth_name varchar(50),
topic varchar(100)) STORED AS SEQUENCEFILE;

Insert Table

The following query inserts records into the Hive table.

hive> insert into table author values (1,'Doug Cutting','Hadoop'),
(2,'James Gosling','java'),(3,'Dennis Ritchie','C');
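
The Hive steps above can also be collected into one script and run non-interactively. The following is a minimal sketch, assuming the “hive” command is on the PATH and that the installed Hive version supports INSERT ... VALUES. The file name “author_setup.hql” is a hypothetical choice made for this example.

```shell
# Write the HiveQL statements from the steps above into a script file.
# The file name author_setup.hql is an assumption made for this sketch.
cat > author_setup.hql <<'EOF'
CREATE SCHEMA IF NOT EXISTS tutorials;
USE tutorials;
CREATE TABLE IF NOT EXISTS author (
  auth_id INT,
  auth_name VARCHAR(50),
  topic VARCHAR(100)
) STORED AS SEQUENCEFILE;
INSERT INTO TABLE author VALUES
  (1, 'Doug Cutting', 'Hadoop'),
  (2, 'James Gosling', 'java'),
  (3, 'Dennis Ritchie', 'C');
EOF

# Run the script only when the Hive CLI is actually installed.
if command -v hive >/dev/null 2>&1; then
  hive -f author_setup.hql
else
  echo "hive not found on PATH; script written to author_setup.hql"
fi
```

The IF NOT EXISTS clauses make the schema and table creation safe to repeat, but the INSERT statement is not guarded, so running the script twice inserts the three rows twice.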

Start Presto CLI

You can start the Presto CLI and connect to the Hive catalog using the following command.

$ ./presto --server localhost:8080 --catalog hive --schema tutorials

You will receive the following response.

presto:tutorials >

List Schemas

To list all the schemas in the Hive connector, run the following query.

Query

presto:tutorials > show schemas from hive;

Result

default

tutorials

List Tables

To list all the tables in the “tutorials” schema, use the following query.

Query

presto:tutorials > show tables from hive.tutorials;

Result

author

Fetch Table

The following query fetches all the records from the Hive table.

Query

presto:tutorials > select * from hive.tutorials.author;

Result

auth_id  |   auth_name    | topic
---------+----------------+--------
       1 | Doug Cutting   | Hadoop
       2 | James Gosling  | java
       3 | Dennis Ritchie | C