HCatalog Tutorial

HCatalog is a table and storage management layer for Hadoop that exposes the tabular data in the Hive metastore to other Hadoop applications. It lets users of different data processing tools, such as Pig and MapReduce, read and write data in tables more easily. With HCatalog, users do not have to worry about where their data is stored or in what format. This short tutorial covers the basics of HCatalog and how to use it.

Audience

This tutorial is intended for professionals who want to build a career in Big Data analytics using the Hadoop framework. ETL developers and analytics professionals in general can also make good use of it.

Prerequisites

Before proceeding with this tutorial, you need a basic knowledge of Core Java, SQL database concepts, the Hadoop file system, and any flavor of the Linux operating system.