Snowflake 简明教程

Snowflake - Load Data From Files

在数据库中,创建模式,它是表的逻辑分组。表包含列。表和列是低级别的,并且是数据库中最重要的对象。现在,表和列中最重要的功能是存储数据。

In a database, Schemas are created which are logical grouping of tables. Tables contain columns. Tables and columns are low-level and the most important objects of a database. Now, the most important function of table & columns is storing the data.

在本节中,我们将讨论如何在 Snowflake 中将数据存储到表和列中。

In this chapter, we will discuss about how to store data into table and columns in Snowflake.

Snowflake 通过用户界面和 SQL 查询为用户提供了两种将数据存储到表和相应列中的方法。

Snowflake provides the user two ways to store data into a table and corresponding columns using user interface and SQL query.

Load Data into Tables and Columns using Snowflake’s UI

在本章节中,我们将讨论应该遵循的步骤,以使用 CSV、JSON、XML、Avro、ORC、Parquet 等文件将数据加载到表及其相应列中。

In this section, we will discuss the steps that should be followed to load data into a table and its corresponding columns using a file like CSV, JSON, XML, Avro, ORC, Parquet.

此方法仅限于加载少于 50 MB 的少量数据。

This approach is limited to load a small amount of data up to 50 MB.

以任何格式创建示例文件。在创建文件时,请确保文件中的列数与表中的列数相匹配,否则在加载数据时操作将失败。

Create a sample file in any of the format. While creating the file, make sure the number of columns in the file and the table should match, otherwise the operation will fail while loading the data.

在 TEST_DB.TEST_SCHEMA.TEST_TABLE 中,共有三列:ID、NAME 和 ADDRESS。

In TEST_DB.TEST_SCHEMA.TEST_TABLE, there are three columns: ID, NAME, and ADDRESS.

以下示例数据已在“data.csv”中创建。

The following sample data is created in "data.csv" −

ID

NAME

ADDRESS

1

aa

abcd

2

ab

abcd

3

aa

abcd

4

ab

abcd

5

aa

abcd

6

ab

abcd

7

aa

abcd

8

ab

abcd

9

aa

abcd

现在,单击位于顶部功能区的图标。单击要上传数据的表名。它显示列数和定义。

Now, click the DATABSES icon present at the top ribbon. Click the table name where you want to upload data. It shows the number of columns and definition.

以下屏幕截图显示“加载数据”功能。

The following screenshot shows the Load Data functionality −

load data funtionality

针对列重新验证示例文件。单击列名称顶部的按钮。它会弹出“加载数据”对话框。在第一个屏幕上,选择仓库名称并单击按钮。

Reverify the sample file with respect to columns. Click the Load Table button at the top of column names. It pops-up Load Data dialog box. At the first screen, select Warehouse Name and click the Next button.

load data pop up

在下一个屏幕上,单击,从本地计算机中选择一个文件。上传文件后,您可以看到该文件名,如以下屏幕截图所示。单击按钮。

On the next screen, select a file from your local computer by clicking Select Files. Once the file is uploaded, you can see the file name as shown in following screenshot. Click the Next button.

select files

现在通过单击,创建文件格式,如以下屏幕截图所示。

Now create the File Format by clicking + sign as shown in the following screenshot −

create file format

它会弹出对话框。输入以下详细信息。

It pops-up the Create File Format dialog box. Enter the following details −

  1. Name − Name of file format.

  2. Schema Name − The create file format can be utilized in the given schema only.

  3. Format Type − Name of file format.

  4. Column separator − If CSV file is separated, provide file delimiter.

  5. Row separator − How to identify a new line.

  6. Header lines to skip − If header is provided, then 1 else 0.

其他内容可以保持不变。在输入详细信息后单击 Finish 按钮。

Other things can be left as it is. Click the Finish button after entering details.

以下屏幕截图显示了上述详细信息 -

The following screenshot displays the above details −

create file format details

从下拉列表中选择文件格式,然后单击 Load ,如下面的屏幕截图所示 -

Select the File Format from the dropdown and click Load as shown in the following screenshot −

click load

加载结果后,您将获得摘要,如下所示。单击 OK 按钮。

After loading the results, you will get the summary, as shown below. Click the OK button.

click ok

要查看数据,请运行“SELECT * from TEST_TABLE”查询。在左面板中,用户还可以查看数据库、架构和表详细信息。

To view the data, run the query "SELECT * from TEST_TABLE". In the left panel also, user can see DB, Schema and table details.

run the query

Load Data into Tables and Columns using SQL

要从本地文件加载数据,可以执行以下步骤 -

To load data from a local file, you can take the following steps −

使用 SnowSQL(Snowflake 提供的插件)将文件上载到 Snowflake 的暂存区域。要执行此操作,请转到 help ,然后单击“下载”,如下所示 -

Upload a file into Snowflake’s stage using SnowSQL, a plugin provided by Snowflake. To perform it, go to help and click on Download as shown below −

go to help and click download

单击 CLI 客户端 (snowsql),然后单击 Snowflake Repository ,如下面的屏幕截图所示 -

Click the CLI Client (snowsql) and click the Snowflake Repository as shown in the following screenshot −

click cli client

用户可以转到 bootstrap → 1.2 → windows_x86_64 → 单击下载最新版本。

User can move to bootstrap → 1.2 → windows_x86_64 → click to download latest version.

以下屏幕截图显示了上述步骤 -

The following screenshot displays the above step −

snowflake repository

现在,安装已下载的插件。安装完成后,在您的系统中打开 CMD。运行以下命令以检查连接:

Now, install the downloaded plugin. After installation, open CMD in your system. Run the following command to check connection −

snowsql -a <account_name> -u <username>

它会要求提供密码。输入您的 snowflake 密码,然后按 ENTER。您将看到连接成功。现在使用命令行 -

It will ask for password. Enter your snowflake password and press ENTER. You will see successful connection. Now use the command line −

<username>#<warehouse_name>@<db_name>.<schema_name>

现在使用以下命令将文件上传到 Snowflake 阶段:

Now use the following command to upload the file into snowflake' stage −

PUT file://C:/Users/*******/Documents/data.csv @csvstage;

别忘了在末尾加上“分号”符号,否则它将永远运行。

Don’t forget to put the "semicolon" sign at end, else it will run forever.

文件上载后,用户可以在工作表中运行以下命令 -

Once the file got uploaded, user can run the following command into Worksheet −

COPY INTO "TEST_DB"."TEST_SCHEMA_1"."TEST_TABLE" FROM @/csvstage ON_ERROR = 'ABORT_STATEMENT' PURGE = TRUE

数据将加载到表中。

Date will be loaded into the table.