AWS QuickSight Concise Tutorial

AWS QuickSight - Using Data Sources

AWS QuickSight accepts data from a variety of sources. When you click "New Dataset" on the home page, it lists every data source you can use.

The screen lists all supported internal and external data sources:

[Screenshot: data source options]

Let us go through connecting QuickSight to some of the most commonly used data sources:

Uploading a file from system

This option accepts only .csv, .tsv, .clf, .elf, .xlsx, and JSON files. When you click the Upload a File button, you provide the location of the file you want to use to create the dataset. Once you select the file, QuickSight automatically recognizes its format and displays the data.
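As a rough illustration, a file can be checked against the listed formats before you try to upload it. This is only a sketch: the helper name `is_supported_upload` is hypothetical, and it assumes JSON files carry a `.json` extension. QuickSight itself inspects the file contents, which this check does not do.

```python
from pathlib import Path

# Extensions taken from the list of accepted upload formats.
# Assumption: JSON files end in ".json".
SUPPORTED_EXTENSIONS = {".csv", ".tsv", ".clf", ".elf", ".xlsx", ".json"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the file extension is one of the listed upload formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["sales.csv", "report.XLSX", "notes.txt"]:
        print(name, is_supported_upload(name))
```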

Using a file from S3

On the S3 data source screen, enter under Data source name the name to display for the dataset that will be created. You also need to either upload a manifest file from your local system or provide the S3 location of the manifest file.

[Screenshot: data source name]

The manifest file is a JSON file that specifies the URLs (locations) of the input files and their format. You can list more than one input file, provided they all share the same format. Here is an example of a manifest file. The "URIs" parameter passes the S3 locations of the input files.

{
   "fileLocations": [
      {
         "URIs": [
            "url of first file",
            "url of second file",
            "url of 3rd file and so on"
         ]
      }
   ],
   "globalUploadSettings": {
      "format": "CSV",
      "delimiter": ",",
      "textqualifier": "'",
      "containsHeader": "true"
   }
}

The values passed in globalUploadSettings in this example are typical settings. You can change them to match your files.
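A manifest like this can also be generated rather than typed by hand, which avoids JSON syntax slips such as stray commas or misplaced braces. The following is a minimal sketch using only the Python standard library; the function name `build_manifest` and the `s3://example-bucket/...` paths are hypothetical placeholders, not values from QuickSight.

```python
import json


def build_manifest(uris, fmt="CSV", delimiter=",", text_qualifier="'",
                   contains_header=True):
    """Build a QuickSight-style S3 manifest as a JSON string."""
    if not uris:
        raise ValueError("at least one S3 URI is required")
    manifest = {
        "fileLocations": [{"URIs": list(uris)}],
        "globalUploadSettings": {
            "format": fmt,
            "delimiter": delimiter,
            "textqualifier": text_qualifier,
            # The sample manifest stores this flag as a string, so do the same.
            "containsHeader": "true" if contains_header else "false",
        },
    }
    return json.dumps(manifest, indent=3)


if __name__ == "__main__":
    # Hypothetical bucket and object names, for illustration only.
    print(build_manifest([
        "s3://example-bucket/sales/2023.csv",
        "s3://example-bucket/sales/2024.csv",
    ]))
```

Because the output comes from `json.dumps`, it is always well-formed JSON; whether the settings suit your files still has to be checked against the data.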

MySQL

Enter your database details in the fields provided to connect to your database. Once QuickSight is connected, you can import data from it.

[Screenshot: new SQL data source]

The following information is required when you connect to any RDBMS database:

  1. DSN name

  2. Type of connection

  3. Database server name

  4. Port

  5. Database name

  6. User name

  7. Password
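The same connection details can be supplied through the API instead of the console. The sketch below only assembles the arguments for boto3's QuickSight `create_data_source` call and does not contact AWS. Treating the DSN name as the `Name` field is an assumption, and the helper name, account ID, host, and credentials are hypothetical placeholders.

```python
def mysql_data_source_request(account_id, data_source_id, name, host, port,
                              database, username, password):
    """Assemble keyword arguments for a QuickSight MySQL data source."""
    return {
        "AwsAccountId": account_id,
        "DataSourceId": data_source_id,
        "Name": name,                      # assumed to play the role of the DSN name
        "Type": "MYSQL",                   # type of connection
        "DataSourceParameters": {
            "MySqlParameters": {
                "Host": host,              # database server name
                "Port": int(port),
                "Database": database,
            }
        },
        "Credentials": {
            "CredentialPair": {"Username": username, "Password": password}
        },
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    request = mysql_data_source_request(
        "111122223333", "example-mysql", "Example MySQL",
        "db.example.com", 3306, "sales", "report_user", "change-me",
    )
    # To send it (requires boto3 and AWS credentials):
    #   boto3.client("quicksight").create_data_source(**request)
    print(request["Type"], request["DataSourceParameters"])
```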

QuickSight supports the following data sources, most of which are RDBMS-based:

  1. Amazon Athena

  2. Amazon Aurora

  3. Amazon Redshift

  4. Amazon Redshift Spectrum

  5. Amazon S3

  6. Amazon S3 Analytics

  7. Apache Spark 2.0 or later

  8. MariaDB 10.0 or later

  9. Microsoft SQL Server 2012 or later

  10. MySQL 5.1 or later

  11. PostgreSQL 9.3.1 or later

  12. Presto 0.167 or later

  13. Snowflake

  14. Teradata 14.0 or later

Athena

Athena is the AWS service for running queries on tables. You can choose any table from Athena, or run a custom query on those tables, and use the output in QuickSight. Choosing the data source takes a couple of steps.

When you choose Athena, a screen asks for a data source name. Enter any name you want to give the data source in QuickSight, then click "Validate Connection". Once the connection is validated, click the "Create new source" button.

[Screenshot: Athena data source]

Now choose the table from the dropdowns. The first dropdown shows the databases present in Athena; selecting a database shows the tables in it. Alternatively, click "Use custom SQL" to run a query on the Athena tables.

[Screenshot: select table]
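For reference, the "Use custom SQL" choice corresponds to a `CustomSql` physical table in the QuickSight dataset API. The sketch below builds one such entry as a plain dictionary, in the shape boto3's `create_data_set` expects inside `PhysicalTableMap`. The helper name, ARN, query, and column names are hypothetical, and nothing here calls AWS.

```python
def custom_sql_table(data_source_arn, name, sql_query, columns):
    """Build a CustomSql physical-table entry for a QuickSight dataset.

    `columns` is a list of (column_name, column_type) pairs describing
    the query output, for example ("region", "STRING").
    """
    return {
        "CustomSql": {
            "DataSourceArn": data_source_arn,
            "Name": name,
            "SqlQuery": sql_query,
            "Columns": [
                {"Name": col_name, "Type": col_type}
                for col_name, col_type in columns
            ],
        }
    }


if __name__ == "__main__":
    # Placeholder ARN, table, and columns for illustration only.
    table = custom_sql_table(
        "arn:aws:quicksight:us-east-1:111122223333:datasource/example-athena",
        "sales_by_region",
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region",
        [("region", "STRING"), ("total", "DECIMAL")],
    )
    print(table)
```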

Once done, click "Edit/Preview data" to edit your data, or "Visualize" to visualize it directly.

[Screenshot: finish data set creation]

Deleting a data source

Deleting a data source that is in use by a QuickSight dashboard can make the associated datasets unusable. This usually happens with datasets that query a SQL-based data source.

When a dataset is based on S3, Salesforce, or SPICE, deleting the data source does not affect your ability to use the dataset, because the data is stored in SPICE. However, the refresh option is no longer available in this case.
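Before deleting a data source, it can help to list the datasets that still reference it. The sketch below is one way to do that over dataset descriptions you have already fetched. It assumes each description is a dictionary shaped like the `DataSet` element returned by boto3's `describe_data_set`, where each physical table carries a `DataSourceArn`. The function name and sample data are hypothetical.

```python
def datasets_using_source(datasets, data_source_arn):
    """Return the names of datasets whose physical tables use the given source."""
    dependent = []
    for dataset in datasets:
        tables = dataset.get("PhysicalTableMap", {}).values()
        for table in tables:
            # Each table is a one-key dict such as {"RelationalTable": {...}},
            # {"CustomSql": {...}}, or {"S3Source": {...}}.
            if any(spec.get("DataSourceArn") == data_source_arn
                   for spec in table.values()):
                dependent.append(dataset["Name"])
                break  # one match is enough for this dataset
    return dependent


if __name__ == "__main__":
    # Hand-written sample descriptions, not real API output.
    sample = [
        {"Name": "orders", "PhysicalTableMap": {
            "t1": {"RelationalTable": {"DataSourceArn": "arn:example:mysql"}}}},
        {"Name": "clicks", "PhysicalTableMap": {
            "t1": {"S3Source": {"DataSourceArn": "arn:example:s3"}}}},
    ]
    print(datasets_using_source(sample, "arn:example:mysql"))
```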

To delete a data source, navigate to the From Existing Data Source tab on the dataset creation page and select the data source.

[Screenshot: delete data source]

Before deleting, you can also check the estimated table size and other details of the data source.

[Screenshot: data source details]