Splunk Concise Tutorial

Splunk - Pivot and Datasets

Splunk can ingest different types of data sources and build tables from them that resemble relational tables. These are called table datasets, or simply tables. They provide an easy way to analyse and filter data, lookups, and so on. Table datasets are also used to create pivot analyses, which we cover in this chapter.

Creating a Dataset

We use a Splunk add-on named Splunk Datasets Add-on to create and manage datasets. It can be downloaded from the Splunk website at https://splunkbase.splunk.com/app/3245/. It has to be installed by following the instructions given in the Details tab on that page. After a successful installation, we see a button named Create New Table Dataset.

Selecting a Dataset

Next, we click the Create New Table Dataset button, which lets us choose from the following three options.

Indexes and Source Types − Choose an existing index or source type that has already been added to Splunk through the Add Data app.
Existing Datasets − Start from a dataset you created previously and modify it by creating a new dataset from it.
Search − Write a search query and use its result to create a new dataset.

In our example, we choose an index as the source of the dataset.

Choosing Dataset Fields

After we click OK on that screen, we are asked to choose the fields we want to include in the table dataset. The _time field is selected by default and cannot be dropped. We choose the fields bytes, categoryID, clientIP and files.

After we click Done on that screen, we get the final dataset table with all the selected fields. At this point the dataset resembles a relational table. We save the dataset using the Save As option in the top right corner.
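The field-selection step can be pictured as projecting each event onto the chosen columns while always keeping _time. The following is a minimal Python sketch of that idea only; it is not Splunk's implementation, and the sample events, their values, and the build_table_dataset helper are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical raw events; the field names follow the ones chosen in the text,
# plus an extra "status" field that is not selected.
RAW_EVENTS: List[Dict[str, Any]] = [
    {"_time": "2024-01-01T10:00:00", "bytes": 2048, "categoryID": "STRATEGY",
     "clientIP": "10.0.0.1", "files": "cart.do", "status": 200},
    {"_time": "2024-01-01T10:00:05", "bytes": 512, "categoryID": "ARCADE",
     "clientIP": "10.0.0.2", "files": "product.screen", "status": 404},
]


def build_table_dataset(events: List[Dict[str, Any]],
                        selected_fields: List[str]) -> List[Dict[str, Any]]:
    """Keep only the selected fields of each event; _time is always retained."""
    # _time comes first and cannot be dropped, mirroring the behaviour described above.
    columns = ["_time"] + [f for f in selected_fields if f != "_time"]
    # Missing fields become None so every row has the same columns.
    return [{col: event.get(col) for col in columns} for event in events]


table = build_table_dataset(RAW_EVENTS, ["bytes", "categoryID", "clientIP", "files"])
for row in table:
    print(row)
```

Each resulting row has the same fixed set of columns, which is what makes the dataset behave like a relational table.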

Creating Pivot

We use the dataset created above to build a pivot report. A pivot report aggregates the values of one column with respect to the values of another column. In other words, the values of one column become rows and the values of another column become columns.

Choose Dataset Action

To achieve this, we first select the dataset in the Datasets tab and then choose the Visualize with Pivot option from the Actions column for that dataset.

Choose the Pivot Fields

Next, we choose the appropriate fields for creating the pivot table. We choose categoryID in the Split Columns option, because the values of this field should appear as separate columns in the report. Then we choose file in the Split Rows option, because the values of this field should appear as rows. The result shows, for each value of the file field, the count of events for each categoryID value.
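To make the split-rows and split-columns idea concrete, here is a small Python sketch that counts events per (file, categoryID) pair and lays the counts out as a grid. It only illustrates the aggregation; it does not reflect how Splunk computes pivots, and the sample rows and the pivot_count helper are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical rows from a table dataset; only the two pivot fields are shown.
ROWS: List[Dict[str, str]] = [
    {"file": "cart.do", "categoryID": "STRATEGY"},
    {"file": "cart.do", "categoryID": "STRATEGY"},
    {"file": "cart.do", "categoryID": "ARCADE"},
    {"file": "product.screen", "categoryID": "ARCADE"},
    {"file": "product.screen", "categoryID": "SPORTS"},
]


def pivot_count(rows: List[Dict[str, str]], row_field: str,
                column_field: str) -> Dict[str, Dict[str, int]]:
    """Count rows for every (row value, column value) combination."""
    counts = Counter((r[row_field], r[column_field]) for r in rows)
    row_values = sorted({r[row_field] for r in rows})
    column_values = sorted({r[column_field] for r in rows})
    # Counter returns 0 for combinations that never occur, so the grid is complete.
    return {rv: {cv: counts[(rv, cv)] for cv in column_values} for rv in row_values}


# Split Rows = file, Split Columns = categoryID.
pivot = pivot_count(ROWS, row_field="file", column_field="categoryID")
for file_value, per_category in pivot.items():
    print(file_value, per_category)
```

Swapping the two field arguments transposes the report, which corresponds to exchanging the Split Rows and Split Columns choices.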

Finally, we can save the pivot table as a report or as a panel in an existing dashboard for future reference.