Splunk - Quick Guide

Splunk - Overview

Splunk is a software platform which processes machine data and other forms of big data and brings out insights from them. This machine data is generated by the CPUs running web servers, by IoT devices, by logs from mobile apps, etc. This data does not need to be provided to end users and has no business meaning on its own. However, it is extremely important for understanding, monitoring and optimizing the performance of the machines.

Splunk can read this unstructured, semi-structured or rarely structured data. After reading the data, it allows the user to search, tag, and create reports and dashboards on the data. With the advent of big data, Splunk is now able to ingest big data from various sources, which may or may not be machine data, and run analytics on it.

So, from a simple tool for log analysis, Splunk has come a long way to become a general analytical tool for unstructured machine data and various forms of big data.

Product Categories

Splunk is available in three different product categories as follows −

  1. Splunk Enterprise − It is used by companies which have a large IT infrastructure and an IT-driven business. It helps in gathering and analysing the data from websites, applications, devices, sensors, etc.

  2. Splunk Cloud − It is the cloud-hosted platform with the same features as the enterprise version. It can be obtained from Splunk itself or through the AWS cloud platform.

  3. Splunk Light − It allows searching, reporting and alerting on all the log data in real time from one place. It has limited functionalities and features as compared to the other two versions.

Splunk Features

In this section, we shall discuss the important features of the enterprise edition −

Data Ingestion

Splunk can ingest a variety of data formats like JSON, XML and unstructured machine data like web and application logs. The unstructured data can be modeled into a data structure as needed by the user.

Data Indexing

The ingested data is indexed by Splunk for faster searching and querying on different conditions.

Data Searching

Searching in Splunk involves using the indexed data for the purpose of creating metrics, predicting future trends and identifying patterns in the data.

Using Alerts

Splunk alerts can be used to trigger emails or RSS feeds when some specific criteria are found in the data being analyzed.

Dashboards

Splunk Dashboards can show the search results in the form of charts, reports and pivots, etc.

Data Model

The indexed data can be modelled into one or more data sets based on specialized domain knowledge. This makes navigation easier for the end users, who can analyze their business cases without learning the technicalities of the search processing language used by Splunk.

Splunk - Environment

In this tutorial, we aim to install the enterprise version. This version is available for a free 60-day evaluation with all features enabled. The setup can be downloaded from the Splunk website for both the Windows and Linux platforms.

Linux Version

The Linux version is downloaded from the download link given above. We choose the .deb package type as the installation will be done on an Ubuntu platform.

We shall learn this with a step by step approach −

Step 1

Download the .deb package as shown in the screenshot below −

linux install 1

Step 2

Go to the download directory and install Splunk using the above downloaded package.

linux install 2

Step 3

Next, you can start Splunk by using the following command with the accept-license argument. It will ask for an administrator user name and password, which you should provide and remember.

linux install 3
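The install and start commands from Steps 2 and 3 can be sketched as the following shell session; the package file name is illustrative and depends on the version you downloaded, and /opt/splunk is the default install location:

```shell
# Step 2: install the downloaded .deb package (file name is an example)
sudo dpkg -i splunk-<version>-linux-2.6-amd64.deb

# Step 3: start Splunk, accepting the license; you will be
# prompted to create the administrator user name and password
sudo /opt/splunk/bin/splunk start --accept-license
```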

Step 4

The Splunk server starts and mentions the URL where the Splunk interface can be accessed.

linux install 4

Step 5

Now, you can access the Splunk URL and enter the admin user ID and password created in step 3.

linux install 5

Windows Version

The Windows version is available as an MSI installer as shown in the below image −

install 1

Double-clicking on the MSI installer installs the Windows version in a straightforward process. The two important steps where we must make the right choice for a successful installation are as follows.

Step 1

As we are installing it on a local system, choose the local system option as given below −

install2

Step 2

Enter the password for the administrator and remember it, as it will be used in the future configurations.

install3

Step 3

In the final step, we see that Splunk is successfully installed and it can be launched from the web browser.

install4

Step 4

Next, open the browser, enter the given URL, http://localhost:8000, and log in to Splunk using the admin user ID and password.

install5

Splunk - Interface

The Splunk web interface consists of all the tools you need to search, report and analyse the data that is ingested. The same web interface provides features for administering the users and their roles. It also provides links for data ingestion and the in-built apps available in Splunk.

The below picture shows the initial screen after you log in to Splunk with the admin credentials.

interface 1

The Administrator drop-down gives the option to set and edit the details of the administrator. We can reset the admin email ID and password using the below screen −

interface 2

From the administrator link, we can also navigate to the Preferences option, where we can set the time zone and the home application on which the landing page will open after login. Currently, it opens on the Home page as shown below −

interface 3

This is a link which shows all the core features available in Splunk. For example, you can add the lookup files and lookup definitions by choosing the lookup link.

We will discuss the important settings of these links in the subsequent chapters.

interface 4

The Search & Reporting link takes us to a page where we can find the data sets available for searching, along with the reports and alerts created for those searches. It is clearly shown in the below screenshot −

interface 5

Splunk - Data Ingestion

Data ingestion in Splunk happens through the Add Data feature which is part of the search and reporting app. After logging in, the Splunk interface home screen shows the Add Data icon as shown below.

ingestion 1

On clicking this button, we are presented with the screen to select the source and format of the data we plan to push to Splunk for analysis.

Gathering The Data

We can get the data for analysis from the official website of Splunk. Save this file and unzip it in your local drive. On opening the folder, you can find three files which have different formats. They are the log data generated by some web apps. We can also gather another set of data provided by Splunk, which is available from the official Splunk webpage.

We will use data from both these sets for understanding the working of various features of Splunk.

Uploading data

Next, we choose the file secure.log from the folder mailsv, which we have kept in our local system as mentioned in the previous paragraph. After selecting the file, we move to the next step using the green Next button in the top right corner.

ingestion 2

Selecting Source Type

Splunk has an in-built feature to detect the type of the data being ingested. It also gives the user an option to choose a data type different from the one chosen by Splunk. On clicking the source type drop-down, we can see the various data types that Splunk can ingest and enable for searching.

In the current example given below, we choose the default source type.

ingestion 3

Input Settings

In this step of data ingestion, we configure the host name from which the data is being ingested. Following are the options to choose from, for the host name −

Constant value

It is the complete host name where the source data resides.

regex on path

Choose this when you want to extract the host name with a regular expression. Enter the regex for the host you want to extract in the Regular expression field.

segment in path

When you want to extract the host name from a segment in your data source’s path, enter the segment number in the Segment number field. For example, if the path to the source is /var/log/ and you want the third segment (the host server name) to be the host value, enter "3".

Next, we choose the index type to be created on the input data for searching. We choose the default index strategy. The summary index only creates a summary of the data through aggregation and creates an index on it, while the history index is for storing the search history. It is clearly depicted in the image below −

ingestion 4

Review Settings

After clicking on the next button, we see a summary of the settings we have chosen. We review it and choose Next to finish the uploading of data.

ingestion 5

On finishing the load, the below screen appears which shows the successful data ingestion and further possible actions we can take on the data.

ingestion 6

Splunk - Source Types

All the incoming data to Splunk is first judged by its inbuilt data processing unit and classified into certain data types and categories. For example, if it is a log from an Apache web server, Splunk is able to recognize that and create appropriate fields out of the data read.

This feature in Splunk is called source type detection and it uses its built-in source types that are known as "pretrained" source types to achieve this.

This makes analysis easier, as the user does not have to manually classify the data or assign data types to the fields of the incoming data.

Supported Source Types

The supported source types in Splunk can be seen by uploading a file through the Add Data feature and then selecting the dropdown for Source Type. In the below image, we have uploaded a CSV file and then checked for all the available options.

image::https://www.iokays.com/tutorialspoint/splunk/_images/source_type_1.jpg [Source Type1]

Source Type Sub-Category

Within those categories, we can click further to see all the sub-categories that are supported. So, when you choose the database category, you can find the different types of databases and their supported files which Splunk can recognize.

image::https://www.iokays.com/tutorialspoint/splunk/_images/source_type_2.jpg [Source Type2]

Pre-Trained Source Types

The below table lists some of the important pre-trained source types Splunk recognizes −

Source Type Name − Nature

access_combined − NCSA combined format HTTP web server logs (can be generated by Apache or other web servers)

access_combined_wcookie − NCSA combined format HTTP web server logs (can be generated by Apache or other web servers), with a cookie field added at the end

apache_error − Standard Apache web server error log

linux_messages_syslog − Standard Linux syslog (/var/log/messages on most platforms)

log4j − Log4j standard output produced by any J2EE server using log4j

mysqld_error − Standard MySQL error log

Splunk - Basic Search

Splunk has a robust search functionality which enables you to search the entire data set that is ingested. This feature is accessed through the app named Search & Reporting, which can be seen in the left side bar after logging in to the web interface.

basic search 1

On clicking the Search & Reporting app, we are presented with a search box, where we can start our search on the log data that we uploaded in the previous chapter.

We type the host name in the format shown below and click on the search icon present in the rightmost corner. This gives us the result with the search term highlighted.

basic search 2

Combining Search Terms

We can combine the terms used for searching by writing them one after another, putting user search strings in double quotes.

basic search 3
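As a sketch (the host value and the quoted phrase are illustrative, not taken from the tutorial's screenshots), a combined search looks like this:

```
host="mailsecure_log" "failed password"
```

The quoted string is treated as an exact phrase, while unquoted terms are matched individually.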

Using Wild Card

We can use wildcards in our search option, combined with the AND/OR operators. In the below search, we get the result where the log file has terms containing fail, failed, failure, etc., along with the term password in the same line.

basic search 4
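A search of this kind can be sketched as follows (the terms are illustrative):

```
fail* AND password
```

Here fail* matches fail, failed, failure and any other term starting with fail.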

Refining Search Results

We can further refine the search result by selecting a string and adding it to the search. In the below example, we click on the string 3351 and select the option Add to Search.

After 3351 is added to the search term, we get the below result, which shows only those lines from the log that contain 3351. Also, note how the timeline of the search result has changed as we have refined the search.

basic search 6

Splunk - Field Searching

When Splunk reads the uploaded machine data, it interprets the data and divides it into many fields, each of which represents a single logical fact about the data record.

For example, a single record of information may contain the server name, the timestamp of the event, and the type of the event being logged (a login attempt, an HTTP response, etc.). Even in the case of unstructured data, Splunk tries to divide the fields into key-value pairs or separate them based on the data types they have (numeric, string, etc.).

Continuing with the data uploaded in the previous chapter, we can see the fields from the secure.log file by clicking on the show fields link which will open up the following screen. We can notice the fields Splunk has generated from this log file.

field search 1

Choosing the Fields

We can choose what fields to be displayed by selecting or unselecting the fields from the list of all fields. Clicking on all fields opens a window showing the list of all the fields. Some of these fields have check marks against them showing they are already selected. We can use the check boxes to choose our fields for display.

Besides the name of the field, it displays the number of distinct values the field has, its data type, and the percentage of events this field is present in.

field search 2

Field Summary

Very detailed stats for every selected field become available by clicking on the name of the field. It shows all the distinct values for the field, their count and their percentages.

field search 3

The field names can also be inserted into the search box along with the specific values for the search. In the below example, we aim to find all the records for the date 15th Oct for the host named mailsecure_log. We get the result for this specific date.

field search 4
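A field-based search of this kind can be sketched as follows (date_mday and date_month are Splunk's default date-time fields; the host value is illustrative):

```
host="mailsecure_log" date_month=october date_mday=15
```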

Splunk - Time Range Search

The Splunk web interface displays a timeline which indicates the distribution of events over a range of time. There are preset time intervals from which you can select a specific time range, or you can customize the time range as per your need.

The below screen shows the various preset timeline options. Choosing any of these options fetches the data for only that specific time period, which you can analyse further using the custom timeline options available.

time range search 1

For example, choosing the previous month option gives us the result only for the previous month, as you can see in the spread of the timeline graph below.

time range search 2

Selecting a Time Subset

By clicking and dragging across the bars in the timeline, we can select a subset of the result that already exists. This does not cause the re-execution of the query. It only filters out the records from the existing result set.

The below image shows the selection of a subset from the result set −

time range search 3

Earliest and Latest

The two commands, earliest and latest, can be used in the search bar to indicate the time range within which you filter the results. It is similar to selecting a time subset, but it is done through commands rather than by clicking on a specific timeline bar. So, it provides finer control over the data range you can pick for your analysis.

time range search 4

In the above image, we give a time range between the last 7 days and the last 15 days. So, the data from between those two points in time is displayed.
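Such a range can be sketched as follows (the host value is illustrative; -15d and -7d are relative time modifiers meaning 15 and 7 days ago):

```
host="mailsecure_log" earliest=-15d latest=-7d
```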

Nearby Events

We can also find events near a specific time by mentioning how close we want the events to be when filtering. We have the option of choosing the scale of the interval, like seconds, minutes, days, weeks, etc.

Splunk - Sharing Exporting

When you run a search query, the result is stored as a job in the Splunk server. While this job is created by one specific user, it can be shared with other users so that they can start using this result set without having to build the query for it again. The results can also be exported and saved as files, which can be shared with users who do not use Splunk.

Sharing the Search Result

Once a query has run successfully, we can see a small upward arrow in the middle right of the web page. Clicking on this icon gives a URL where the query and the result can be accessed. There is a need to grant permission to the users who will be using this link. Permission is granted through the Splunk administration interface.

image::https://www.iokays.com/tutorialspoint/splunk/_images/share_export_1.jpg [Share Export1]

Finding the Saved Results

The jobs that are saved to be used by all users with appropriate permissions can be located by looking for the Jobs link under the Activity menu in the top right bar of the Splunk interface. In the below image, we click on the highlighted link named Jobs to find the saved jobs.

image::https://www.iokays.com/tutorialspoint/splunk/_images/share_export_3.jpg [Share Export3]

After the above link is clicked, we get the list of all the saved jobs as shown below. Here, we have to note that there is an expiry date after which the saved job will automatically get removed from Splunk. You can adjust this date by selecting the job, clicking on Edit Selected, and then choosing Extend Expiration.

image::https://www.iokays.com/tutorialspoint/splunk/_images/share_export_4.jpg [Share Export4]

Exporting the Search Result

We can also export the results of a search into a file. The three different formats available for export are CSV, XML and JSON. Clicking on the Export button after choosing a format downloads the file through the browser onto the local system. This is explained in the below image −

image::https://www.iokays.com/tutorialspoint/splunk/_images/share_export_2.jpg [Share Export2]

Splunk - Search Language

The Splunk Search Processing Language (SPL) is a language containing many commands, functions, arguments, etc., which are written to get the desired results from the datasets. For example, when you get a result set for a search term, you may further want to filter some more specific terms from the result set. For this, you need some additional commands to be added to the existing command. This is achieved by learning the usage of SPL.

Components of SPL

The SPL has the following components.

  1. Search Terms − These are the keywords or phrases you are looking for.

  2. Commands − The action you want to take on the result set, such as formatting the results or counting them.

  3. Functions − The computations you are going to apply on the results, such as sum, average, etc.

  4. Clauses − How to group or rename the fields in the result set.

Let us discuss all the components with the help of images in the below section −

Search Terms

These are the terms you mention in the search bar to get specific records from the dataset which meet the search criteria. In the below example, we are searching for records which contain two highlighted terms.

spl 1

Commands

You can use many in-built commands that SPL provides to simplify the process of analysing the data in the result set. In the below example we use the head command to filter out only the top 3 results from a search operation.

spl 2
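A search using the head command can be sketched as follows (the search terms are illustrative; the pipe character sends the search results into the command):

```
host="mailsecure_log" failed | head 3
```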

Functions

Along with commands, Splunk also provides many in-built functions which can take input from a field being analysed and give the output after applying calculations on that field. In the below example, we use the stats avg() function, which calculates the average value of the numeric field taken as input.

spl 3
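A search using stats avg() can be sketched as follows (the source and field names are illustrative):

```
source="web_application.log" | stats avg(bytes)
```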

Clauses

When we want to get results grouped by some specific field, or we want to rename a field in the output, we use the by clause and the as clause respectively. In the below example, we get the average size in bytes of each file present in the web_application log. As you can see, the result shows the name of each file as well as the average bytes for each file.

spl 4
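Such a search can be sketched as follows (the source and field names are illustrative); the by clause groups the statistic per file, and the as clause renames the output column:

```
source="web_application.log" | stats avg(bytes) as avg_bytes by file
```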