Elasticsearch Concise Tutorial

Elasticsearch - Document APIs

Elasticsearch provides single-document APIs and multi-document APIs, where a call targets a single document or multiple documents, respectively.

Index API

The Index API adds a JSON document to an index, or updates the document if it is already there, when a request is made to that index with a specific mapping. For example, the following request adds the JSON object to the schools index under the school mapping −

PUT schools/_doc/5
{
   "name":"City School", "description":"ICSE", "street":"West End",
   "city":"Meerut",
   "state":"UP", "zip":"250002", "location":[28.9926174, 77.692485],
   "fees":3500,
   "tags":["fully computerized"], "rating":"4.5"
}

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "5",
   "_version" : 1,
   "result" : "created",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 2,
   "_primary_term" : 1
}
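
As a rough illustration, the same request could be assembled from Python with only the standard library. This is a minimal sketch, not an official client: the function name `build_index_request` and the base URL `http://localhost:9200` are assumptions made for the example, and the request is built but not sent.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) a PUT request that indexes `document` under `doc_id`."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local cluster address; adjust to your own setup.
school = {"name": "City School", "city": "Meerut", "fees": 3500}
request = build_index_request("http://localhost:9200", "schools", 5, school)
print(request.get_method(), request.full_url)
# To send it against a running cluster: urllib.request.urlopen(request)
```

Keeping the construction separate from the network call makes it easy to inspect the method, URL, and body before anything is sent.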

Automatic Index Creation

When a request is made to add a JSON object to an index that does not exist, this API automatically creates the index and the underlying mapping for that JSON object. This behavior can be disabled by setting the following parameters to false in the elasticsearch.yml file.

action.auto_create_index: false
index.mapper.dynamic: false

You can also restrict automatic index creation so that only index names matching specific patterns are allowed, by changing the value of the following parameter −

action.auto_create_index: +acc*,-bank*

Note − Here + indicates allowed and - indicates not allowed.
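
The way such a pattern list might be evaluated can be sketched in Python. This is only an illustration under two stated assumptions: the first pattern that matches the index name decides the outcome, and a name that matches no pattern is rejected. The function name `auto_create_allowed` is hypothetical, and the wildcard matching uses Python's `fnmatch` rather than Elasticsearch's own matcher.

```python
from fnmatch import fnmatchcase


def auto_create_allowed(index_name, setting):
    """Return True if `setting` (e.g. "+acc*,-bank*") permits creating `index_name`."""
    for raw in setting.split(","):
        pattern = raw.strip()
        allow = not pattern.startswith("-")   # "-" forbids, "+" or no prefix allows
        pattern = pattern.lstrip("+-")
        if fnmatchcase(index_name, pattern):
            return allow                      # assumption: first matching pattern decides
    return False                              # assumption: unmatched names are rejected


print(auto_create_allowed("accounts", "+acc*,-bank*"))
print(auto_create_allowed("bank_eu", "+acc*,-bank*"))
```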

Versioning

Elasticsearch also provides a version control facility. We can use the version query parameter to specify the version of a particular document.

PUT schools/_doc/5?version=7&version_type=external
{
   "name":"Central School", "description":"CBSE Affiliation", "street":"Nagan",
   "city":"paprola", "state":"HP", "zip":"176115", "location":[31.8955385, 76.8380405],
   "fees":2200, "tags":["Senior Secondary", "beautiful campus"], "rating":"3.3"
}

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "5",
   "_version" : 7,
   "result" : "updated",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 3,
   "_primary_term" : 1
}

Versioning is applied in real time and is not affected by real-time search operations.

The two most important types of versioning are −

Internal Versioning

Internal versioning is the default. The version starts at 1 and is incremented with each update, deletes included.

External Versioning

External versioning is used when the versions of the documents are stored in an external system, such as a third-party versioning system. To enable it, set version_type to external. Elasticsearch then stores the version number designated by the external system and does not increment it automatically.
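
The two schemes can be sketched as a small Python model. It is an illustration rather than Elasticsearch's implementation: `next_version` and `VersionConflict` are hypothetical names, and the model assumes that an external version is accepted only when it is strictly greater than the stored one.

```python
class VersionConflict(Exception):
    """Hypothetical error raised when a write carries a stale version."""


def next_version(stored, supplied=None, version_type="internal"):
    """Return the version a write would be stored with.

    stored   -- current version of the document, or None if it does not exist
    supplied -- version sent by the client (used by external versioning only)
    """
    if version_type == "internal":
        # Internal versions start at 1 and grow by one on every write.
        return 1 if stored is None else stored + 1
    if version_type == "external":
        # Assumption: an external version must exceed the stored one.
        if stored is not None and supplied <= stored:
            raise VersionConflict(f"supplied {supplied} <= stored {stored}")
        return supplied
    raise ValueError(f"unknown version_type: {version_type}")


v = next_version(None)               # first write, internal versioning
v = next_version(v, 7, "external")   # the external system designates version 7
print(v)
```

This mirrors the example above, where document 5 moves from version 1 to version 7 once the request supplies version=7 with version_type=external.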

Operation Type

The operation type is used to force a create operation. This helps to avoid overwriting an existing document.

PUT chapter/_doc/1?op_type=create
{
   "Text":"this is chapter one"
}

Running the above request returns the following result −

{
   "_index" : "chapter",
   "_type" : "_doc",
   "_id" : "1",
   "_version" : 1,
   "result" : "created",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 0,
   "_primary_term" : 1
}
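
The difference between a plain index operation and a forced create can be sketched with an in-memory dictionary standing in for the index. The names `index_document` and `DocumentExists` are hypothetical, and the model only captures the overwrite rule described above.

```python
class DocumentExists(Exception):
    """Hypothetical stand-in for the failure of a create on an existing ID."""


def index_document(store, doc_id, document, op_type="index"):
    """Write `document` into `store` (a dict keyed by ID) and report the outcome."""
    if op_type == "create" and doc_id in store:
        # A forced create refuses to overwrite an existing document.
        raise DocumentExists(doc_id)
    result = "updated" if doc_id in store else "created"
    store[doc_id] = document
    return result


chapter = {}
print(index_document(chapter, "1", {"Text": "this is chapter one"}, op_type="create"))
```

Repeating the same call with op_type="create" raises the error, while the default operation type would overwrite the document and report "updated".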

Automatic ID generation

When no ID is specified in an index operation, Elasticsearch automatically generates an ID for that document.

POST chapter/_doc/
{
   "user" : "tpoint",
   "post_date" : "2018-12-25T14:12:12",
   "message" : "Elasticsearch Tutorial"
}

Running the above request returns the following result −

{
   "_index" : "chapter",
   "_type" : "_doc",
   "_id" : "PVghWGoB7LiDTeV6LSGu",
   "_version" : 1,
   "result" : "created",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 1,
   "_primary_term" : 1
}
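
The choice between the two request forms used so far can be sketched as a small helper. The function name `index_endpoint` is hypothetical; it only reproduces the pattern of the examples above, where a request with an ID uses PUT and a request without one uses POST.

```python
def index_endpoint(index, doc_id=None):
    """Return the (HTTP method, path) pair for indexing into `index`."""
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return "POST", f"{index}/_doc/"
    # ID supplied by the caller: PUT to the document's own path.
    return "PUT", f"{index}/_doc/{doc_id}"


print(index_endpoint("chapter"))
print(index_endpoint("chapter", 1))
```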

Get API

The Get API retrieves a JSON document by performing a GET request for that particular document.

GET schools/_doc/5

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "5",
   "_version" : 7,
   "_seq_no" : 3,
   "_primary_term" : 1,
   "found" : true,
   "_source" : {
      "name" : "Central School",
      "description" : "CBSE Affiliation",
      "street" : "Nagan",
      "city" : "paprola",
      "state" : "HP",
      "zip" : "176115",
      "location" : [
         31.8955385,
         76.8380405
      ],
      "fees" : 2200,
      "tags" : [
         "Senior Secondary",
         "beautiful campus"
      ],
      "rating" : "3.3"
   }
}

  1. This operation is real time and is not affected by the refresh rate of the index.

  2. You can also specify a version, in which case Elasticsearch fetches the document only if it is at that version.

  3. You can also specify _all in the request, so that Elasticsearch searches for that document ID in every type and returns the first matching document.

  4. You can also specify the fields you want in the result from that particular document.

GET schools/_doc/5?_source_includes=name,fees

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "5",
   "_version" : 7,
   "_seq_no" : 3,
   "_primary_term" : 1,
   "found" : true,
   "_source" : {
      "fees" : 2200,
      "name" : "Central School"
   }
}
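
The effect of _source_includes on the returned document can be sketched in Python. This is a simplified model: the function name `filter_source` is hypothetical, the sample document is a shortened copy of the one above, and only exact top-level field names are handled.

```python
def filter_source(source, includes):
    """Keep only the top-level fields of `source` whose names are in `includes`."""
    wanted = set(includes)
    return {field: value for field, value in source.items() if field in wanted}


source = {"name": "Central School", "city": "paprola", "fees": 2200, "rating": "3.3"}
print(filter_source(source, ["name", "fees"]))
```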

You can also fetch the source part in the result by adding the _source parameter to the get request.

GET schools/_doc/5?_source

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "5",
   "_version" : 7,
   "_seq_no" : 3,
   "_primary_term" : 1,
   "found" : true,
   "_source" : {
      "name" : "Central School",
      "description" : "CBSE Affiliation",
      "street" : "Nagan",
      "city" : "paprola",
      "state" : "HP",
      "zip" : "176115",
      "location" : [
         31.8955385,
         76.8380405
      ],
      "fees" : 2200,
      "tags" : [
         "Senior Secondary",
         "beautiful campus"
      ],
      "rating" : "3.3"
   }
}

You can also refresh the shard before the get operation by setting the refresh parameter to true.
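
The query parameters mentioned in this section can be combined on a single get request. The sketch below builds such a URL with the Python standard library; the function name `get_url` and the base URL `http://localhost:9200` are assumptions made for the example.

```python
from urllib.parse import urlencode


def get_url(base_url, index, doc_id, **params):
    """Build the URL for a get request, appending any query parameters."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    if params:
        # Keep commas unescaped so field lists stay readable.
        url += "?" + urlencode(params, safe=",")
    return url


print(get_url("http://localhost:9200", "schools", 5,
              _source_includes="name,fees", refresh="true"))
```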

Delete API

You can delete a particular index, mapping or document by sending an HTTP DELETE request to Elasticsearch.

DELETE schools/_doc/4

Running the above request returns the following result −

{
   "found":true, "_index":"schools", "_type":"school", "_id":"4", "_version":2,
   "_shards":{"total":2, "successful":1, "failed":0}
}

A version of the document can be specified so that the delete applies only to that version. A routing parameter can be specified to delete the document from a particular user; the operation fails if the document does not belong to that user. As with the Get API, you can specify the refresh and timeout options for this operation.

Update API

The Update API uses a script to perform the update, and versioning ensures that no other update has happened between the get and the re-index. For example, you can update the name of a school using a script −

POST schools/_update/4
{
   "script" : {
      "source": "ctx._source.name = params.sname",
      "lang": "painless",
      "params" : {
         "sname" : "City Wise School"
      }
   }
}

Running the above request returns the following result −

{
   "_index" : "schools",
   "_type" : "_doc",
   "_id" : "4",
   "_version" : 3,
   "result" : "updated",
   "_shards" : {
      "total" : 2,
      "successful" : 1,
      "failed" : 0
   },
   "_seq_no" : 4,
   "_primary_term" : 2
}

You can check the update by sending a get request for the updated document.
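
What the update script does to the stored document can be mirrored on a plain Python dictionary. This is a sketch only: `apply_script` is a hypothetical name, and the contents of `before` are invented for illustration, since the stored fields of document 4 are not shown in this tutorial.

```python
def apply_script(document, params):
    """Mirror the script `ctx._source.name = params.sname` on a plain dict."""
    updated = dict(document)            # copy so the caller's dict is untouched
    updated["name"] = params["sname"]   # overwrite only the name field
    return updated


# Hypothetical document contents, used only to show the effect.
before = {"name": "City School", "fees": 3500}
after = apply_script(before, {"sname": "City Wise School"})
print(after["name"])
```

Passing the new value through params, as the request does, keeps the script text constant while the value changes.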