Miscellaneous Elasticsearch Operation Support

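Index settings definition (via the @Setting annotation)
Index mapping definition (via the @Mapping annotation)
Filter Builder
Scroll API for fetching large result sets
Custom sort options
Runtime fields (from Elasticsearch 7.12 on)
Point In Time API
Search template support
Nested sort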

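This chapter covers additional support for Elasticsearch operations that cannot be accessed directly through the repository interface. It is recommended to add these operations as custom implementations, as described in repositories/custom-implementations.adoc.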

Index settings

When creating Elasticsearch indices with Spring Data Elasticsearch, different index settings can be defined with the @Setting annotation. The following arguments are available:

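useServerConfiguration does not send any settings parameters, so the Elasticsearch server configuration determines them.
settingPath refers to a JSON file defining the settings; the file must be resolvable on the classpath
shards the number of shards to use, defaults to 1
replicas the number of replicas, defaults to 1
refreshInterval, defaults to "1s"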
indexStoreType, defaults to "fs"

It is also possible to define index sorting (see the Elasticsearch documentation for the possible field types and values):

@Document(indexName = "entities")
@Setting(
  sortFields = { "secondField", "firstField" },                                  1
  sortModes = { Setting.SortMode.max, Setting.SortMode.min },                    2
  sortOrders = { Setting.SortOrder.desc, Setting.SortOrder.asc },
  sortMissingValues = { Setting.SortMissing._last, Setting.SortMissing._first })
class Entity {
    @Nullable
    @Id private String id;

    @Nullable
    @Field(name = "first_field", type = FieldType.Keyword)
    private String firstField;

    @Nullable @Field(name = "second_field", type = FieldType.Keyword)
    private String secondField;

    // getter and setter...
}

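1	When defining sort fields, use the name of the Java property (firstField), not the name that might be defined for Elasticsearch (first_field)
2	`sortModes`, `sortOrders` and `sortMissingValues` are optional, but if they are set, the number of entries must match the number of `sortFields` elements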
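
The length rule from callout 2 can be checked before the annotation values are ever used. The following is a minimal plain-Java sketch of that rule; `SortSettingsCheck` and `isConsistent` are hypothetical names chosen for illustration and are not part of Spring Data Elasticsearch.

```java
import java.util.List;

// Hypothetical helper, not part of Spring Data Elasticsearch.
class SortSettingsCheck {

    // An optional setting is acceptable when it is empty (not set) or has
    // exactly one entry per sort field.
    static boolean isConsistent(List<String> sortFields, List<List<String>> optionalSettings) {
        for (List<String> setting : optionalSettings) {
            if (!setting.isEmpty() && setting.size() != sortFields.size()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> fields = List.of("secondField", "firstField");
        // sortModes has two entries, sortOrders is not set: consistent
        System.out.println(isConsistent(fields, List.of(List.of("max", "min"), List.<String>of())));
        // sortOrders has only one entry for two sort fields: inconsistent
        System.out.println(isConsistent(fields, List.of(List.of("desc"))));
    }
}
```

The empty list stands for an attribute that was left at its default, which is why it passes regardless of how many sort fields there are.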

Index Mapping

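When Spring Data Elasticsearch creates the index mapping with the `IndexOperations.createMapping()` method, it uses the annotations described in Mapping Annotation Overview, especially the `@Field` annotation. In addition, the `@Mapping` annotation can be added to a class. This annotation has the following properties: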

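mappingPath a classpath resource in JSON format; if this is not empty, it is used as the mapping and no other mapping processing is done.
enabled when set to false, this flag is written to the mapping and no further processing is done.
dateDetection and numericDetection set the corresponding properties in the mapping when not set to DEFAULT.
dynamicDateFormats when this String array is not empty, it defines the date formats used for automatic date detection.
runtimeFieldsPath a classpath resource in JSON format containing the runtime field definitions that are written to the index mappings, for example: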

{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}

Filter Builder

The filter builder improves query speed.

private ElasticsearchOperations operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query query = NativeQuery.builder()
	.withQuery(q -> q
		.matchAll(ma -> ma))
	.withFilter( q -> q
		.bool(b -> b
			.must(m -> m
				.term(t -> t
					.field("id")
					.value(documentId))
			)))
	.build();

SearchHits<SampleEntity> sampleEntities = operations.search(query, SampleEntity.class, index);

Using Scroll For Big Result Set

Elasticsearch has a scroll API for fetching large result sets in chunks. Spring Data Elasticsearch uses it internally to provide the implementation of the <T> SearchHitsIterator<T> SearchOperations.searchForStream(Query query, Class<T> clazz, IndexCoordinates index) method.

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query searchQuery = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();

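SearchHitsIterator<SampleEntity> stream = elasticsearchOperations.searchForStream(searchQuery, SampleEntity.class,
    index);

// the iterator yields SearchHit<SampleEntity> wrappers, not the bare entities
List<SearchHit<SampleEntity>> sampleEntities = new ArrayList<>();
while (stream.hasNext()) {
  sampleEntities.add(stream.next());
}

// release the scroll context on the server
stream.close();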
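
Because the iterator has to be closed, a try-with-resources block is one way to make sure that happens even when processing fails halfway through. The sketch below shows the pattern in plain Java so that it runs without a cluster; `CloseableIter`, `CountingIter` and `ScrollDrainSketch` are hypothetical stand-ins, and the sketch assumes the real iterator type can be used in try-with-resources in the same way.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a closeable result iterator.
interface CloseableIter<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close(); // narrowed so that no checked exception is declared
}

// Hypothetical in-memory implementation that records whether it was closed.
class CountingIter implements CloseableIter<String> {
    private final Iterator<String> delegate;
    boolean closed = false;

    CountingIter(List<String> items) {
        this.delegate = items.iterator();
    }

    @Override public boolean hasNext() { return delegate.hasNext(); }
    @Override public String next() { return delegate.next(); }
    @Override public void close() { closed = true; }
}

class ScrollDrainSketch {

    // Collects all remaining elements and closes the iterator afterwards,
    // also when an exception is thrown while iterating.
    static <T> List<T> drain(CloseableIter<T> iterator) {
        List<T> result = new ArrayList<>();
        try (CloseableIter<T> it = iterator) {
            while (it.hasNext()) {
                result.add(it.next());
            }
        }
        return result;
    }
}
```

With this shape the explicit `close()` call at the end of the loop is no longer needed, since leaving the `try` block closes the iterator.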

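If the scroll id must be accessed, the SearchOperations API has no method for that, but the following methods of AbstractElasticsearchTemplate can be used (this is the base implementation for the different ElasticsearchOperations implementations):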

@Autowired ElasticsearchOperations operations;

AbstractElasticsearchTemplate template = (AbstractElasticsearchTemplate)operations;

IndexCoordinates index = IndexCoordinates.of("sample-index");

Query query = NativeQuery.builder()
    .withQuery(q -> q
        .matchAll(ma -> ma))
    .withFields("message")
    .withPageable(PageRequest.of(0, 10))
    .build();

SearchScrollHits<SampleEntity> scroll = template.searchScrollStart(1000, query, SampleEntity.class, index);

String scrollId = scroll.getScrollId();
List<SearchHit<SampleEntity>> sampleEntities = new ArrayList<>(); // getSearchHits() returns SearchHit wrappers
while (scroll.hasSearchHits()) {
  sampleEntities.addAll(scroll.getSearchHits());
  scrollId = scroll.getScrollId();
  scroll = template.searchScrollContinue(scrollId, 1000, SampleEntity.class);
}
template.searchScrollClear(scrollId);

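To use the scroll API with repository methods, the return type must be defined as Stream in the Elasticsearch repository. The implementation of the method will then use the scroll methods of the ElasticsearchTemplate.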

interface SampleEntityRepository extends Repository<SampleEntity, String> {

    Stream<SampleEntity> findBy();

}

Sort options

In addition to the default sort options described in Paging and Sorting, Spring Data Elasticsearch provides the class org.springframework.data.elasticsearch.core.query.Order, which derives from org.springframework.data.domain.Sort.Order. It offers additional parameters that can be sent to Elasticsearch when specifying the sorting of the result (see https://www.elastic.co/guide/en/elasticsearch/reference/7.15/sort-search-results.html).

There is also the org.springframework.data.elasticsearch.core.query.GeoDistanceOrder class, which can be used to order the results of a search operation by geographical distance.

If the class to be retrieved has a GeoPoint property named location, the following Sort orders the results by distance to the given point:

Sort.by(new GeoDistanceOrder("location", new GeoPoint(48.137154, 11.5761247)))
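
To make the resulting order concrete, the following plain-Java sketch sorts a few places by their great-circle distance to the same reference point. It only illustrates the idea of a distance sort on the client side: Elasticsearch computes the distances on the server, and the haversine formula, the `GeoDistanceSortSketch` class and the city coordinates used here are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not part of Spring Data Elasticsearch.
class GeoDistanceSortSketch {

    record Place(String name, double lat, double lon) {}

    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometres using the haversine formula.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Returns a copy of the list ordered by ascending distance to (lat, lon).
    static List<Place> sortByDistance(List<Place> places, double lat, double lon) {
        List<Place> sorted = new ArrayList<>(places);
        sorted.sort(Comparator.comparingDouble(
                (Place p) -> distanceKm(lat, lon, p.lat(), p.lon())));
        return sorted;
    }

    public static void main(String[] args) {
        List<Place> places = List.of(
                new Place("Hamburg", 53.5511, 9.9937),
                new Place("Augsburg", 48.3705, 10.8978),
                new Place("Berlin", 52.52, 13.405));
        // Reference point taken from the Sort example above
        for (Place p : sortByDistance(places, 48.137154, 11.5761247)) {
            System.out.println(p.name());
        }
    }
}
```

The nearest place comes first, which corresponds to an ascending distance sort.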

Runtime Fields

Since version 7.12, Elasticsearch offers runtime fields (https://www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime.html). Spring Data Elasticsearch supports this feature in two ways:

Runtime field definitions in the index mappings

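The first way to define runtime fields is to add the definitions to the index mappings (see https://www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-mapping-fields.html). To use this approach in Spring Data Elasticsearch, the user must provide a JSON file that contains the corresponding definition, for example: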

Example 1. runtime-fields.json

{
  "day_of_week": {
    "type": "keyword",
    "script": {
      "source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
    }
  }
}

The path to this JSON file, which must be present on the classpath, must then be set in the @Mapping annotation of the entity:

@Document(indexName = "runtime-fields")
@Mapping(runtimeFieldsPath = "/runtime-fields.json")
public class RuntimeFieldEntity {
	// properties, getter, setter,...
}

Runtime fields definitions set on a Query

The second way to define runtime fields is to add the definitions to a search query (see https://www.elastic.co/guide/en/elasticsearch/reference/7.12/runtime-search-request.html). The following code example shows how to do this with Spring Data Elasticsearch:

The entity used is a simple object that has a price property:

@Document(indexName = "some_index_name")
public class SomethingToBuy {

	private @Id @Nullable String id;
	@Nullable @Field(type = FieldType.Text) private String description;
	@Nullable @Field(type = FieldType.Double) private Double price;

	// getter and setter
}

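The following query uses a runtime field that calculates a priceWithTax value by adding 19% to the price, and uses this value in the search query to find all entities whose priceWithTax is higher than or equal to a given value: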

RuntimeField runtimeField = new RuntimeField("priceWithTax", "double", "emit(doc['price'].value * 1.19)");
Query query = new CriteriaQuery(new Criteria("priceWithTax").greaterThanEqual(16.5));
query.addRuntimeField(runtimeField);

SearchHits<SomethingToBuy> searchHits = operations.search(query, SomethingToBuy.class);

This works with every implementation of the Query interface.
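
As a worked example of what the runtime field computes, the plain-Java sketch below applies the same expression (price times 1.19) and the same threshold comparison to a handful of prices. The class `PriceWithTaxSketch` and the sample prices are hypothetical; in the real query the script runs inside Elasticsearch, not in the application.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the runtime field arithmetic.
class PriceWithTaxSketch {

    // Same expression as the runtime field script: the price plus 19%.
    static double priceWithTax(double price) {
        return price * 1.19;
    }

    // Keeps the prices whose computed priceWithTax is >= the threshold,
    // mirroring greaterThanEqual(threshold) on the runtime field.
    static List<Double> matching(List<Double> prices, double threshold) {
        return prices.stream()
                .filter(price -> priceWithTax(price) >= threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 10.0 and 13.0 stay below 16.5 after tax; 14.0 and 15.0 do not
        System.out.println(matching(List.of(10.0, 13.0, 14.0, 15.0), 16.5));
    }
}
```

A price of 14.0 becomes roughly 16.66 with tax and therefore matches, while 13.0 becomes roughly 15.47 and does not.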

Point In Time (PIT) API

ElasticsearchOperations supports the point in time API of Elasticsearch (see https://www.elastic.co/guide/en/elasticsearch/reference/8.3/point-in-time-api.html). The following code snippet shows how to use this feature with a fictional Person class:

ElasticsearchOperations operations; // autowired
Duration tenSeconds = Duration.ofSeconds(10);

String pit = operations.openPointInTime(IndexCoordinates.of("person"), tenSeconds); 1

// create query for the pit
Query query1 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Smith"))
    .withPointInTime(new Query.PointInTime(pit, tenSeconds))                        2
    .build();
SearchHits<Person> searchHits1 = operations.search(query1, Person.class);
// do something with the data

// create 2nd query for the pit, use the id returned in the previous result
Query query2 = new CriteriaQueryBuilder(Criteria.where("lastName").is("Miller"))
    .withPointInTime(
        new Query.PointInTime(searchHits1.getPointInTimeId(), tenSeconds))          3
    .build();
SearchHits<Person> searchHits2 = operations.search(query2, Person.class);
// do something with the data

operations.closePointInTime(searchHits2.getPointInTimeId());                        4

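1	Create a point in time for an index (can be multiple names) with a keep-alive duration and retrieve its id
2	Pass that id into the query to search together with the next keep-alive value
3	For the next query, use the id returned from the previous search
4	When done, close the point in time using the last returned id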

Search Template support

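The search template API is supported. To use it, a stored script must be created first. The ElasticsearchOperations interface extends ScriptOperations, which provides the necessary functions. The example used here assumes a Person entity with a property named firstName. A search template script can be saved like this: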

import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.script.Script;

operations.putScript(                            1
  Script.builder()
    .withId("person-firstname")                  2
    .withLanguage("mustache")                    3
    .withSource("""                              4
      {
        "query": {
          "bool": {
            "must": [
              {
                "match": {
                  "firstName": "{{firstName}}"   5
                }
              }
            ]
          }
        },
        "from": "{{from}}",                      6
        "size": "{{size}}"                       7
      }
      """)
    .build()
);

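1	Use the `putScript()` method to store a search template script
2	The name / id of the script
3	Scripts that are used in search templates must be in the mustache language
4	The script source
5	The search parameter in the script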
6	Paging request offset
7	Paging request size

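To use a search template in a search query, Spring Data Elasticsearch provides the SearchTemplateQuery, an implementation of the org.springframework.data.elasticsearch.core.query.Query interface.

The following code adds a call that uses a search template query to a custom repository implementation (see repositories/custom-implementations.adoc), as an example of how this can be integrated into a repository call.

First, we define the custom repository fragment interface: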

interface PersonCustomRepository {
	SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable);
}

The implementation of this repository fragment looks like this:

public class PersonCustomRepositoryImpl implements PersonCustomRepository {

  private final ElasticsearchOperations operations;

  public PersonCustomRepositoryImpl(ElasticsearchOperations operations) {
    this.operations = operations;
  }

  @Override
  public SearchPage<Person> findByFirstNameWithSearchTemplate(String firstName, Pageable pageable) {

    var query = SearchTemplateQuery.builder()                               1
      .withId("person-firstname")                                           2
      .withParams(
        Map.of(                                                             3
          "firstName", firstName,
          "from", pageable.getOffset(),
          "size", pageable.getPageSize()
          )
      )
      .build();

    SearchHits<Person> searchHits = operations.search(query, Person.class); 4

    return SearchHitSupport.searchPageFor(searchHits, pageable);
  }
}

1	Create a `SearchTemplateQuery`
2	Provide the id of the search template
3	The parameters are passed in a `Map<String,Object>`
4	Do the search in the same way as with the other query types
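
To see how the paging values end up in the template, the plain-Java sketch below builds the same three parameters and substitutes them into a shortened template string. The offset is the page number multiplied by the page size, which is what `pageable.getOffset()` supplies. Elasticsearch renders the stored template on the server with Mustache; the naive `render` method, the `TemplateParamsSketch` class and the shortened template are assumptions made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of Spring Data Elasticsearch.
class TemplateParamsSketch {

    // Builds the parameter map for the search template.
    static Map<String, Object> params(String firstName, int pageNumber, int pageSize) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("firstName", firstName);
        params.put("from", (long) pageNumber * pageSize); // paging request offset
        params.put("size", pageSize);                     // paging request size
        return params;
    }

    // Naive stand-in for Mustache rendering: replaces {{name}} placeholders.
    static String render(String template, Map<String, Object> params) {
        String result = template;
        for (Map.Entry<String, Object> entry : params.entrySet()) {
            result = result.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "{ \"from\": \"{{from}}\", \"size\": \"{{size}}\" }";
        // Third page (index 2) with ten entries per page
        System.out.println(render(template, params("John", 2, 10)));
    }
}
```

For the third page with ten entries per page, the offset passed as `from` is 20.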

Nested sort

Spring Data Elasticsearch supports sorting within nested objects (https://www.elastic.co/guide/en/elasticsearch/reference/8.9/sort-search-results.html#nested-sorting).

The following example, taken from the org.springframework.data.elasticsearch.core.query.sort.NestedSortIntegrationTests class, shows how to define the nested sort.

var filter = StringQuery.builder("""
	{ "term": {"movies.actors.sex": "m"} }
	""").build();
var order = new org.springframework.data.elasticsearch.core.query.Order(Sort.Direction.DESC,
	"movies.actors.yearOfBirth")
	.withNested(
		Nested.builder("movies")
			.withNested(
				Nested.builder("movies.actors")
					.withFilter(filter)
					.build())
			.build());

var query = Query.findAll().addSort(Sort.by(order));

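About the filter query: a CriteriaQuery cannot be used here, because it would be converted into an Elasticsearch nested query, which does not work in the filter context. Only StringQuery or NativeQuery can therefore be used here. When using one of these, like the term query above, the Elasticsearch field names must be used, so take care when these are redefined with the @Field(name="…") definition.

For the definition of the order path and the nested paths, the Java entity property names must be used.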