Hibernate Search 中文操作指南

15. Searching

除了简单的索引之外,Hibernate Search 还公开了高级 API,可用于在不必使用本机 API 的情况下搜索这些索引。

Beyond simply indexing, Hibernate Search also exposes high-level APIs to search these indexes without having to resort to native APIs.

这些搜索 API 的一项主要特性是能够使用索引来执行搜索,但返回从数据库加载的实体,有效地为 Hibernate ORM 实体提供了一种新型查询。

One key feature of these search APIs is the ability to use indexes to perform the search, but to return entities loaded from the database, effectively offering a new type of query for Hibernate ORM entities.

15.1. Query DSL

15.1.1. Basics

准备和执行查询只需要几行代码:

Preparing and executing a query requires just a few lines:

示例 160. 执行搜索查询

. Example 160. Executing a search query

// Not shown: open a transaction if relevant
SearchSession searchSession = /* ... */ (1)

SearchResult<Book> result = searchSession.search( Book.class ) (2)
        .where( f -> f.match() (3)
                .field( "title" )
                .matching( "robot" ) )
        .fetch( 20 ); (4)

long totalHitCount = result.total().hitCount(); (5)
List<Book> hits = result.hits(); (6)
// Not shown: commit the transaction if relevant

这种方法配合 Hibernate ORM integration 将能正常工作:默认情况下,搜索查询的命中内容将由 Hibernate ORM entities 管理,受用于创建搜索会话的实体管理器约束。这提供了 Hibernate ORM 的所有优点,尤其是根据需要检索关联实体来浏览实体图的能力。

This will work fine with the Hibernate ORM integration: by default, the hits of a search query will be entities managed by Hibernate ORM, bound to the entity manager used to create the search session. This provides all the benefits of Hibernate ORM, in particular the ability to navigate the entity graph to retrieve associated entities if necessary.

对于 Standalone POJO Mapper ,上面的代码片段默认情况下将失败。

For the Standalone POJO Mapper, the snippet above will fail by default.

您将需要执行以下操作之一:

You will need to either:

configure target entity types to enable loading,如果您想从外部数据源加载实体。

configure target entity types to enable loading, if you want to load entities from an external datasource.

要从索引的内容中重建实体,可以在目标实体类型中添加一个 projection constructor

add a projection constructor to target entity types, if you want to reconstruct entities from the content of the index.

相反,使用显式的 projections 检索来自索引的特定数据。

use explicit projections to retrieve specific data from the index instead.
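
例如,下面是第三种方式(显式投影)的一个简单示意,假设 "title" 字段已被索引为可投影(projectable)的字段:

For example, here is a minimal sketch of the third option (explicit projections), assuming the "title" field is indexed as projectable:

List<String> titles = searchSession.search( Book.class )
        .select( f -> f.field( "title", String.class ) ) // retrieve the "title" field from the index instead of loading entities
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .fetchHits( 20 );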

查询 DSL 提供了许多功能,在以下部分中有详细说明。一些常用的功能包括:

The query DSL offers many features, detailed in the following sections. Some commonly used features include:

  1. predicates, the main component of a search query, i.e. the condition that every document must satisfy in order to be included in search results.

  2. fetching the results differently: getting the hits directly as a list, using pagination, scrolling, etc.

  3. sorts, to order the hits in various ways: by score, by the value of a field, by distance to a point, etc.

  4. projections, to retrieve hits that are not just managed entities: data can be extracted from the index (field values), or even from both the index and the database.

  5. aggregations, to group hits and compute aggregated metrics for each group — hit count by category, for example.

15.1.2. Advanced entity types targeting

Targeting multiple entity types

当多个实体类型具有类似的索引字段时,可以在单个搜索查询中跨这些多个类型进行搜索:搜索结果将包含来自任何目标类型的命中。

When multiple entity types have similar indexed fields, it is possible to search across these multiple types in a single search query: the search result will contain hits from any of the targeted types.

示例 161. 在单个搜索查询中指定多个实体类型

. Example 161. Targeting multiple entity types in a single search query

SearchResult<Person> result = searchSession.search( Arrays.asList( (1)
        Manager.class, Associate.class
) )
        .where( f -> f.match() (2)
                .field( "name" )
                .matching( "james" ) )
        .fetch( 20 ); (3)

多实体(多索引)搜索只能在所有目标索引中的谓词/排序等中引用的字段相同的情况下才能正常运行(相同类型,相同分析器,……)。仅在其中一个目标索引中定义的字段也能正常工作。

Multi-entity (multi-index) searches will only work well as long as the fields referenced in predicates/sorts/etc. are identical in all targeted indexes (same type, same analyzer, …​). Fields that are defined in only one of the targeted indexes will also work correctly.

如果您要引用在其中一个目标索引中稍有不同的索引字段(不同的类型,不同的分析器,……),请参见 Targeting multiple fields

If you want to reference index fields that are even slightly different in one of the targeted indexes (different type, different analyzer, …​), see Targeting multiple fields.

Targeting entity types by name

虽然很少有必要使用它,但它也可以用 entity names 代替类来指定搜索针对的 entity types

Though rarely necessary, it is also possible to use entity names instead of classes to designate the entity types targeted by the search:

示例 162. 按名称指定实体类型

. Example 162. Targeting entity types by name

SearchResult<Person> result = searchSession.search( (1)
        searchSession.scope( (2)
                Person.class,
                Arrays.asList( "Manager", "Associate" )
        )
)
        .where( f -> f.match() (3)
                .field( "name" )
                .matching( "james" ) )
        .fetch( 20 ); (4)

15.1.3. Fetching results

Basics

在 Hibernate Search 中,默认搜索结果比“命中列表”复杂一点。这就是默认方法返回一个复合 SearchResult 对象的原因,该对象提供 getter 以检索您想要的结果部分,如下面的示例所示。

In Hibernate Search, the default search result is a bit more complicated than just "a list of hits". This is why the default methods return a composite SearchResult object offering getters to retrieve the part of the result you want, as shown in the example below.

示例 163. 从 SearchResult 中获取信息

. Example 163. Getting information from a SearchResult

SearchResult<Book> result = searchSession.search( Book.class ) (1)
        .where( f -> f.matchAll() )
        .fetch( 20 ); (2)

long totalHitCount = result.total().hitCount(); (3)
List<Book> hits = result.hits(); (4)
// ... (5)

对于只关心命中数量而不关心命中本身的情况,可以只检索命中总数:

It is possible to retrieve the total hit count alone, for cases where only the number of hits is of interest, not the hits themselves:

示例 164. 直接获取命中总数

. Example 164. Getting the total hit count directly

long totalHitCount = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetchTotalHitCount();

也可以直接获取前几名命中,而无需通过 SearchResult,如果只有前几名命中有用,而不需要命中总数,这会很方便:

The top hits can also be obtained directly, without going through a SearchResult, which can be handy if only the top hits are useful, and not the total hit count:

示例 165. 直接获取最高命中

. Example 165. Getting the top hits directly

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

如果只期待零到一个命中,可以将其检索为 Optional。如果返回多个命中,将抛出异常。

If only zero to one hit is expected, it is possible to retrieve it as an Optional. An exception will be thrown if more than one hit is returned.

示例 166. 直接获取唯一命中

. Example 166. Getting the only hit directly

Optional<Book> hit = searchSession.search( Book.class )
        .where( f -> f.id().matching( 1 ) )
        .fetchSingleHit();
Fetching all hits

获取所有命中内容很少是个好主意:如果查询匹配许多实体/文档,这可能会导致在内存中加载数百万个实体,这可能会导致 JVM 崩溃,或者至少会大幅降低其速度。

Fetching all hits is rarely a good idea: if the query matches many entities/documents, this may lead to loading millions of entities in memory, which will likely crash the JVM, or at the very least slow it down to a crawl.

如果您知道您的查询总是有少于 N 次命中,请考虑将限制设置为 N 以避免出现内存问题。

If you know your query will always have less than N hits, consider setting the limit to N to avoid memory issues.

如果没有命中次数的上限,您应该考虑使用 Pagination Scrolling 以批量检索数据。

If there is no bound to the number of hits you expect, you should consider Pagination or Scrolling to retrieve data in batches.

如果您仍然想在一次调用中获取所有命中,请注意,由于 Elasticsearch 集群的内部安全机制,Elasticsearch 后端一次最多只会返回 10,000 条命中。

If you still want to fetch all hits in one call, be aware that the Elasticsearch backend will only ever return 10,000 hits at a time, due to internal safety mechanisms in the Elasticsearch cluster.

示例 167. 在 SearchResult 中获取所有命中内容

. Example 167. Getting all hits in a SearchResult

SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.id().matchingAny( Arrays.asList( 1, 2 ) ) )
        .fetchAll();

long totalHitCount = result.total().hitCount();
List<Book> hits = result.hits();
示例 168. 直接获取所有命中内容

. Example 168. Getting all hits directly

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.id().matchingAny( Arrays.asList( 1, 2 ) ) )
        .fetchAllHits();
Fetching the total (hit count, …​)

SearchResultTotal 包含与该查询匹配的所有命中的命中计数,无论命中是否属于当前页面。有关分页,请参阅 Pagination

A SearchResultTotal contains the count of all hits that matched the query, whether they are part of the current page or not. For pagination, see Pagination.

默认情况下,命中总数是准确的,但在以下情况下可以用低边界估计值替换:

The total hit count is exact by default, but can be replaced with a lower-bound estimate in the following cases:

  1. The totalHitCountThreshold option is enabled. See totalHitCountThreshold(…​): optimizing total hit count computation.

  2. The truncateAfter option is enabled and a timeout occurs.

示例 169. 使用结果总数

. Example 169. Working with the result total

SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetch( 20 );

SearchResultTotal resultTotal = result.total(); (1)
long totalHitCount = resultTotal.hitCount(); (2)
long totalHitCountLowerBound = resultTotal.hitCountLowerBound(); (3)
boolean hitCountExact = resultTotal.isHitCountExact(); (4)
boolean hitCountLowerBound = resultTotal.isHitCountLowerBound(); (5)
totalHitCountThreshold(…​): optimizing total hit count computation

当处理大型结果集时,准确对命中的数量进行计数会消耗许多资源。

When working with large result sets, counting the number of hits exactly can be very resource-consuming.

按得分(默认)排序并通过 fetch(…​) 检索结果时,可以通过允许 Hibernate Search 返回总命中的低边界估计值,而不是准确的总命中次数,从而显着提高性能。在这种情况下,底层引擎(Lucene 或 Elasticsearch)将能够跳过大量非竞争性的命中,从而减少索引扫描,进而提高性能。

When sorting by score (the default) and retrieving the result through fetch(…​), it is possible to yield significant performance improvements by allowing Hibernate Search to return a lower-bound estimate of the total hit count, instead of the exact total hit count. In that case, the underlying engine (Lucene or Elasticsearch) will be able to skip large chunks of non-competitive hits, leading to fewer index scans and thus better performance.

要启用此性能优化,请在构建查询时调用 totalHitCountThreshold(…​),如下例所示。

To enable this performance optimization, call totalHitCountThreshold(…​) when building the query, as shown in the example below.

此优化在以下情况下不会产生影响:

This optimization has no effect in the following cases:

调用 fetchHits(…​) 时:默认情况下已对其进行优化。

when calling fetchHits(…​): it is already optimized by default.

调用 fetchTotalHitCount() 时:它会始终返回确切的 hit 计数。

when calling fetchTotalHitCount(): it always returns an exact hit count.

使用 Elasticsearch 后端调用 scroll(…​) 时:在滚动时,Elasticsearch 不支持此优化。但是,对于使用 Lucene 后端的 scroll(…​) 调用,已启用优化。

when calling scroll(…​) with the Elasticsearch backend: Elasticsearch does not support this optimization when scrolling. The optimization is enabled for scroll(…​) calls with the Lucene backend, however.

示例 170. 定义命中总数阈值

. Example 170. Defining a total hit count threshold

SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .totalHitCountThreshold( 1000 ) (1)
        .fetch( 20 );

SearchResultTotal resultTotal = result.total(); (2)
long totalHitCountLowerBound = resultTotal.hitCountLowerBound(); (3)
boolean hitCountExact = resultTotal.isHitCountExact(); (4)
boolean hitCountLowerBound = resultTotal.isHitCountLowerBound(); (5)
Pagination

分页的概念是将命中拆分成连续的“页面”,所有页面都包含固定数量的元素(最后一个页面除外)。在网页上显示结果时,用户将能够转到任意页面并查看相应的结果,例如“14265 个中的 151 到 170 个结果”。

Pagination is the concept of splitting hits in successive "pages", all pages containing a fixed number of elements (except potentially the last one). When displaying results on a web page, the user will be able to go to an arbitrary page and see the corresponding results, for example "results 151 to 170 of 14,265".

分页在 Hibernate Search 中通过将偏移量和限制传递给 fetchfetchHits 方法来实现:

Pagination is achieved in Hibernate Search by passing an offset and a limit to the fetch or fetchHits method:

  1. The offset defines the number of documents that should be skipped because they were displayed in previous pages. It is a number of documents, not a number of pages, so you will usually want to compute it from the page number and page size this way: offset = zero-based-page-number * page-size.

  2. The limit defines the maximum number of hits to return, i.e. the page size.

示例 171. 分页检索 SearchResult

. Example 171. Pagination retrieving a SearchResult

SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetch( 40, 20 ); (1)
示例 172. 分页直接检索命中

. Example 172. Pagination retrieving hits directly

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetchHits( 40, 20 ); (1)

在检索两个页面之间索引可能会被修改。由于该修改,一些命中可能会改变位置,并最终出现在两个后续页面中。

The index may be modified between the retrieval of two pages. As a result of that modification, it is possible that some hits change position, and end up being present on two subsequent pages.

如果正在运行批处理并且希望避免这种情况,请使用 Scrolling

If you’re running a batch process and want to avoid this, use Scrolling.

Scrolling

滚动是指在最底层为搜索查询保持一个游标,并逐步推进该游标,以便一“块”一“块”地收集后续的搜索命中。

Scrolling is the concept of keeping a cursor on the search query at the lowest level, and advancing that cursor progressively to collect subsequent "chunks" of search hits.

滚动依赖于游标的内部状态(需要在某个时刻关闭),因此不适合无状态操作,例如在网页中向用户显示结果页面。但是,由于这种内部状态,滚动能够确保所有返回的命中都是一致的:同一个命中绝对不会出现两次。

Scrolling relies on the internal state of the cursor (which must be closed at some point), and thus is not appropriate for stateless operations such as displaying a page of results to a user in a webpage. However, thanks to this internal state, scrolling is able to guarantee that all returned hits are consistent: there is absolutely no way for a given hit to appear twice.

因此,在将大型结果集处理成小块时,滚动最有帮助。

Scrolling is therefore most useful when processing a large result set as small chunks.

下面是使用 Hibernate Search 中的滚动的示例。

Below is an example of using scrolling in Hibernate Search.

对于 Elasticsearch 后端,滚动可能会超时并在一段时间后变得不可用;有关更多信息,请参阅 here

With the Elasticsearch backend, scrolls can time out and become unusable after some time; See here for more information.

示例 173. 滚动以小块检索搜索结果

. Example 173. Scrolling to retrieve search results in small chunks

try ( SearchScroll<Book> scroll = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .scroll( 20 ) ) { (1)
    for ( SearchScrollResult<Book> chunk = scroll.next(); (2)
            chunk.hasHits(); chunk = scroll.next() ) { (3)
        for ( Book hit : chunk.hits() ) { (4)
            // ... do something with the hits ...
        }

        totalHitCount = chunk.total().hitCount(); (5)

        entityManager.flush(); (6)
        entityManager.clear(); (6)
    }
}

15.1.4. Routing

有关分片的初步介绍,包括它在 Hibernate Search 中的工作方式以及它的局限性是什么,请参阅 Sharding and routing

For a preliminary introduction to sharding, including how it works in Hibernate Search and what its limitations are, see Sharding and routing.

如果对于给定的索引,存在一个文档经常用来过滤的不可变值,例如“类别”或“用户 ID”,则可以使用路由键(而不是谓词)来匹配具有该值的文档。

If, for a given index, there is one immutable value that documents are often filtered on, for example a "category" or a "user id", it is possible to match documents with this value using a routing key instead of a predicate.

路由键的主要优点在于,除了过滤文档之外,路由键还可以过滤 shards。如果启用了分片,这意味着在查询执行期间,只会扫描索引的一部分,从而有可能提高搜索性能。

The main advantage of a routing key over a predicate is that, on top of filtering documents, the routing key will also filter shards. If sharding is enabled, this means only part of the index will be scanned during query execution, potentially increasing search performance.

在搜索查询中使用路由的前提条件是以这样的方式映射您的实体,即在索引时 it is assigned a routing key

A pre-requisite to using routing in search queries is to map your entity in such a way that it is assigned a routing key at indexing time.

通过在构建查询时调用 .routing(String) 或 .routing(Collection<String>) 方法来指定路由键:

Specifying routing keys is done by calling the .routing(String) or .routing(Collection<String>) methods when building the query:

示例 174. 将查询路由到所有分片的子集

. Example 174. Routing a query to a subset of all shards

SearchResult<Book> result = searchSession.search( Book.class ) (1)
        .where( f -> f.match()
                .field( "genre" )
                .matching( Genre.SCIENCE_FICTION ) ) (2)
        .routing( Genre.SCIENCE_FICTION.name() ) (3)
        .fetch( 20 ); (4)

15.1.5. Entity loading options for Hibernate ORM

当使用 Hibernate ORM 映射器时,Hibernate Search 执行数据库查询以加载作为搜索查询命中的一部分返回的实体。

When using the Hibernate ORM mapper, Hibernate Search executes database queries to load entities that are returned as part of the hits of a search query.

此部分介绍搜索查询中与实体加载相关的所有可用选项。

This section presents all available options related to entity loading in search queries.

Cache lookup strategy

此功能仅可通过 Hibernate ORM integration 使用。

This feature is only available with the Hibernate ORM integration.

尤其不能与 Standalone POJO Mapper 一起使用。

It cannot be used with the Standalone POJO Mapper in particular.

默认情况下,Hibernate Search 会直接从数据库加载实体,而不查看任何缓存。当缓存(Hibernate ORM 会话或二级缓存)的大小远低于已索引实体的总数时,这是一个好策略。

By default, Hibernate Search will load entities from the database directly, without looking at any cache. This is a good strategy when the size of caches (Hibernate ORM session or second level cache) is much lower than the total number of indexed entities.

如果相当一部分实体存在于二级缓存中,则可以强制 Hibernate Search 从持久化上下文(会话)和/或二级缓存(如果可能)中检索实体。Hibernate Search 仍需要执行数据库查询来检索缓存中缺失的实体,但该查询需要获取的实体可能更少,从而提高性能并减轻数据库的压力。

If a significant portion of your entities are present in the second level cache, you can force Hibernate Search to retrieve entities from the persistence context (the session) and/or the second level cache if possible. Hibernate Search will still need to execute a database query to retrieve entities missing from the cache, but the query will likely have to fetch fewer entities, leading to better performance and lower stress on your database.

这是通过缓存查找策略完成的,可以通过设置配置属性 hibernate.search.query.loading.cache_lookup.strategy

This is done through the cache lookup strategy, which can be configured by setting the configuration property hibernate.search.query.loading.cache_lookup.strategy:

  1. skip (the default) will not perform any cache lookup.

  2. persistence-context will only look into the persistence context, i.e. will check if the entities are already loaded in the session. Useful if most search hits are expected to already be loaded in session, which is generally unlikely.

  3. persistence-context-then-second-level-cache will first look into the persistence context, then into the second level cache, if enabled in Hibernate ORM for the searched entity. Useful if most search hits are expected to be cached, which may be likely if you have a small number of entities and a large cache.
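
下面是一个示意性的配置片段(假设通过纯 JPA 方式引导;持久化单元名称 "my-persistence-unit" 仅为占位示例),展示如何全局设置该缓存查找策略:

Below is an illustrative sketch (assuming a plain JPA bootstrap; the persistence unit name "my-persistence-unit" is just a placeholder) showing how to set the cache lookup strategy globally:

Map<String, Object> properties = new HashMap<>();
properties.put( "hibernate.search.query.loading.cache_lookup.strategy",
        "persistence-context-then-second-level-cache" ); // or "skip" / "persistence-context"
EntityManagerFactory entityManagerFactory =
        Persistence.createEntityManagerFactory( "my-persistence-unit", properties ); // placeholder persistence unit name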

在二级缓存可用于给定实体类型之前,需要在 Hibernate ORM 中进行一些配置。

Before a second-level cache can be used for a given entity type, some configuration is required in Hibernate ORM.

如以下所示,还可以在每个查询的基础上覆盖配置的策略。

It is also possible to override the configured strategy on a per-query basis, as shown below.

示例 175. 在单个搜索查询中覆盖缓存查找策略

. Example 175. Overriding the cache lookup strategy in a single search query

SearchResult<Book> result = searchSession.search( Book.class ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .loading( o -> o.cacheLookupStrategy( (2)
                EntityLoadingCacheLookupStrategy.PERSISTENCE_CONTEXT_THEN_SECOND_LEVEL_CACHE
        ) )
        .fetch( 20 ); (3)
Fetch size

此功能仅可通过 Hibernate ORM integration 使用。

This feature is only available with the Hibernate ORM integration.

尤其不能与 Standalone POJO Mapper 一起使用。

It cannot be used with the Standalone POJO Mapper in particular.

默认情况下,Hibernate Search 使用 100 的获取大小,这意味着对于单个查询上的单次 fetch*() 调用,它会先运行一个查询来加载前 100 个实体;如果还有更多命中,则会再运行第二个查询加载接下来的 100 个,依此类推。

By default, Hibernate Search will use a fetch size of 100, meaning that for a single fetch*() call on a single query, it will run a first query to load the first 100 entities, then if there are more hits it will run a second query to load the next 100, etc.

可以通过设置配置属性 hibernate.search.query.loading.fetch_size 来配置获取大小。此属性需要一个严格为正的 Integer value。

The fetch size can be configured by setting the configuration property hibernate.search.query.loading.fetch_size. This property expects a strictly positive Integer value.

还可以在每个查询的基础上覆盖配置的获取大小,如下所示。

It is also possible to override the configured fetch size on a per-query basis, as shown below.

示例 176. 在单个搜索查询中覆盖抓取大小

. Example 176. Overriding the fetch size in a single search query

SearchResult<Book> result = searchSession.search( Book.class ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .loading( o -> o.fetchSize( 50 ) ) (2)
        .fetch( 200 ); (3)
Entity graph

此功能仅可通过 Hibernate ORM integration 使用。

This feature is only available with the Hibernate ORM integration.

尤其不能与 Standalone POJO Mapper 一起使用。

It cannot be used with the Standalone POJO Mapper in particular.

默认情况下,Hibernate Search 会根据映射的默认值加载关联:标记为延迟的关联将不会加载,而标记为立即的关联将在返回实体之前加载。

By default, Hibernate Search will load associations according to the defaults of your mappings: associations marked as lazy won’t be loaded, while associations marked as eager will be loaded before returning the entities.

可以通过在查询中引用实体图来强制加载延迟关联或防止加载立即关联。请参见下文的示例以及 this section of the Hibernate ORM documentation 了解有关实体图的更多信息。

It is possible to force the loading of a lazy association, or to prevent the loading of an eager association, by referencing an entity graph in the query. See below for an example, and this section of the Hibernate ORM documentation for more information about entity graphs.

示例 177. 将实体图应用于搜索查询

. Example 177. Applying an entity graph to a search query

EntityManager entityManager = /* ... */

EntityGraph<Manager> graph = entityManager.createEntityGraph( Manager.class ); (1)
graph.addAttributeNodes( "associates" );

SearchResult<Manager> result = Search.session( entityManager ).search( Manager.class ) (2)
        .where( f -> f.match()
                .field( "name" )
                .matching( "james" ) )
        .loading( o -> o.graph( graph, GraphSemantic.FETCH ) ) (3)
        .fetch( 20 ); (4)

除了当场构建实体图之外,你还可以使用 @NamedEntityGraph 注释静态定义实体图,并将图的名称传递给 Hibernate Search,如下所示。请参阅 this section of the Hibernate ORM documentation 了解有关 @NamedEntityGraph 的更多信息。

Instead of building the entity graph on the spot, you can also define the entity graph statically using the @NamedEntityGraph annotation, and pass the name of your graph to Hibernate Search, as shown below. See this section of the Hibernate ORM documentation for more information about @NamedEntityGraph.

示例 178. 将命名的实体图应用于搜索查询

. Example 178. Applying a named entity graph to a search query

SearchResult<Manager> result = Search.session( entityManager ).search( Manager.class ) (1)
        .where( f -> f.match()
                .field( "name" )
                .matching( "james" ) )
        .loading( o -> o.graph( "preload-associates", GraphSemantic.FETCH ) ) (2)
        .fetch( 20 ); (3)

15.1.6. Timeout

您可以通过两种方式限制搜索查询执行所需的时间:

You can limit the time it takes for a search query to execute in two ways:

  1. Aborting (throwing an exception) when the time limit is reached with failAfter().

  2. Truncating the results when the time limit is reached with truncateAfter().

当前,两种方法不兼容:尝试同时设置 failAftertruncateAfter 将导致未指定的行为。

Currently, the two approaches are incompatible: trying to set both failAfter and truncateAfter will result in unspecified behavior.

failAfter(): Aborting the query after a given amount of time

在构建查询时通过调用 failAfter(…​),可以为查询执行设置时间限制。达到时间限制后,Hibernate Search 会停止查询执行并引发 SearchTimeoutException

By calling failAfter(…​) when building the query, it is possible to set a time limit for the query execution. Once the time limit is reached, Hibernate Search will stop the query execution and throw a SearchTimeoutException.

超时在尽力执行的基础上进行处理。

Timeouts are handled on a best-effort basis.

根据内部时钟的分辨率和 Hibernate Search 检查时钟的频率,查询执行可能会超过超时时间。Hibernate Search 将尝试最小化此多余执行时间。

Depending on the resolution of the internal clock and on how often Hibernate Search is able to check that clock, it is possible that a query execution exceeds the timeout. Hibernate Search will try to minimize this excess execution time.

示例 179. 在超时时触发故障

. Example 179. Triggering a failure on timeout

            try {
                SearchResult<Book> result = searchSession.search( Book.class ) (1)
                        .where( f -> f.match()
                                .field( "title" )
                                .matching( "robot" ) )
                        .failAfter( 500, TimeUnit.MILLISECONDS ) (2)
                        .fetch( 20 ); (3)
            }
            catch (SearchTimeoutException e) { (4)
                // ...
            }

explain() 不遵守此超时时间:此方法用于调试目的,尤其用于找出查询为何缓慢的原因。

explain() does not honor this timeout: this method is used for debugging purposes and in particular to find out why a query is slow.

truncateAfter(): Truncating the results after a given amount of time

在构建查询时通过调用 truncateAfter(…​),可以为搜索结果的收集设置时间限制。达到时间限制后,Hibernate Search 会停止收集命中并返回不完整的结果。

By calling truncateAfter(…​) when building the query, it is possible to set a time limit for the collection of search results. Once the time limit is reached, Hibernate Search will stop collecting hits and return an incomplete result.

超时在尽力执行的基础上进行处理。

Timeouts are handled on a best-effort basis.

根据内部时钟的分辨率和 Hibernate Search 检查时钟的频率,查询执行可能会超过超时时间。Hibernate Search 将尝试最小化此多余执行时间。

Depending on the resolution of the internal clock and on how often Hibernate Search is able to check that clock, it is possible that a query execution exceeds the timeout. Hibernate Search will try to minimize this excess execution time.

示例 180. 超时时截断结果

. Example 180. Truncating the results on timeout

            SearchResult<Book> result = searchSession.search( Book.class ) (1)
                    .where( f -> f.match()
                            .field( "title" )
                            .matching( "robot" ) )
                    .truncateAfter( 500, TimeUnit.MILLISECONDS ) (2)
                    .fetch( 20 ); (3)

            Duration took = result.took(); (4)
            Boolean timedOut = result.timedOut(); (5)

explain()fetchTotalHitCount() 不遵守此超时时间。前者用于调试目的,尤其用于找出查询为何缓慢的原因。对于后者,返回 partial 结果没有意义。

explain() and fetchTotalHitCount() do not honor this timeout. The former is used for debugging purposes and in particular to find out why a query is slow. For the latter it does not make sense to return a partial result.

15.1.7. Setting query parameters

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

有些查询元素可能会利用查询参数。在查询级别调用 .param(..) 来设置它们:

Some query elements may leverage the use of query parameters. Call .param(..) at the query level to set these:

示例 181. 设置查询参数

. Example 181. Setting query parameters

List<Manager> managers = searchSession.search( Manager.class )
        .where(
                //...
        )
        .param( "param1", "name" )
        .param( "param2", 10 )
        .param( "param3", LocalDate.of( 2002, 02, 20 ) )
        .fetchAllHits();

另请参阅:

See also:

15.1.8. Obtaining a query object

本文档中的大多数示例都在查询定义 DSL 结束时直接获取查询结果,没有显示任何可操作的“查询”对象。这是因为查询对象通常只会让代码更冗长,而不会带来任何有价值的东西。

The examples presented in most of this documentation fetch the query results directly at the end of the query definition DSL, without showing any "query" object that can be manipulated. This is because the query object generally only makes code more verbose without bringing anything worthwhile.

但是,在某些情况下,查询对象可能很有用。要获取查询对象,只需在查询定义的末尾处调用 toQuery()

However, in some cases a query object can be useful. To get a query object, just call toQuery() at the end of the query definition:

示例 182. 获取 SearchQuery 对象

. Example 182. Getting a SearchQuery object

SearchQuery<Book> query = searchSession.search( Book.class ) (1)
        .where( f -> f.matchAll() )
        .toQuery(); (2)
List<Book> hits = query.fetchHits( 20 ); (3)

该查询对象支持 query DSL 所支持的所有 fetch* methods。与直接在查询定义末尾调用这些方法相比,其主要优势与 troubleshooting 相关;不过如果您需要一个适配到其他 API 的适配器,查询对象也会很有用。

This query object supports all fetch* methods supported by the query DSL. The main advantage over calling these methods directly at the end of a query definition is mostly related to troubleshooting, but the query object can also be useful if you need an adapter to another API.

Hibernate Search 提供了一个到 JPA 和 Hibernate ORM 原生 API 的适配器,即一种将 SearchQuery 转换为 jakarta.persistence.TypedQuery(JPA)或 org.hibernate.query.Query(原生 ORM API)的方式:

Hibernate Search provides an adapter to JPA and Hibernate ORM’s native APIs, i.e. a way to turn a SearchQuery into a jakarta.persistence.TypedQuery (JPA) or a org.hibernate.query.Query (native ORM API):

示例 183. 将 SearchQuery 转换为 JPA 或 Hibernate ORM 查询

. Example 183. Turning a SearchQuery into a JPA or Hibernate ORM query

SearchQuery<Book> query = searchSession.search( Book.class ) (1)
        .where( f -> f.matchAll() )
        .toQuery(); (2)
jakarta.persistence.TypedQuery<Book> jpaQuery = Search.toJpaQuery( query ); (3)
org.hibernate.query.Query<Book> ormQuery = Search.toOrmQuery( query ); (4)

转换后得到的查询并不支持所有操作,因此建议仅在绝对必要时才转换搜索查询,例如在与只能使用 Hibernate ORM 查询的代码集成时。

The resulting query does not support all operations, so it is recommended to only convert search queries when absolutely required, for example when integrating with code that only works with Hibernate ORM queries.

以下操作在大多数情况下预期可以正常工作,尽管在某些情况下(包括但不限于抛出的异常类型),它们的行为可能与 JPA TypedQuery 或 Hibernate ORM Query 的预期行为略有不同:

The following operations are expected to work correctly in most cases, even though they may behave slightly differently from what is expected from a JPA TypedQuery or Hibernate ORM Query in some cases (including, but not limited to, the type of thrown exceptions):

直接命中检索方法: list, getResultList, uniqueResult,…​

Direct hit retrieval methods: list, getResultList, uniqueResult, …​

滚动: scroll(), scroll(ScrollMode) (但仅限于 ScrollMode.FORWARDS_ONLY)。

Scrolling: scroll(), scroll(ScrollMode) (but only with ScrollMode.FORWARDS_ONLY).

setFirstResult/setMaxResults 及对应的 getter 方法。

setFirstResult/setMaxResults and getters.

setFetchSize

setFetchSize

unwrap

unwrap

setHint

setHint
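
下面的示意片段沿用示例 183 的转换方式,演示了上面列为预期可正常工作的几个操作:

The sketch below reuses the conversion from Example 183 and exercises a few of the operations listed above as expected to work:

SearchQuery<Book> query = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .toQuery();
jakarta.persistence.TypedQuery<Book> jpaQuery = Search.toJpaQuery( query );
jpaQuery.setFirstResult( 40 ); // supported: skip the first 40 hits
jpaQuery.setMaxResults( 20 ); // supported: return at most 20 hits
List<Book> hits = jpaQuery.getResultList(); // supported: direct hit retrieval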

已知以下操作无法正常工作,目前也没有修复它们的计划:

The following operations are known not to work correctly, with no plan to fix them at the moment:

getHints

getHints

与参数相关的方法:setParameter,…​

Parameter-related methods: setParameter, …​

结果转换器: setResultTransformer, …​ 改用 composite projections

Result transformer: setResultTransformer, …​ Use composite projections instead.

与锁相关的方法:setLockOptions,…​

Lock-related methods: setLockOptions, …​

等等:此列表并非详尽无遗。

And more: this list is not exhaustive.

15.1.9. explain(…​): Explaining scores

要解释某个特定文档的得分,请在查询定义末尾使用 toQuery() 创建一个 SearchQuery 对象,然后使用后端特定的 explain(…​) 方法之一;这些方法的结果将包含一段人类可读的说明,描述该文档的得分是如何计算出来的。

In order to explain the score of a particular document, create a SearchQuery object using toQuery() at the end of the query definition, and then use one of the backend-specific explain(…​) methods; the result of these methods will include a human-readable description of how the score of a specific document was computed.

无论使用哪种 API,解释在性能方面相当昂贵:仅将其用于调试目的。

Regardless of the API used, explanations are rather costly performance-wise: only use them for debugging purposes.

示例 184. 检索得分解释 — Lucene

. Example 184. Retrieving score explanation — Lucene

LuceneSearchQuery<Book> query = searchSession.search( Book.class )
        .extension( LuceneExtension.get() ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .toQuery(); (2)

Explanation explanation1 = query.explain( 1 ); (3)
Explanation explanation2 = query.explain( "Book", 1 ); (4)

LuceneSearchQuery<Book> luceneQuery = query.extension( LuceneExtension.get() ); (5)
示例 185. 检索得分解释 — Elasticsearch

. Example 185. Retrieving score explanation — Elasticsearch

ElasticsearchSearchQuery<Book> query = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .toQuery(); (2)

JsonObject explanation1 = query.explain( 1 ); (3)
JsonObject explanation2 = query.explain( "Book", 1 ); (4)

ElasticsearchSearchQuery<Book> elasticsearchQuery = query.extension( ElasticsearchExtension.get() ); (5)

15.1.10. took and timedOut: finding out how long the query took

示例 186. 返回查询执行时间以及是否发生超时

. Example 186. Returning query execution time and whether a timeout occurred

SearchQuery<Book> query = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .toQuery();

SearchResult<Book> result = query.fetch( 20 ); (1)

Duration took = result.took(); (2)
Boolean timedOut = result.timedOut(); (3)

15.1.11. Elasticsearch: leveraging advanced features with JSON manipulation

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

Elasticsearch 带有许多功能。在某个时间点,您需要的某些功能可能不会由搜索 DSL 公开。

Elasticsearch ships with many features. It is possible that at some point, one feature you need will not be exposed by the Search DSL.

为了规避这些限制,Hibernate Search 提供了一些方法:

To work around such limitations, Hibernate Search provides ways to:

  1. Transform the HTTP request sent to Elasticsearch for search queries.

  2. Read the raw JSON of the HTTP response received from Elasticsearch for search queries.

针对 HTTP 请求的直接更改可能与 Hibernate Search 功能冲突,并且不同版本的 Elasticsearch 可能提供不同的支持。

Direct changes to the HTTP request may conflict with Hibernate Search features and be supported differently by different versions of Elasticsearch.

同样,HTTP 响应的内容可能会根据 Elasticsearch 版本、所用 Hibernate Search 功能以及甚至 Hibernate Search 功能的实现方式的不同而更改。

Similarly, the content of the HTTP response may change depending on the version of Elasticsearch, depending on which Hibernate Search features are used, and even depending on how Hibernate Search features are implemented.

因此,依靠直接访问 HTTP 请求或响应的功能并不能保证在升级 Hibernate Search 时继续工作,即使是小升级也一样(从 x.y.zx.y.(z+1) )。

Thus, features relying on direct access to HTTP requests or responses cannot be guaranteed to continue to work when upgrading Hibernate Search, even for micro upgrades (x.y.z to x.y.(z+1)).

自行承担风险。

Use this at your own risk.

如以下所示,大多数简单用例只需要略微更改 HTTP 请求。

Most simple use cases will only need to change the HTTP request slightly, as shown below.

示例 187. 在搜索查询中手动转换 Elasticsearch 请求

. Example 187. Transforming the Elasticsearch request manually in a search query

List<Book> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .requestTransformer( context -> { (2)
            Map<String, String> parameters = context.parametersMap(); (3)
            parameters.put( "search_type", "dfs_query_then_fetch" );

            JsonObject body = context.body(); (4)
            body.addProperty( "min_score", 0.5f );
        } )
        .fetchHits( 20 ); (5)

对于更复杂的使用案例,可以通过访问 HTTP 响应的原始 JSON,如下所示。

For more complicated use cases, it is possible to access the raw JSON of the HTTP response, as shown below.

示例 188. 在搜索查询中手动访问 Elasticsearch 响应主体

. Example 188. Accessing the Elasticsearch response body manually in a search query

ElasticsearchSearchResult<Book> result = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robt" ) )
        .requestTransformer( context -> { (2)
            JsonObject body = context.body();
            body.add( "suggest", jsonObject( suggest -> { (3)
                suggest.add( "my-suggest", jsonObject( mySuggest -> {
                    mySuggest.addProperty( "text", "robt" );
                    mySuggest.add( "term", jsonObject( term -> {
                        term.addProperty( "field", "title" );
                    } ) );
                } ) );
            } ) );
        } )
        .fetch( 20 ); (4)

JsonObject responseBody = result.responseBody(); (5)
JsonArray mySuggestResults = responseBody.getAsJsonObject( "suggest" ) (6)
        .getAsJsonArray( "my-suggest" );

Gson 构建 JSON 对象的 API 非常繁琐,因此上述示例依赖于一个小型的自定义助手方法以提高代码的可读性:

Gson’s API for building JSON objects is quite verbose, so the example above relies on a small, custom helper method to make the code more readable:

private static JsonObject jsonObject(Consumer<JsonObject> instructions) {
    JsonObject object = new JsonObject();
    instructions.accept( object );
    return object;
}

当需要从每个命中中提取数据时,通常使用 jsonHit projection 比解析整个响应更方便。

When data needs to be extracted from each hit, it is often more convenient to use the jsonHit projection than parsing the whole response.
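
作为示意(jsonHit 投影的详细用法见投影相关章节),可以按如下方式将每个命中作为原始 JSON 获取:

As a sketch (see the projection-related sections for details on the jsonHit projection), each hit can be retrieved as raw JSON like this:

List<JsonObject> rawHits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) // gives access to Elasticsearch-specific projections
        .select( f -> f.jsonHit() ) // each hit as the raw JSON returned by Elasticsearch
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .fetchHits( 20 );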

15.1.12. Lucene: retrieving low-level components

Lucene 查询允许检索一些低级别组件。这仅对集成人员有用,但是为了完整性,在此记录下来。

Lucene queries allow retrieving some low-level components. This should only be useful to integrators, but is documented here for the sake of completeness.

示例 189. 在 Lucene 搜索查询中访问底层组件

. Example 189. Accessing low-level components in a Lucene search query

LuceneSearchQuery<Book> query = searchSession.search( Book.class )
        .extension( LuceneExtension.get() ) (1)
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" ) )
        .sort( f -> f.field( "title_sort" ) )
        .toQuery(); (2)

Sort sort = query.luceneSort(); (3)

LuceneSearchResult<Book> result = query.fetch( 20 ); (4)

TopDocs topDocs = result.topDocs(); (5)

15.2. Predicate DSL

15.2.1. Basics

搜索查询的主要组件是 predicate,也就是说,每个文档必须满足的条件才能包含在搜索结果中。

The main component of a search query is the predicate, i.e. the condition that every document must satisfy in order to be included in search results.

在构建搜索查询时配置谓词:

The predicate is configured when building the search query:

示例 190. 定义搜索查询的谓词

. Example 190. Defining the predicate of a search query

SearchSession searchSession = /* ... */ (1)

List<Book> result = searchSession.search( Book.class ) (2)
        .where( f -> f.match().field( "title" ) (3)
                .matching( "robot" ) )
        .fetchHits( 20 ); (4)

或者,如果您不想使用 lambdas:

Alternatively, if you don’t want to use lambdas:

示例 191. 定义搜索查询的谓词 - 基于对象的语法

. Example 191. Defining the predicate of a search query — object-based syntax

SearchSession searchSession = /* ... */

SearchScope<Book> scope = searchSession.scope( Book.class );

List<Book> result = searchSession.search( scope )
        .where( scope.predicate().match().field( "title" )
                .matching( "robot" )
                .toPredicate() )
        .fetchHits( 20 );

谓词 DSL 提供更多谓词类型以及每种谓词类型的多种选项。要了解有关 match 谓词和所有其他类型谓词的更多信息,请参阅以下各节。

The predicate DSL offers more predicate types, and multiple options for each type of predicate. To learn more about the match predicate, and all the other types of predicate, refer to the following sections.

15.2.2. matchAll: match all documents

matchAll 谓词仅匹配所有文档。

The matchAll predicate simply matches all documents.

示例 192. 匹配所有文档

. Example 192. Matching all documents

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
except(…​): exclude documents matching a given predicate

选择性地,您可以从命中项中排除一些文档:

Optionally, you can exclude a few documents from the hits:

示例 193. 匹配所有文档,但匹配给定谓词的文档除外

. Example 193. Matching all documents except those matching a given predicate

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll()
                .except( f.match().field( "title" )
                        .matching( "robot" ) )
        )
        .fetchHits( 20 );
Other options
  1. The score of a matchAll predicate is constant and equal to 1 by default, but can be boosted with .boost(…​).

15.2.3. matchNone: match no documents

matchNone 谓词是 matchAll 的逆,并且不匹配任何文档。

The matchNone predicate is the inverse of matchAll and matches no documents.

示例 194. 不匹配任何文档

. Example 194. Matching no documents

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchNone() )
        .fetchHits( 20 );

15.2.4. id: match a document identifier

id 谓词通过其标识符匹配文档。

The id predicate matches documents by their identifier.

示例 195. 匹配具有给定标识符的文档

. Example 195. Matching a document with a given identifier

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.id().matching( 1 ) )
        .fetchHits( 20 );

您还可以在一个谓词中匹配多个 id:

You can also match multiple ids in a single predicate:

示例 196. 匹配给定集合中具有标识符的所有文档

. Example 196. Matching all documents with an identifier among a given collection

List<Integer> ids = new ArrayList<>();
ids.add( 1 );
ids.add( 2 );
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.id().matchingAny( ids ) )
        .fetchHits( 20 );
Expected type of arguments

默认情况下,id 谓词预期的 matching(…​)/matchingAny(…​) 方法的参数与对应于文档 id 的实体属性类型相同。

By default, the id predicate expects arguments to the matching(…​)/matchingAny(…​) methods to have the same type as the entity property corresponding to the document id.

例如,如果文档标识符是从类型为 Long 的实体标识符生成的,则文档标识符仍将是 String 类型。无论如何,matching(…​)/matchingAny(…​) 会希望其参数类型为 Long

For example, if the document identifier is generated from an entity identifier of type Long, the document identifier will still be of type String. matching(…​)/matchingAny(…​) will expect its argument to be of type Long regardless.

这通常是您想要的,但如果您需要绕过转换并向 matching(…​)/matchingAny(…​) 传递未转换的参数(如 String 类型),请参阅 Type of arguments passed to the DSL

This should generally be what you want, but if you ever need to bypass conversion and pass an unconverted argument (of type String) to matching(…​)/matchingAny(…​), see Type of arguments passed to the DSL.

Other options
  1. The score of an id predicate is constant and equal to 1 by default, but can be boosted with .boost(…​).

15.2.5. match: match a value

match 谓词匹配指定字段具有给定值的文档。

The match predicate matches documents for which a given field has a given value.

示例 197. 匹配一个值

. Example 197. Matching a value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "title" )
                .matching( "robot" ) )
        .fetchHits( 20 );
Expected type of arguments

默认情况下,match 谓词预期的 matching(…​) 方法的参数与对应于目标字段的实体属性类型相同。

By default, the match predicate expects arguments to the matching(…​) method to have the same type as the entity property corresponding to the target field.

例如,如果实体属性是枚举类型,the corresponding field may be of type String。无论如何,.matching(…​) 都将期望其参数为该枚举类型。

For example, if an entity property is of an enum type, the corresponding field may be of type String. .matching(…​) will expect its argument to have the enum type regardless.

这通常是您想要的,但如果您需要绕过转换并向 .matching(…​) 传递未转换的参数(在上述示例中,类型为 String),请参阅 Type of arguments passed to the DSL

This should generally be what you want, but if you ever need to bypass conversion and pass an unconverted argument (of type String in the example above) to .matching(…​), see Type of arguments passed to the DSL.

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.
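
下面是一个简单示意,假设 "title" 与 "comment" 两个字段都已作为全文字段索引:只要其中任一字段匹配,文档即被命中。

Here is a minimal sketch, assuming both the "title" and "comment" fields are indexed as full-text fields: a document matches as soon as any of the two fields matches.

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .fields( "title", "comment" ) // target multiple fields at once
                .matching( "robot" ) )
        .fetchHits( 20 );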

Analysis

对于多数字段类型(数字、日期等),匹配是精确的。但是,对于 full-text 字段或 normalized keyword 字段,在将传给 matching(…​) 方法的值与索引中的值进行比较之前会对其进行分析或规范化。这意味着匹配在两种方式上更为微妙。

For most field types (number, date, …​), the match is exact. However, for full-text fields or normalized keyword fields, the value passed to the matching(…​) method is analyzed or normalized before being compared to the values in the index. This means the match is more subtle in two ways.

首先,该谓词不只会匹配给定字段具有完全相同值的文档:它将匹配所有该字段的值具有经过标准化后的形式相同的值的文档。请参阅下文以了解示例。

First, the predicate will not just match documents for which a given field has the exact same value: it will match all documents for which this field has a value whose normalized form is identical. See below for an example.

示例 198. 匹配规范化词条

. Example 198. Matching normalized terms

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.match().field( "lastName" )
                .matching( "ASIMOV" ) )(1)
        .fetchHits( 20 );

assertThat( hits ).extracting( Author::getLastName )
        .contains( "Asimov" );(2)

其次,对于 full-text 字段,传给 matching(…​) 方法的值会被标记化。这意味着可能会从输入值中提取多项,并且谓词将匹配给定字段的值在任何地方和任何顺序下 contains any of those terms 的所有文档。有关示例,请参见下方内容。

Second, for full-text fields, the value passed to the matching(…​) method is tokenized. This means multiple terms may be extracted from the input value, and the predicate will match all documents for which the given field has a value that contains any of those terms, at any place and in any order. See below for an example.

示例 199. 匹配多个词条

. Example 199. Matching multiple terms

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "title" )
                .matching( "ROBOT Dawn" ) ) (1)
        .fetchHits( 20 );

assertThat( hits ).extracting( Book::getTitle )
        .contains( "The Robots of Dawn", "I, Robot" ); (2)

匹配多个术语或匹配更相关术语的命中会有更高的 score。因此,如果按分数排序,最相关的命中将显示在结果列表的顶部。这通常可以弥补该谓词并不要求所有术语都出现在匹配文档中这一事实。

Hits matching multiple terms, or matching more relevant terms, will have a higher score. Thus, if you sort by score, the most relevant hits will appear to the top of the result list. This usually makes up for the fact that the predicate does not require all terms to be present in matched documents.

如果您需要 all 词条出现在匹配的文档中,应该能够通过使用 simpleQueryString 谓词来执行此操作,尤其是通过定义 default operator 的能力来做到这一点。只需确保定义希望对用户显示的 syntax features

If you need all terms to be present in matched documents, you should be able to do so by using the simpleQueryString predicate, in particular its ability to define a default operator. Just make sure to define which syntax features you want to expose to your users.
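
下面是一个简单示意,展示如何通过 simpleQueryString 谓词并将默认运算符设为 AND,从而要求所有词条都出现在匹配文档中:

Below is a minimal sketch showing how the simpleQueryString predicate, with its default operator set to AND, can require all terms to be present in matched documents:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "robots dawn" )
                .defaultOperator( BooleanOperator.AND ) ) // require every term to match
        .fetchHits( 20 );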

fuzzy: match a text value approximately

.fuzzy() 选项允许近似匹配,即它允许匹配给定字段的值不完全等于传递给 matching(…​) 的值但很接近的文档,例如一个字母被另一个字母代替。

The .fuzzy() option allows for approximate matches, i.e. it allows matching documents for which a given field has a value that is not exactly the value passed to matching(…​), but a close value, for example with one letter that was switched for another.

. Example 200. Matching a text value approximately

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "robto" )
                .fuzzy() )
        .fetchHits( 20 );

粗略地说,编辑距离是两个术语之间的变化数量:字符切换、删除等。模糊匹配启用时,它默认为 2,但也可以将其设置为 0(禁用模糊匹配)或 1(只允许一个更改,因此“较不模糊”)。不允许高于 2 的值。

Roughly speaking, the edit distance is the number of changes between two terms: switching characters, removing them, …​ It defaults to 2 when fuzzy matching is enabled, but can also be set to 0 (fuzzy matching disabled) or 1 (only one change allowed, so "less fuzzy"). Values higher than 2 are not allowed.

示例 201. 用显式编辑距离近似匹配文本值

. Example 201. Matching a text value approximately with explicit edit distance

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "robto" )
                .fuzzy( 1 ) )
        .fetchHits( 20 );

您还可以强制匹配前 n 个字符。n 称为“精确前缀长度”。出于性能原因,对于包含大量不同术语的索引,建议将其设置为非零值。

Optionally, you can force the match to be exact for the first n characters. n is called the "exact-prefix length". Setting this to a non-zero value is recommended for indexes containing a large amount of distinct terms, for performance reasons.

示例 202. 用精确前缀长度近似匹配文本值

. Example 202. Matching a text value approximately with exact prefix length

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "robto" )
                .fuzzy( 1, 3 ) )
        .fetchHits( 20 );
minimumShouldMatch: fine-tuning how many terms are required to match

可以要求匹配字符串中任意数量的词条必须出现在文档中,match 谓词才会匹配。这就是 minimumShouldMatch* 方法的目的,如下所示。

It is possible to require that an arbitrary number of terms from the match string be present in the document in order for the match predicate to match. This is the purpose of the minimumShouldMatch* methods, as demonstrated below.

示例 203. 利用 minimumShouldMatch 微调匹配要求

. Example 203. Fine-tuning matching requirements with minimumShouldMatch

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "investigation detective automatic" )
                .minimumShouldMatchNumber( 2 ) ) (1)
        .fetchHits( 20 ); (2)
Other options
  1. The score of a match predicate is variable for text fields by default, but can be made constant with .constantScore().

  2. The score of a match predicate can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

  3. The match predicate uses the search analyzer of targeted fields to analyze searched text by default, but this can be overridden.

15.2.6. range: match a range of values

range 谓词匹配给定字段在其给定范围内具有值的文档。

The range predicate matches documents for which a given field has a value within a given range.

示例 204. 匹配给定值范围

. Example 204. Matching a range of values

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .between( 210, 250 ) )
        .fetchHits( 20 );

between 方法包含两个边界,即值恰好等于任一边界的文档也会匹配该 range 谓词。

The between method includes both bounds, i.e. documents whose value is exactly one of the bounds will match the range predicate.

必须提供至少一个边界。如果一个边界是 null,它不会约束匹配。例如 .between( 2, null ) 将匹配所有等于或高于 2 的值。

At least one bound must be provided. If a bound is null, it will not constrain matches. For example .between( 2, null ) will match all values higher than or equal to 2.
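
例如,下面的示意片段只提供下界,上界传入 null 表示不设上限:

For example, the sketch below only provides a lower bound; passing null as the upper bound leaves it unconstrained:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .between( 400, null ) ) // equivalent to .atLeast( 400 )
        .fetchHits( 20 );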

可以调用其他方法来代替 between 以控制下限和上限的包含:

Different methods can be called instead of between in order to control the inclusion of the lower and upper bound:

atLeast

示例 205. 匹配等于或大于给定值的值

. Example 205. Matching values equal to or greater than a given value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .atLeast( 400 ) )
        .fetchHits( 20 );

greaterThan

示例 206. 匹配严格大于给定值的值

. Example 206. Matching values strictly greater than a given value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .greaterThan( 400 ) )
        .fetchHits( 20 );

atMost

示例 207. 匹配等于或小于给定值的值

. Example 207. Matching values equal to or less than a given value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .atMost( 400 ) )
        .fetchHits( 20 );

lessThan

示例 208. 匹配严格小于给定值的值

. Example 208. Matching values strictly less than a given value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .lessThan( 400 ) )
        .fetchHits( 20 );

此外,您可以明确指定边界是否包括在内或排除在外:

Alternatively, you can specify whether bounds are included or excluded explicitly:

示例 209. 匹配具有明确的限定包含/排除的给定值范围

. Example 209. Matching a range of values with explicit bound inclusion/exclusion

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .between(
                        200, RangeBoundInclusion.EXCLUDED,
                        250, RangeBoundInclusion.EXCLUDED
                ) )
        .fetchHits( 20 );

有时可能需要匹配位于某一范围内的值。虽然可以创建一个 or predicate ,并为每个范围添加一个 range predicate ,但有一种更简单的方法来做到这一点:

Sometimes it may be needed to match the value that is in one of the ranges. While it is possible to create an or predicate and add a range predicate for each of the ranges, there is a simpler way to do that:

示例 210. 匹配位于任何提供的范围内值

. Example 210. Matching values that are within any of the provided ranges

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" )
                .withinAny(
                        Range.between( 200, 250 ),
                        Range.between( 500, 800 )
                ) )
        .fetchHits( 20 );
Expected type of arguments

默认情况下,range 谓词期望 between(…​)/atLeast(…​)/等方法中的参数具有与目标字段对应的实体属性相同的类型。

By default, the range predicate expects arguments to the between(…​)/atLeast(…​)/etc. method to have the same type as the entity property corresponding to the target field.

例如,如果实体属性为 java.util.Date 类型,则 the corresponding field may be of type java.time.Instant ; between(…​) / atLeast(…​) / 等将始终期望其参数为 java.util.Date 类型。类似地, range(…​) 将期望类型为 Range<java.util.Date> 的参数。

For example, if an entity property is of type java.util.Date, the corresponding field may be of type java.time.Instant; between(…​)/atLeast(…​)/etc. will expect its arguments to have type java.util.Date regardless. Similarly, range(…​) will expect an argument of type Range<java.util.Date>.

这通常是您想要的,但如果您需要绕过转换并向 between(…​)/atLeast(…​)/等传递未转换的参数(在上述示例中,类型为 java.time.Instant),请参阅 Type of arguments passed to the DSL

This should generally be what you want, but if you ever need to bypass conversion and pass an unconverted argument (of type java.time.Instant in the example above) to between(…​)/atLeast(…​)/etc., see Type of arguments passed to the DSL.

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Other options
  1. The score of a range predicate is constant and equal to 1 by default, but can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .between(…​)/atLeast(…​)/etc.

15.2.7. phrase: match a sequence of words

phrase 谓词匹配文档,其中给定字段包含按给定顺序排列的给定单词序列。

The phrase predicate matches documents for which a given field contains a given sequence of words, in the given order.

. Example 211. Matching a sequence of words

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.phrase().field( "title" )
                .matching( "robots of dawn" ) )
        .fetchHits( 20 );
slop: match a sequence of words approximately

指定 slop 允许近似匹配,即允许匹配给定字段包含给定单词序列但顺序略有不同的文档,或包含额外单词的文档。

Specifying a slop allows for approximate matches, i.e. it allows matching documents for which a given field contains the given sequence of words, but in a slightly different order, or with extra words.

松弛度表示可应用于单词序列以进行匹配的编辑操作数,其中每个编辑操作将一个单词移动一个位置。因此,松弛度为 1quick fox 可以变成 quick <word> fox,其中 <word> 可以是任何单词。松弛度为 2quick fox 可以变成 quick <word> fox,或 quick <word1> <word2> fox 甚至 fox quick(两个操作:将 fox 向左移动,将 quick 向右移动)。对于更高的松弛度和包含更多单词的短语,以此类推。

The slop represents the number of edit operations that can be applied to the sequence of words to match, where each edit operation moves one word by one position. So quick fox with a slop of 1 can become quick <word> fox, where <word> can be any word. quick fox with a slop of 2 can become quick <word> fox, or quick <word1> <word2> fox or even fox quick (two operations: moved fox to the left and quick to the right). And similarly for higher slops and for phrases with more words.

示例 212. 使用 slop(…​) 近似匹配一系列单词

. Example 212. Matching a sequence of words approximately with slop(…​)

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.phrase().field( "title" )
                .matching( "dawn robot" )
                .slop( 3 ) )
        .fetchHits( 20 );
Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.
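
For instance, a minimal sketch of a phrase predicate targeting both the title and description fields (a document matches as soon as either field contains the phrase):

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.phrase().fields( "title", "description" )
                .matching( "robots of dawn" ) )
        .fetchHits( 20 );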

Other options
  1. The score of a phrase predicate is variable by default, but can be made constant with .constantScore().

  2. The score of a phrase predicate can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

  3. The phrase predicate uses the search analyzer of targeted fields to analyze searched text by default, but this can be overridden.

15.2.8. exists: match fields with content

exists 谓词匹配给定字段具有非空值的文档。

The exists predicate matches documents for which a given field has a non-null value.

示例 213. 匹配字段与内容

. Example 213. Matching fields with content

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.exists().field( "comment" ) )
        .fetchHits( 20 );

没有现成谓词可以匹配给定字段为 null 的文档,但你可以通过否定 exists 谓词来轻松创建自己的谓词。

There isn’t any built-in predicate to match documents for which a given field is null, but you can easily create one yourself by negating an exists predicate.

这可以通过将 exists 谓词传递到 not predicate 或者在 except clause in a matchAll predicate 中使用来实现。

This can be achieved by passing in an exists predicate to a not predicate, or by using it in an except clause in a matchAll predicate.
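
For instance, both approaches could look like the following sketch, reusing the comment field from the example above:

// Negating the exists predicate with not:
List<Book> hitsWithoutComment = searchSession.search( Book.class )
        .where( f -> f.not( f.exists().field( "comment" ) ) )
        .fetchHits( 20 );

// Equivalent formulation with an except clause on matchAll:
List<Book> hitsWithoutComment2 = searchSession.search( Book.class )
        .where( f -> f.matchAll()
                .except( f.exists().field( "comment" ) ) )
        .fetchHits( 20 );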

Object fields

exists 谓词也可以应用于对象字段。在这种情况下,它将匹配给定对象字段的至少一个内部字段具有非空值的任何文档。

The exists predicate can also be applied to an object field. In that case, it will match all documents for which at least one inner field of the given object field has a non-null value.

示例 214. 匹配对象字段与内容

. Example 214. Matching object fields with content

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.exists().field( "placeOfBirth" ) )
        .fetchHits( 20 );

对象字段要被认为“存在”,必须至少有一个带有内容的内部字段。

Object fields need to have at least one inner field with content in order to be considered as "existing".

让我们考虑上面的示例,假设 placeOfBirth 对象字段只有一个内部字段: placeOfBirth.country

Let’s consider the example above, and let’s assume the placeOfBirth object field only has one inner field: placeOfBirth.country:

placeOfBirth 为 null 的作者不会匹配。

an author whose placeOfBirth is null will not match.

placeOfBirth 不为 null 且已填写 country 的作者将匹配。

an author whose placeOfBirth is not null and has the country filled in will match.

placeOfBirth 不为 null 但 country 没有填入的作者将不会匹配。

an author whose placeOfBirth is not null but does not have the country filled in will not match.

因此,最好在已知至少有一个内部字段永远不会为 null 的对象字段上使用 exists 谓词:标识符、名称……

Because of this, it is preferable to use the exists predicate on object fields that are known to have at least one inner field that is never null: an identifier, a name, …​

Other options
  1. The score of an exists predicate is constant and equal to 1 by default, but can be boosted with a call to .boost(…​).

15.2.9. wildcard: match a simple pattern

wildcard 谓词匹配给定字段包含与给定模式相匹配的单词的文档。

The wildcard predicate matches documents for which a given field contains a word matching the given pattern.

示例 215. 匹配简单模式

. Example 215. Matching a simple pattern

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.wildcard().field( "description" )
                .matching( "rob*t" ) )
        .fetchHits( 20 );

模式可能包含以下字符:

The pattern may include the following characters:

  1. * matches zero, one or multiple characters.

  2. ? matches zero or one character.

  3. \ escape the following character, e.g. \? is interpreted as a literal ?, \\ as a literal \, etc.

  4. any other character is interpreted as a literal.
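
For instance, a hedged sketch of escaping: to send the pattern rob\?t to the backend, the backslash itself must be escaped in the Java string literal. Whether a token containing a literal ? actually exists in the index depends on the field's analyzer/normalizer.

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.wildcard().field( "description" )
                // The backend receives the pattern rob\?t: the ? is matched literally
                // instead of acting as a wildcard.
                .matching( "rob\\?t" ) )
        .fetchHits( 20 );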

如果字段上已经定义了正规化器,则在通配符谓词中使用的模式会进行正规化。

If a normalizer has been defined on the field, the patterns used in wildcard predicates will be normalized.

如果在字段上已经定义了分析器:

If an analyzer has been defined on the field:

使用 Elasticsearch 后端时,模式不会被分析或正规化,并且会与单个索引标记匹配,而不是一系列标记。在旧版本的基础搜索引擎(例如 Elasticsearch 7.7-7.11 或 OpenSearch 2.5 之前)上,行为可能不同(例如,通配符模式会进行正规化)。因此,请参考特定版本的文档以了解确切的行为。

when using the Elasticsearch backend, the patterns won’t be analyzed nor normalized, and will be expected to match a single indexed token, not a sequence of tokens. This may behave differently on older versions of the underlying search engine (for example, with Elasticsearch 7.7-7.11 or OpenSearch prior to 2.5, the wildcard pattern will get normalized). Hence, please refer to the documentation of your particular version for the exact behaviour.

使用 Lucene 后端时,模式会进行正规化,但不会进行标记化:模式仍将与单个索引标记匹配,而不是一系列标记。

when using the Lucene backend the patterns will be normalized, but not tokenized: the pattern will still be expected to match a single indexed token, not a sequence of tokens.

例如,在对一个字段进行索引时应用小写过滤器的正规化器时,模式 Cat* 可能会匹配 cat

For example, a pattern such as Cat* could match cat when targeting a field having a normalizer that applies a lowercase filter when indexing.

模式 john gr* 在针对以空格进行标记化的字段时不会匹配任何内容。gr* 可能会匹配,因为它不包含任何空格。

A pattern such as john gr* will not match anything when targeting a field that tokenizes on spaces. gr* may match, since it doesn’t include any space.

当目标是匹配用户提供的查询字符串时,应该首选 simple query string predicate

When the goal is to match user-provided query strings, the simple query string predicate should be preferred.

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Other options
  1. The score of a wildcard predicate is constant and equal to 1 by default, but can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

15.2.10. regexp: match a regular expression pattern

regexp 谓词匹配给定字段包含与给定正则表达式相匹配的单词的文档。

The regexp predicate matches documents for which a given field contains a word matching the given regular expression.

示例 216. 匹配正则表达式模式

. Example 216. Matching a regular expression pattern

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.regexp().field( "description" )
                .matching( "r.*t" ) )
        .fetchHits( 20 );
Regexp predicates and analysis

regexp 谓词在被分析/归一化的字段上的行为有点复杂,因此这里总结了它是如何工作的。

The behavior of regexp predicates on analyzed/normalized fields is a bit complex, so here is a summary of how it works.

Regexps must match the entirety of analyzed/normalized tokens

如果某个字段是小写,且按照空格进行标记化(使用分析器),则正则表达式 robots? :

For a field that is lowercased and tokenized on spaces (using an analyzer), the regexp robots?:

将匹配 Robot:索引的令牌 _robot_匹配.

will match Robot: the indexed token robot matches.

将匹配 I love robots:索引的令牌 _robots_匹配.

will match I love robots: the indexed token robots matches.

不会匹配 Mr. Roboto : 索引标记 roboto 不匹配。

will not match Mr. Roboto: the indexed token roboto does not match.

对于使用正规化器(而不是标记器)进行小写处理但未进行标记化的字段,正则表达式 robots?

For a field that is lowercased but not tokenized (using a normalizer), the regexp robots?:

将匹配 Robot:索引的令牌 _robot_匹配.

will match Robot: the indexed token robot matches.

不会匹配 I love robots : 索引标记 i love robots 不匹配。

will not match I love robots: the indexed token i love robots does not match.

不会匹配 Mr. Roboto : 索引标记 mr. roboto 不匹配。

will not match Mr. Roboto: the indexed token mr. roboto does not match.

Regexps are never tokenized, even if fields are

尤其是要小心正则表达式中的空格。

Beware of spaces in regexps, in particular.

对于在空格上进行标记化的字段(使用分析器),正则表达式 .*love .* robots? 永远不会匹配任何内容,因为它要求在令牌内部有空格,而经过索引的令牌不包含任何空格(因为标记化是在空格上进行的)。

For a field that is tokenized on spaces (using an analyzer), the regexp .*love .* robots? will never match anything, because it requires a space inside the token and indexed tokens don’t contain any (since tokenization happens on spaces).

对于使用正规化器(而不是标记器)进行小写处理但未进行标记化的字段,正则表达式 .*love .* robots?:

For a field that is lowercased, but not tokenized (using a normalizer), the regexp .*love .* robots?:

将匹配 I love robots,该令牌已索引为 i love robots.

will match I love robots, which was indexed as i love robots.

将匹配 I love my Robot,该令牌已索引为 i love my robot.

will match I love my Robot, which was indexed as i love my robot.

不会匹配 I love Mr. Roboto (已索引为 i love mr. roboto ): roboto 不匹配 robots?

will not match I love Mr. Roboto, which was indexed as i love mr. roboto: roboto doesn’t match robots?.

With the Lucene backend, regexps are never analyzed nor normalized

如果某个字段是小写,且按照空格进行标记化:

For a field that is lowercased and tokenized on spaces:

正则表达式 Robots? 不会被标准化,且永远不会匹配任何内容,因为它需要一个大写字母,而索引标记不包含任何大写字母(因为它们是小写)。

the regexp Robots? will not be normalized and will never match anything, because it requires an uppercase letter and indexed tokens don’t contain any (since they are lowercased).

正则表达式 [Rr]obots? 不会被标准化,但会匹配 I love Robots :索引标记 robots 匹配。

the regexp [Rr]obots? will not be normalized but will match I love Robots: the indexed token robots matches.

正则表达式 love .* robots? 不会被标准化,且与 I love my RobotI love robots 相匹配,但与 Robots love me 不匹配。

the regexp love .* robots? will not be normalized and will match I love my Robot as well as I love robots, but not Robots love me.

With the Elasticsearch backend, regexps are not analyzed nor normalized on text (tokenized) fields, but are normalized on keyword (non-tokenized) fields

如果某个字段是小写,且按照空格进行标记化(使用分析器):

For a field that is lowercased and tokenized on spaces (using an analyzer):

正则表达式 Robots? 不会被标准化,且永远不会匹配任何内容,因为它需要一个大写字母,而索引标记不包含任何大写字母(因为它们是小写)。

the regexp Robots? will not be normalized and will never match anything, because it requires an uppercase letter and indexed tokens don’t contain any (since they are lowercased).

正则表达式 [Rr]obots? 不会被标准化,但会匹配 I love Robots :索引标记 robots 匹配。

the regexp [Rr]obots? will not be normalized but will match I love Robots: the indexed token robots matches.

正则表达式 love .* robots? 不会被标准化,且与 I love my RobotI love robots 相匹配,但与 Robots love me 不匹配。

the regexp love .* robots? will not be normalized and will match I love my Robot as well as I love robots, but not Robots love me.

但是,与标准化字段中的 Lucene 相比,行为有所不同!对于小写字段(没有使用标准化器进行标记化):

However, behavior differs from Lucene for normalized fields! For a field that is lowercased, but not tokenized (using a normalizer):

正则表达式 Robots? 将被标准化为 robots?,并将匹配 I love robots:索引的令牌 robots 匹配。

the regexp Robots? will be normalized to robots? and will match I love robots: the indexed token robots matches.

正则表达式 [Rr]obots? 将被标准化为 [rr]obots?,并将匹配 I love Robots:索引的令牌 robots 匹配。

the regexp [Rr]obots? will be normalized to [rr]obots? and will match I love Robots: the indexed token robots matches.

正则表达式 love .* robots?I love my RobotI love robots 相匹配,但与 Robots love me 不匹配。

the regexp love .* robots? will match I love my Robot as well as I love robots, but not Robots love me.

由于 Elasticsearch 正则表达式是标准化的,因此标准化器会干扰正则表达式元字符并完全更改正则表达式的含义。

As a result of Elasticsearch normalizing regular expressions, normalizers can interfere with regexp meta-characters and completely change the meaning of a regexp.

例如,对于一个其正规化器将字符 * 和 ? 替换为 _ 的字段,正则表达式 Robots? 会被标准化为 Robots,且可能永远不会匹配任何内容。

For example, for a field whose normalizer replaces the characters * and ? with _, the regexp _Robots? will be normalized to Robots and will probably never match anything.

此行为被认为是一个 bug,并且 was reported to the Elasticsearch project

This behavior is considered a bug and was reported to the Elasticsearch project.

flags: enabling only specific syntax constructs

默认情况下,Hibernate Search 不会启用任何可选操作符。若要启用其中的某些操作符,可以指定 flags 属性。

By default, Hibernate Search does not enable any optional operators. To enable some of them, it is possible to specify the flags attribute.

示例 217. 匹配帶標記的正則表達式模式

. Example 217. Matching a regular expression pattern with flags

hits = searchSession.search( Book.class )
        .where( f -> f.regexp().field( "description" )
                .matching( "r@t" )
                .flags( RegexpQueryFlag.ANY_STRING )
        )
        .fetchHits( 20 );

以下标志/运算符可用:

The following flags/operators are available:

  1. INTERVAL: the <> operator matches a non-negative integer range, both ends included.

例如,a<1-10> 匹配 a1a2、…​ a9a10,但不匹配 a11

For example a<1-10> matches a1, a2, …​ a9, a10, but not a11.

前导零是有意义的,例如 a<01-10> 匹配 a01a02,但不匹配 a1a2

Leading zeroes are meaningful, e.g. a<01-10> matches a01 and a02 but not a1 nor a2.

  1. INTERSECTION: the & operator combines two regexps with an AND operator.

例如,.*a.*&.*z.* 匹配 az、za、babzb 和 bzbab,但不匹配 a 或 z。

For example .*a.*&.*z.* matches az, za, babzb, bzbab, but not a nor z.

  1. ANYSTRING: the @ operator matches any string; equivalent to .*.

此运算符主要用于否定模式,例如,@&~(ab) 匹配除字符串 ab 之外的任何内容。

This operator is mostly useful to negate a pattern, e.g. @&~(ab) matches anything except the string ab.
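
As an illustration of the operators above, here is a minimal sketch enabling only the intersection operator; it assumes, as with ANY_STRING in the earlier example, that the flag names listed above map to RegexpQueryFlag constants.

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.regexp().field( "description" )
                // Matches tokens satisfying both sub-expressions, e.g. "robot".
                .matching( "rob.t&.*bot" )
                .flags( RegexpQueryFlag.INTERSECTION ) )
        .fetchHits( 20 );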

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Other options
  1. The score of a regexp predicate is constant and equal to 1 by default, but can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

15.2.11. terms: match a set of terms

terms 谓词匹配给定字段包含某些项的文档,这些项可以是任何项或所有项。

The terms predicate matches documents for which a given field contains some terms, any or all of them.

使用 matchingAny ,我们要求所提供的术语中至少有一个匹配。从功能上讲,这有点类似于具有每个术语一个 match 谓词的 boolean OR ,但单个 terms 谓词的语法更简洁。

With matchingAny we require that at least one of the provided terms matches. Functionally, this is somewhat similar to a boolean OR with one match predicate per term, but the syntax for a single terms predicate is more concise.

. Example 218. Matching any of the provided terms

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.terms().field( "genre" )
                .matchingAny( Genre.CRIME_FICTION, Genre.SCIENCE_FICTION ) )
        .fetchHits( 20 );

使用 matchingAll ,我们要求所有提供的术语匹配。从功能上讲,这有点类似于具有每个术语一个 match 谓词的 boolean AND ,但单个 terms 谓词的语法更简洁。

With matchingAll we require that all the provided terms match. Functionally, this is somewhat similar to a boolean AND with one match predicate per term, but the syntax for a single terms predicate is more concise.

默认情况下, matchingAll 不接受超过 1024 个术语。

By default, matchingAll will not accept more than 1024 terms.

可以通过特定于后端的配置提高此限制:

It is possible to raise this limit through backend-specific configuration:

对于 Lucene 后端,在启动应用程序时运行此代码: org.apache.lucene.search.BooleanQuery.maxClauseCount = <your limit>;

For the Lucene backend, run this code when starting up your application: org.apache.lucene.search.BooleanQuery.maxClauseCount = <your limit>;

但是,请记住该限制是有原因的:尝试匹配数量非常大的术语将表现不佳,并可能导致崩溃。

However, keep in mind the limit is there for a reason: attempts to match very large numbers of terms will perform poorly and could lead to crashes.

示例 219. 匹配提供的所有术语

. Example 219. Matching all the provided terms

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.terms().field( "genre" )
                .matchingAll( Genre.CRIME_FICTION, Genre.SCIENCE_FICTION ) )
        .fetchHits( 20 );
terms predicates and analysis

与其他谓词不同,传递给 matchingAny()matchingAll() 的术语永远不会被分析,通常也不会被规范化。

Unlike with other predicates, the terms passed to matchingAny() or matchingAll() are never analyzed, and usually not normalized either.

如果在字段中定义了分析器,则术语将不会被分析或规范化。

If an analyzer has been defined on the field, the terms will not be analyzed nor normalized.

如果在字段中定义了规范器:

If a normalizer has been defined on the field:

当使用 Elasticsearch 后端时,术语将被规范化。

when using the Elasticsearch backend, the terms will be normalized.

当使用 Lucene 后端时,术语将不会被规范化。

when using the Lucene backend, the terms will not be normalized.

例如,当针对具有在索引时应用小写过滤器功能的规范器的字段时,术语 Cat 可以匹配 cat ,但仅当使用 Elasticsearch 后端时。当使用 Lucene 后端时,只有 cat 才能匹配 cat

For example, the term Cat could match cat when targeting a field having a normalizer that applies a lowercase filter when indexing, but only when using the Elasticsearch backend. When using the Lucene backend, only the term cat could match cat.

Expected type of arguments

默认情况下,terms 谓词期望 matchingAny(…​) 或 matchingAll(…​) 方法的参数具有与目标字段对应的实体属性相同的类型。

By default, the terms predicate expects arguments to the matchingAny(…​) or matchingAll(…​) methods to have the same type as the entity property corresponding to the target field.

例如,如果实体属性是枚举类型,则对应的字段可能为 String 类型;.matchingAny(…​) 将始终期望其参数为枚举类型。

For example, if an entity property is of an enum type, the corresponding field may be of type String. .matchingAny(…​) will expect its argument to have the enum type regardless.

这通常应该是你所需要的,但是如果你需要绕过转换并将未转换的参数(在上面的示例中,其类型为 String)传递给 .matchingAny(…​).matchingAll(…​),请参阅 Type of arguments passed to the DSL

This should generally be what you want, but if you ever need to bypass conversion and pass an unconverted argument (of type String in the example above) to .matchingAny(…​) or .matchingAll(…​), see Type of arguments passed to the DSL.

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Other options
  1. The score of a terms predicate is constant and equal to 1 by default, but can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matchingAny(…​) or .matchingAll(…​).

15.2.12. and: match all clauses

and 谓词匹配与其所有内部谓词(称为“子句”)匹配的文档。

The and predicate matches documents that match all of its inner predicates, called "clauses".

在进行 score 计算期间会考虑匹配“和”子句。

Matching "and" clauses are taken into account during score computation.

示例 220.匹配符合所有给定谓词的文档 (~ AND 运算符)

. Example 220. Matching a document that matches all the multiple given predicates (~AND operator)

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.and(
                f.match().field( "title" )
                        .matching( "robot" ), (1)
                f.match().field( "description" )
                        .matching( "crime" ) (2)
        ) )
        .fetchHits( 20 ); (3)
Adding clauses dynamically with the lambda syntax

可以将 and 谓词定义在 lambda 表达式中。这在一些情况下非常有用,例如,基于用户输入,需要动态地将子句添加到 and 谓词。

It is possible to define the and predicate inside a lambda expression. This is especially useful when clauses need to be added dynamically to the and predicate, for example based on user input.

示例 221.使用 .where(…​) 和 lambda 语法动态添加从句

. Example 221. Easily adding clauses dynamically using .where(…​) and the lambda syntax

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( (f, root) -> { (2)
            root.add( f.matchAll() ); (3)
            if ( searchParameters.getGenreFilter() != null ) { (4)
                root.add( f.match().field( "genre" )
                        .matching( searchParameters.getGenreFilter() ) );
            }
            if ( searchParameters.getFullTextFilter() != null ) {
                root.add( f.match().fields( "title", "description" )
                        .matching( searchParameters.getFullTextFilter() ) );
            }
            if ( searchParameters.getPageCountMaxFilter() != null ) {
                root.add( f.range().field( "pageCount" )
                        .atMost( searchParameters.getPageCountMaxFilter() ) );
            }
        } )
        .fetchHits( 20 );

如果 and 谓词不是根谓词,另一种依赖于 with(…​) 方法的语法可能会派上用场:

Another syntax relying on the method with(…​) can be useful when the and predicate is not the root predicate:

示例 222.使用 with(…​) 和 lambda 语法动态添加从句

. Example 222. Easily adding clauses dynamically using with(…​) and the lambda syntax

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.and().with( and -> { (2)
            and.add( f.matchAll() ); (3)
            if ( searchParameters.getGenreFilter() != null ) { (4)
                and.add( f.match().field( "genre" )
                        .matching( searchParameters.getGenreFilter() ) );
            }
            if ( searchParameters.getFullTextFilter() != null ) {
                and.add( f.match().fields( "title", "description" )
                        .matching( searchParameters.getFullTextFilter() ) );
            }
            if ( searchParameters.getPageCountMaxFilter() != null ) {
                and.add( f.range().field( "pageCount" )
                        .atMost( searchParameters.getPageCountMaxFilter() ) );
            }
        } ) )
        .fetchHits( 20 );
Options
  1. The score of an and predicate is variable by default, but can be made constant with .constantScore().

  2. The score of an and predicate can be boosted with a call to .boost(…​).

15.2.13. or: match any clause

or 谓词匹配其任何内部谓词(称为“从句”)的文档。

The or predicate matches documents that match any of its inner predicates, called "clauses".

在进行 score 计算期间会考虑匹配 or 子句。

Matching or clauses are taken into account during score computation.

示例 223.匹配符合给定多个谓词中的任何一个的文档 (~ OR 运算符)

. Example 223. Matching a document that matches any of multiple given predicates (~OR operator)

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.or(
                f.match().field( "title" )
                        .matching( "robot" ), (1)
                f.match().field( "description" )
                        .matching( "investigation" ) (2)
        ) )
        .fetchHits( 20 ); (3)
Adding clauses dynamically with the lambda syntax

可以将 or 谓词定义在 lambda 表达式中。这在一些情况下非常有用,例如,基于用户输入,需要动态地将子句添加到 or 谓词。

It is possible to define the or predicate inside a lambda expression. This is especially useful when clauses need to be added dynamically to the or predicate, for example based on user input.

示例 224.使用 with(…​) 和 lambda 语法动态添加从句

. Example 224. Easily adding clauses dynamically using with(…​) and the lambda syntax

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.or().with( or -> { (2)
            if ( !searchParameters.getAuthorFilters().isEmpty() ) {
                for ( String authorFilter : searchParameters.getAuthorFilters() ) { (3)
                    or.add( f.match().fields( "authors.firstName", "authors.lastName" )
                            .matching( authorFilter ) );
                }
            }
        } ) )
        .fetchHits( 20 );
Options
  1. The score of an or predicate is variable by default, but can be made constant with .constantScore().

  2. The score of an or predicate can be boosted with a call to .boost(…​).

15.2.14. not: negating another predicate

not 谓词匹配未被给定谓词匹配的文档。

The not predicate matches documents that are not matched by a given predicate.

示例 225.否定 match 谓词

. Example 225. Negating a match predicate

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.not(
                f.match()
                        .field( "genre" )
                        .matching( Genre.SCIENCE_FICTION )
        ) )
        .fetchHits( 20 );
Other options
  1. The score of a not predicate is constant and equal to 0 by default, but if boosted with .boost(…​) the default would be changed to 1 and corresponding boost will be applied.

15.2.15. bool: advanced combinations of predicates (or/and/…​)

bool 谓词允许以比更简单的 and / or 谓词更复杂的方式组合内部谓词。

The bool predicate allows combining inner predicates in a more complex fashion than with simpler and/or predicates.

bool 谓词匹配与一个或多个内部谓词(称为“子句”)相匹配的文档。它尤其可用于构建带有附加设置的 AND/OR 运算符。

The bool predicate matches documents that match one or more inner predicates, called "clauses". It can be used in particular to build AND/OR operators with additional settings.

内部谓词被添加为以下类型的子句:

Inner predicates are added as clauses of one of the following types:

must

要求 must 从句进行匹配:如果不匹配,则 bool 谓词将不匹配。

must clauses are required to match: if they don’t match, then the bool predicate will not match.

在进行 score 计算期间会考虑匹配“必须”子句。

Matching "must" clauses are taken into account during score computation.

mustNot

要求 mustNot 从句不匹配:如果匹配,则 bool 谓词将不匹配。

mustNot clauses are required to not match: if they match, then the bool predicate will not match.

在进行 score 计算期间会忽略“必须不”子句。

"must not" clauses are ignored during score computation.

filter

要求 filter 从句进行匹配:如果不匹配,则布尔谓词将不匹配。

filter clauses are required to match: if they don’t match, then the boolean predicate will not match.

在进行 score 计算期间会忽略 filter 子句,过滤器子句中包含的布尔谓词的任何子句(甚至 mustshould 子句)也是如此。

filter clauses are ignored during score computation, and so are any clauses of boolean predicates contained in the filter clause (even must or should clauses).

should

should 从句可以进行可选匹配,并且需要根据上下文进行匹配。

should clauses may optionally match, and are required to match depending on the context.

在进行 score 计算期间会考虑匹配 should 子句。

Matching should clauses are taken into account during score computation.

should 子句的确切行为如下:

The exact behavior of should clauses is as follows:

如果 bool 谓词中没有任何 must 子句或没有任何 filter 子句,则至少需要匹配一个“should”子句。简而言之,在这种情况下,“should”子句的行为就像每个子句之间都有 OR 运算符一样。

When there isn’t any must clause nor any filter clause in the bool predicate then at least one "should" clause is required to match. Simply put, in this case, the "should" clauses behave as if there was an OR operator between each of them.

如果 bool 谓词中至少有一个 must 子句或一个 filter 子句,则不需要匹配“should”子句,并且它们仅用于评分。

When there is at least one must clause or one filter clause in the bool predicate, then the "should" clauses are not required to match, and are simply used for scoring.

可以通过指定 minimumShouldMatch constraints 更改此行为。

This behavior can be changed by specifying minimumShouldMatch constraints.

Emulating an OR operator

仅包含 should 从句而没有 minimumShouldMatch specificationbool 谓词的行为与 OR 运算符相同。在这种情况下,建议使用更简单的 or 语法。

A bool predicate with only should clauses and no minimumShouldMatch specification will behave as an OR operator. In such case, using the simpler or syntax is recommended.
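
For reference, a should-only bool predicate equivalent to the earlier or example would look like this sketch:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .should( f.match().field( "title" )
                        .matching( "robot" ) )
                .should( f.match().field( "description" )
                        .matching( "investigation" ) ) )
        .fetchHits( 20 );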

Emulating an AND operator

仅包含 must 从句的 bool 谓词的行为与 AND 运算符相同。在这种情况下,建议使用更简单的 and 语法。

A bool predicate with only must clauses will behave as an AND operator. In such case, using the simpler and syntax is recommended.

mustNot: excluding documents that match a given predicate

示例 226.匹配不符合给定谓词的文档

. Example 226. Matching a document that does not match a given predicate

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .must( f.match().field( "title" )
                        .matching( "robot" ) ) (1)
                .mustNot( f.match().field( "description" )
                        .matching( "investigation" ) ) (2)
        )
        .fetchHits( 20 ); (3)
filter: matching documents that match a given predicate without affecting the score

filter 子句本质上是 must 子句,仅有一点区别:在计算某个文档的总 score 时,它们被忽略。

filter clauses are essentially must clauses with only one difference: they are ignored when computing the total score of a document.

示例 227. 匹配而不影响得分的一个命中给定谓词的文档

. Example 227. Matching a document that matches a given predicate without affecting the score

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool() (1)
                .should( f.bool() (2)
                        .filter( f.match().field( "genre" )
                                .matching( Genre.SCIENCE_FICTION ) ) (3)
                        .must( f.match().fields( "description" )
                                .matching( "crime" ) ) (4)
                )
                .should( f.bool() (5)
                        .filter( f.match().field( "genre" )
                                .matching( Genre.CRIME_FICTION ) ) (6)
                        .must( f.match().fields( "description" )
                                .matching( "robot" ) ) (7)
                )
        )
        .fetchHits( 20 ); (8)
should as a way to tune scoring

除了成为 used alone to emulate an OR operator 之外, should 子句还可以与 must 子句结合使用。这样做时, should 子句完全变成可选的,它们的唯一目的是增加命中这些子句的文档的得分。

Apart from being used alone to emulate an OR operator, should clauses can also be used in conjunction with must clauses. When doing so, the should clauses become completely optional, and their only purpose is to increase the score of documents that match these clauses.

示例 228. 使用可选 should 子句来提高某些文档的得分

. Example 228. Using optional should clauses to boost the score of some documents

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .must( f.match().field( "title" )
                        .matching( "robot" ) ) (1)
                .should( f.match().field( "description" )
                        .matching( "crime" ) ) (2)
                .should( f.match().field( "description" )
                        .matching( "investigation" ) ) (3)
        )
        .fetchHits( 20 ); (4)
minimumShouldMatch: fine-tuning how many should clauses are required to match

可以要求任意数量的 should 子句匹配才能使 bool 谓词匹配。这是 minimumShouldMatch* 方法的用途,如下所示。

It is possible to require that an arbitrary number of should clauses match in order for the bool predicate to match. This is the purpose of the minimumShouldMatch* methods, as demonstrated below.

示例 229. 通过 minimumShouldMatch 来微调匹配要求的 should 子句

. Example 229. Fine-tuning should clauses matching requirements with minimumShouldMatch

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .minimumShouldMatchNumber( 2 ) (1)
                .should( f.match().field( "description" )
                        .matching( "robot" ) ) (2)
                .should( f.match().field( "description" )
                        .matching( "investigation" ) ) (3)
                .should( f.match().field( "description" )
                        .matching( "disappearance" ) ) (4)
        )
        .fetchHits( 20 ); (5)
Adding clauses dynamically with the lambda syntax

可以在 Lambda 表达式内定义 bool 谓词。这在需要根据用户输入等因素动态地将子句添加到 bool 谓词中时尤为有用。

It is possible to define the bool predicate inside a lambda expression. This is especially useful when clauses need to be added dynamically to the bool predicate, for example based on user input.

. Example 230. Easily adding clauses dynamically using with(…​) and the lambda syntax

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( (f, root) -> { (2)
            root.add( f.matchAll() );
            if ( searchParameters.getGenreFilter() != null ) {
                root.add( f.match().field( "genre" )
                        .matching( searchParameters.getGenreFilter() ) );
            }
            if ( !searchParameters.getAuthorFilters().isEmpty() ) {
                root.add( f.bool().with( b -> { (3)
                    for ( String authorFilter : searchParameters.getAuthorFilters() ) { (4)
                        b.should( f.match().fields( "authors.firstName", "authors.lastName" )
                                .matching( authorFilter ) );
                    }
                } ) );
            }
        } )
        .fetchHits( 20 );
Deprecated variants

本节中详细介绍的功能是 deprecated :为了使用非弃用的替代方法,应避免使用它们。

Features detailed in this section are deprecated: they should be avoided in favor of non-deprecated alternatives.

通常 compatibility policy 适用,这表示预期的功能至少对到 Hibernate Search 的下一个主要版本仍然可用。除此之外,它们可能会以向后不兼容的方式进行更改,甚至会被移除。

The usual compatibility policy applies, meaning the features are expected to remain available at least until the next major version of Hibernate Search. Beyond that, they may be altered in a backward-incompatible way — or even removed.

不建议使用已弃用的功能。

Usage of deprecated features is not recommended.

可以使用另一种语法来 create a boolean predicate from a lambda expression,但它已被弃用。

Another syntax can be used to create a boolean predicate from a lambda expression, but it is deprecated.

示例 231. .bool 的弃用变体

. Example 231. Deprecated variant of .bool

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool( b -> { (2)
            b.must( f.matchAll() ); (3)
            if ( searchParameters.getGenreFilter() != null ) { (4)
                b.must( f.match().field( "genre" )
                        .matching( searchParameters.getGenreFilter() ) );
            }
            if ( searchParameters.getFullTextFilter() != null ) {
                b.must( f.match().fields( "title", "description" )
                        .matching( searchParameters.getFullTextFilter() ) );
            }
            if ( searchParameters.getPageCountMaxFilter() != null ) {
                b.must( f.range().field( "pageCount" )
                        .atMost( searchParameters.getPageCountMaxFilter() ) );
            }
        } ) )
        .fetchHits( 20 );
Other options
  1. The score of a bool predicate is variable by default, but can be made constant with .constantScore().

  2. The score of a bool predicate can be boosted with a call to .boost(…​).

15.2.16. simpleQueryString: match a user-provided query string

simpleQueryString 谓词根据给定的字符串形式的结构化查询匹配文档。

The simpleQueryString predicate matches documents according to a structured query given as a string.

它的语法非常简单,特别是在最终用户期望能够提交具有布尔运算符、引号等少数语法元素的文本查询时。

Its syntax is quite simple, so it’s especially helpful when end users expect to be able to submit text queries with a few syntax elements such as boolean operators, quotes, etc.

Boolean operators

该语法包含三个布尔运算符:

The syntax includes three boolean operators:

  1. AND using +

  2. OR using |

  3. NOT using -

示例 232. 匹配一个简单的查询字符串:AND/OR 运算符

. Example 232. Matching a simple query string: AND/OR operators

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "description" )
                .matching( "robots + (crime | investigation | disappearance)" ) )
        .fetchHits( 20 );
示例 233. 匹配一个简单的查询字符串:NOT 运算符

. Example 233. Matching a simple query string: NOT operator

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "description" )
                .matching( "robots + -investigation" ) )
        .fetchHits( 20 );
Default boolean operator

默认情况下,如果未明确定义运算符,则查询使用 OR 运算符。如果您更喜欢将 AND 运算符用作默认运算符,则可以调用 .defaultOperator(…​)

By default, the query uses the OR operator if the operator is not explicitly defined. If you prefer using the AND operator as default, you can call .defaultOperator(…​).

示例 234. 匹配一个简单的查询字符串:AND 作为默认运算符

. Example 234. Matching a simple query string: AND as default operator

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "description" )
                .matching( "robots investigation" )
                .defaultOperator( BooleanOperator.AND ) )
        .fetchHits( 20 );
Prefix

该语法包含通过 * 通配符支持前缀谓词。

The syntax includes support for prefix predicates through the * wildcard.

示例 235. 匹配一个简单的查询字符串:前缀

. Example 235. Matching a simple query string: prefix

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "description" )
                .matching( "rob*" ) )
        .fetchHits( 20 );
Fuzzy

语法包括对模糊运算符 ~ 的支持。它的行为类似 fuzzy matching in the match predicate

The syntax includes support for the fuzzy operator ~. Its behavior is similar to that of fuzzy matching in the match predicate.

示例 236. 匹配一个简单的查询字符串:模糊

. Example 236. Matching a simple query string: fuzzy

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "description" )
                .matching( "robto~2" ) )
        .fetchHits( 20 );
Phrase

语法包括对 phrase predicates 的支持,它使用引号将术语序列括起来以匹配。

The syntax includes support for phrase predicates using quotes around the sequence of terms to match.

示例 237. 匹配一个简单的查询字符串:短语

. Example 237. Matching a simple query string: phrase

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "\"robots of dawn\"" ) )
        .fetchHits( 20 );

可以使用 NEAR 运算符 ~ 将 slop 分配给短语谓词。

A slop can be assigned to a phrase predicate using the NEAR operator ~.

示例 238. 匹配一个简单的查询字符串:包含间隙的短语

. Example 238. Matching a simple query string: phrase with slop

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "\"dawn robot\"~3" ) )
        .fetchHits( 20 );
flags: enabling only specific syntax constructs

默认情况下,所有语法功能均已启用。您可以通过 .flags(…​) 方法显式选择要启用的运算符。

By default, all syntax features are enabled. You can pick the operators to enable explicitly through the .flags(…​) method.

示例 239. 匹配简单查询字符串:仅启用特定语法结构

. Example 239. Matching a simple query string: enabling only specific syntax constructs

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "I want a **robot**" )
                .flags( SimpleQueryFlag.AND, SimpleQueryFlag.OR, SimpleQueryFlag.NOT ) )
        .fetchHits( 20 );

如果您愿意,可以禁用所有语法结构:

If you wish, you can disable all syntax constructs:

示例 240. 匹配简单查询字符串:禁用所有语法结构

. Example 240. Matching a simple query string: disabling all syntax constructs

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "**robot**" )
                .flags( Collections.emptySet() ) )
        .fetchHits( 20 );
minimumShouldMatch: fine-tuning how many should clauses are required to match

从查询字符串解析出的结果查询可能会导致具有 should 子句的布尔查询。控制将多少 should 子句匹配为匹配的文档可能会有所帮助。

The resulting query parsed from the query string may result in a boolean query with should clauses. It may be helpful to control how many should clauses must match to consider a document as a match.

示例 241. 使用 minimumShouldMatch 微调与 should 子句匹配的要求

. Example 241. Fine-tuning should clauses matching requirements with minimumShouldMatch

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.simpleQueryString().field( "title" )
                .matching( "crime robot investigate automatic detective" )
                .minimumShouldMatchNumber( 2 ) )
        .fetchHits( 20 );

这类似于 booleanquery string 谓词 minimumShouldMatch 选项。

This is similar to boolean or query string predicates minimumShouldMatch options.

Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Field types and expected format of field values

此谓词适用于大多数 supported field types,但 GeoPointvector field 类型除外。

This predicate is applicable to most of the supported field types except GeoPoint and vector field types.

在查询字符串中使用的字符串字面量的格式是特定于后端的。使用 Lucene 后端时,这些字面量的格式应与 Property types with built-in value bridges 中定义的解析逻辑兼容;对于带有自定义桥接器的字段,则必须定义相应的解析逻辑。至于 Elasticsearch 后端,请参阅 Field types supported by the Elasticsearch backend。

The format of string literals used in the query string is backend-specific. With the Lucene backend, the format of these literals should be compatible with the parsing logic defined in Property types with built-in value bridges, and for fields with custom bridges it must be defined . As for the Elasticsearch backend, see the Field types supported by the Elasticsearch backend.

请记住,并非所有查询结构都可用于非字符串字段,例如,添加 fuzzinessslopwildcards 无效。

Keep in mind that not all query constructs can be applied to non-string fields, e.g. adding fuzziness, slop or wildcards will not work.

Other options
  1. The score of a simpleQueryString predicate is variable by default, but can be made constant with .constantScore().

  2. The score of a simpleQueryString predicate can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

  3. The simpleQueryString predicate uses the search analyzer of targeted fields to analyze searched text by default, but this can be overridden.

15.2.17. nested: match nested documents

nested 谓词可用于对象字段 indexed as nested documents,要求两个或更多内部谓词匹配 the same object。你可以这样确保 authors.firstname:isaac AND authors.lastname:asimov 不匹配作者为“Jane Asimov”和“Isaac Deutscher”的书。

The nested predicate can be used on object fields indexed as nested documents to require two or more inner predicates to match the same object. This is how you ensure that authors.firstname:isaac AND authors.lastname:asimov will not match a book whose authors are "Jane Asimov" and "Isaac Deutscher".

示例 242. 将多谓词与单个嵌套对象进行匹配

. Example 242. Matching multiple predicates against a single nested object

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.nested( "authors" ) (1)
                .add( f.match().field( "authors.firstName" )
                        .matching( "isaac" ) ) (2)
                .add( f.match().field( "authors.lastName" )
                        .matching( "asimov" ) ) ) (3)
        .fetchHits( 20 ); (4)
Implicit nesting

Hibernate Search 会在必要时自动将嵌套谓词包装在其他谓词中。然而,这是为每个单独谓词完成的,因此隐式嵌套不会产生与显式嵌套分组多个内部谓词相同行为。有关示例,请参见下面。

Hibernate Search automatically wraps a nested predicate around other predicates when necessary. However, this is done for each single predicate, so implicit nesting will not give the same behavior as explicit nesting grouping multiple inner predicates. See below for an example.

示例 243. 使用隐式嵌套

. Example 243. Using implicit nesting

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.and()
                .add( f.match().field( "authors.firstName" ) (1)
                        .matching( "isaac" ) ) (2)
                .add( f.match().field( "authors.lastName" )
                        .matching( "asimov" ) ) ) (3)
        .fetchHits( 20 ); (4)
Deprecated variants

本节中详细介绍的功能是 deprecated :为了使用非弃用的替代方法,应避免使用它们。

Features detailed in this section are deprecated: they should be avoided in favor of non-deprecated alternatives.

通常 compatibility policy 适用,这表示预期的功能至少对到 Hibernate Search 的下一个主要版本仍然可用。除此之外,它们可能会以向后不兼容的方式进行更改,甚至会被移除。

The usual compatibility policy applies, meaning the features are expected to remain available at least until the next major version of Hibernate Search. Beyond that, they may be altered in a backward-incompatible way — or even removed.

不建议使用已弃用的功能。

Usage of deprecated features is not recommended.

可以使用另一种语法创建嵌套谓词,但该语法更繁琐且已弃用。

Another syntax can be used to create a nested predicate, but it is more verbose and deprecated.

示例 244. 弃用的 .nested 变体

. Example 244. Deprecated variant of .nested

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.nested().objectField( "authors" ) (1)
                .nest( f.and()
                        .add( f.match().field( "authors.firstName" )
                                .matching( "isaac" ) ) (2)
                        .add( f.match().field( "authors.lastName" )
                                .matching( "asimov" ) ) ) ) (3)
        .fetchHits( 20 ); (4)

15.2.18. within: match points within a circle, box, polygon

within 谓词用于匹配给定字段是包含在给定圆、边界框或多边形内的地理位置的文档。

The within predicate matches documents for which a given field is a geo-point contained within a given circle, bounding-box or polygon.

Matching points within a circle (within a distance to a point)

.circle(…​) 中,匹配的点必须在某个距离内,该距离由某个点(中心)指定。

With .circle(…​), the matched points must be within a given distance from a given point (center).

示例 245. 匹配圆圈内的点

. Example 245. Matching points within a circle

GeoPoint center = GeoPoint.of( 53.970000, 32.150000 );
List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.spatial().within().field( "placeOfBirth.coordinates" )
                .circle( center, 50, DistanceUnit.KILOMETERS ) )
        .fetchHits( 20 );

还可以将中心坐标作为两个双精度值(纬度,然后是经度)传递。

You can also pass the coordinates of the center as two doubles (latitude, then longitude).

示例 246. 匹配圆圈内的点:传递中心坐标作为双精度值

. Example 246. Matching points within a circle: passing center coordinates as doubles

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.spatial().within().field( "placeOfBirth.coordinates" )
                .circle( 53.970000, 32.150000, 50, DistanceUnit.KILOMETERS ) )
        .fetchHits( 20 );
Matching points within a bounding box

借助 .boundingBox(…​),匹配的点必须位于由左上角和右下角定义的给定边界框内。

With .boundingBox(…​), the matched points must be within a given bounding box defined by its top-left and bottom-right corners.

示例 247. 匹配框内的点

. Example 247. Matching points within a box

GeoBoundingBox box = GeoBoundingBox.of(
        53.99, 32.13,
        53.95, 32.17
);
List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.spatial().within().field( "placeOfBirth.coordinates" )
                .boundingBox( box ) )
        .fetchHits( 20 );

还可以将左上角和右下角的坐标作为四个双精度值进行传递:左上角纬度、左上角经度、右下角纬度、右下角经度。

You can also pass the coordinates of the top-left and bottom-right corners as four doubles: top-left latitude, top-left longitude, bottom-right latitude, bottom-right longitude.

示例 248. 匹配框内的点:传递角坐标作为双精度值

. Example 248. Matching points within a box: passing corner coordinates as doubles

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.spatial().within().field( "placeOfBirth.coordinates" )
                .boundingBox( 53.99, 32.13,
                        53.95, 32.17 ) )
        .fetchHits( 20 );
Matching points within a polygon

借助 .polygon(…​),匹配的点必须位于给定多边形内。

With .polygon(…​), the matched points must be within a given polygon.

示例 249. 匹配多边形内的点

. Example 249. Matching points within a polygon

GeoPolygon polygon = GeoPolygon.of(
        GeoPoint.of( 53.976177, 32.138627 ),
        GeoPoint.of( 53.986177, 32.148627 ),
        GeoPoint.of( 53.979177, 32.168627 ),
        GeoPoint.of( 53.876177, 32.159627 ),
        GeoPoint.of( 53.956177, 32.155627 ),
        GeoPoint.of( 53.976177, 32.138627 )
);
List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.spatial().within().field( "placeOfBirth.coordinates" )
                .polygon( polygon ) )
        .fetchHits( 20 );
Targeting multiple fields

此外,谓词还可以针对多个字段。在这种情况下,谓词将匹配给定字段的 any 匹配的文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.

Other options
  1. The score of a within predicate is constant and equal to 1 by default, but can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .circle(…​)/.boundingBox(…​)/.polygon(…​).

15.2.19. knn: K-Nearest Neighbors a.k.a. vector search

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

knn 谓词(k 是正整数)匹配给定向量字段的值最“接近”给定向量的 k 文档。

The knn predicate, with k being a positive integer, matches the k documents for which a given vector field’s value is "nearest" to a given vector.

距离是根据为给定 vector field 配置的矢量相似性来衡量的。

Distance is measured based on the vector similarity configured for the given vector field.

示例 250. 简化 K 近邻搜索

. Example 250. Simple K-Nearest Neighbors search

float[] coverImageEmbeddingsVector = /*...*/
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.knn( 5 ).field( "coverImageEmbeddings" ).matching( coverImageEmbeddingsVector ) )
        .fetchHits( 20 );
Expected type of arguments

knn 谓词预期 matching(…​) 方法的参数具有与目标字段的索引类型相同的类型。

The knn predicate expects arguments to the matching(…​) method to have the same type as the index type of a target field.

例如,如果实体属性在索引中映射为字节数组类型 (byte[]) , .matching(…​) 将希望其参数仅为字节数组 (byte[]) 。

For example, if an entity property is mapped in the index to a byte array type (byte[]) , .matching(…​) will expect its argument to be a byte array (byte[]) only.
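
For instance, a hedged sketch assuming a hypothetical coverImageEmbeddingsBytes field indexed as byte[] (the field name is illustrative only):

byte[] coverImageEmbeddingsBytesVector = new byte[] { /* ... vector components ... */ };
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.knn( 5 ).field( "coverImageEmbeddingsBytes" )
                .matching( coverImageEmbeddingsBytesVector ) ) // must be a byte[], matching the index field type
        .fetchHits( 20 );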

Filtering the neighbors

此外,该谓词可以使用谓词的 .filter(..) 子句过滤掉某些相邻元素。.filter(…​) 预期对其传递一个谓词。

Optionally, the predicate can filter out some of the neighbors using the .filter(..) clause of the predicate. .filter(…​) expects a predicate to be passed to it.

示例 251. 使用过滤器的 K 近邻搜索

. Example 251. K-Nearest Neighbors search with a filter

float[] coverImageEmbeddingsVector = /*...*/
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.knn( 5 ).field( "coverImageEmbeddings" ).matching( coverImageEmbeddingsVector )
                .filter( f.match().field( "authors.firstName" ).matching( "isaac" ) ) )
        .fetchHits( 20 );

knn 谓词可以与常规文本搜索谓词相结合。它可以根据向量嵌入特性来提升更相关的文档的评分,从而提升搜索结果的质量:

A knn predicate can be combined with the regular text-search predicates. It can improve the quality of search results by increasing the score of documents that are more relevant based on vector embeddings characteristics:

示例 252. 使用 K 近邻搜索丰富常规文本搜索

. Example 252. Enriching regular text search with K-Nearest Neighbors search

float[] coverImageEmbeddingsVector = /*...*/
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .must( f.match().field( "genre" ).matching( Genre.SCIENCE_FICTION ) ) (1)
                .should( f.knn( 10 ).field( "coverImageEmbeddings" ).matching( coverImageEmbeddingsVector ) ) (2)
        )
        .fetchHits( 20 );
Filtering out irrelevant results with knn similarity

根据其特性,knn 谓词将始终尝试查找 k 最近向量,即使找到的向量之间的距离很远,即不那么相似。这可能导致查询返回不相关的结果。

By its nature a knn predicate will always try to find k nearest vectors, even if the found vectors are quite far away from each other, i.e. are not that similar. This may lead to getting irrelevant results returned by the query.

为了解决这个问题,knn 谓词允许配置所需的最小相似性。如果已配置,knn 谓词将查找 k 最近的矢量,并过滤掉任何相似性低于此已配置阈值的矢量。请注意,此属性的预期值是根据已配置的 vector similarity 中的两个矢量之间的距离值。

To address this, a knn predicate allows configuring the minimum required similarity. If configured, knn predicate will find k nearest vectors and filter out any that will have similarity lower than this configured threshold. Note that the expected value for this property is a distance value between two vectors according to configured vector similarity.

示例 253. 过滤无关结果

. Example 253. Filtering out irrelevant results

float[] coverImageEmbeddingsVector = /*...*/
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.knn( 5 ).field( "coverImageEmbeddings" ).matching( coverImageEmbeddingsVector ) (1)
                .requiredMinimumSimilarity( 5 ) ) (2)
        .fetchHits( 20 );
Backend specifics and limitations

k 参数的不同后端之间的行为及其分布不同。

The parameter k has different behavior between backends, and their distributions.

使用 Lucene backend 时,k 是限制 knn predicate 最终匹配的文档数量的数字。当使用 Elasticsearch backend 的 Elastic 发行版时,k 将同时被视为 k 和 num_candidates。有关更多详情,请参见 Elasticsearch documentation。当使用 OpenSearch 发行版时,k 将映射到 knn 查询的 k 属性。请注意,在这种情况下,当索引配置为具有多个分片时,你可能会得到超过 k 个结果。有关更多详情,请参见 OpenSearch documentation 的相应章节。

With the Lucene backend k is the number that will limit the final amount of documents matched by a knn predicate. When using the Elastic distribution of the Elasticsearch backend k will be treated as both k and num_candidates. See the Elasticsearch documentation for more details. While when an OpenSearch distribution is used, k will be mapped to k attribute of knn query. Note that in this case you may get more than k results, when the index is configured to have more than one shard. See this section of the OpenSearch documentation for more details.

在使用 Elasticsearch 后端时,在嵌套谓词中使用 knn 谓词存在一些限制。特别是,当隐式应用 tenantrouting 过滤器时,生成的结果可能包含比预期的更少的文档。为了解决这个限制,需要进行架构更改,并且应该在未来的某个主要版本中解决此问题 ( HSEARCH-5085)。

Using the knn predicate inside a nested predicate with the Elasticsearch backend has some limitations. In particular, when a tenant or routing filters are implicitly applied, the produced results may contain fewer documents than expected. To address this limitation, a schema change is required and should be addressed in one of the future major releases (HSEARCH-5085).

Other options
  1. The score of a knn predicate is variable by default (higher for "nearer" documents), but can be made constant with .constantScore().

  2. The score of a knn predicate can be boosted for the whole predicate with a call to .boost(…​) after .matching(…​).

15.2.20. queryString: match a user-provided query string

queryString 谓词根据给定的字符串形式的结构化查询对文档进行匹配。与 simpleQueryString 谓词相比,它允许构建更复杂的查询字符串并配置更多的选项。

The queryString predicate matches documents according to a structured query given as a string. It allows building more advanced query strings and has more configuration options than a simpleQueryString predicate.

本指南不会详细介绍查询语法。为了熟悉它,请参阅你的后端 ( Elasticsearch/ OpenSearch/ Lucene) 指南。

We will not go into much detail about the query syntax in this guide. To familiarize yourself with it, please refer to your backend (Elasticsearch/OpenSearch/Lucene) guides.

示例 254. 匹配查询字符串

. Example 254. Matching a query string

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "description" )
                .matching(
                        "robots +(crime investigation disappearance)^10 +\"investigation help\"~2 -/(dis)?a[p]+ea?ance/" ) ) (1)
        .fetchHits( 20 );
Default boolean operator

默认情况下,如果未明确定义运算符,则查询使用 OR 运算符。如果您更喜欢将 AND 运算符用作默认运算符,则可以调用 .defaultOperator(…​)

By default, the query uses the OR operator if the operator is not explicitly defined. If you prefer using the AND operator as default, you can call .defaultOperator(…​).

示例 255. 匹配查询字符串:AND 作为默认运算符

. Example 255. Matching a query string: AND as default operator

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "description" )
                .matching( "robots investigation" )
                .defaultOperator( BooleanOperator.AND ) )
        .fetchHits( 20 );
Phrase slop

词组间隙(phrase slop)选项定义了构建出的词组谓词的宽容程度;换句话说,词组中允许出现多少次换位仍被视为匹配。使用 queryString 谓词时,可以在查询字符串本身中设置此选项。

The phrase slop option defines how permissive a constructed phrase predicate will be; in other words, how many transpositions in the phrase are allowed for it to still be considered a match. With the queryString predicate, this option can be set in the query string itself.

示例 256. 匹配查询字符串:短语间距作为查询字符串本身的一部分

. Example 256. Matching a query string: phrase slop as part of query string itself

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "\"dawn robot\"~3" ) )
        .fetchHits( 20 );

此外, .phraseSlop(…​) 可以应用到查询字符串谓词上。

Alternatively, .phraseSlop(…​) can be applied to a query string predicate.

示例 257. 匹配查询字符串:短语间距作为谓词选项

. Example 257. Matching a query string: phrase slop as predicate option

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "\"dawn robot\"" )
                .phraseSlop( 3 ) )
        .fetchHits( 20 );

请注意,将值传递到 .phraseSlop(…​) 会设置默认词组 slop 值,可以在查询字符串中覆盖。

Note that passing a value to .phraseSlop(…​) sets the default phrase slop value, which can be overridden in the query string.

示例 258. 匹配查询字符串:短语间距作为谓词选项,被查询覆盖

. Example 258. Matching a query string: phrase slop as predicate option overridden by query

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "\"dawn robot\"~3 -\"automatic detective\"" ) (1)
                .phraseSlop( 1 ) ) (2)
        .fetchHits( 20 );
Allowing leading wildcards

查询字符串默认情况下可以在查询中的任何位置使用通配符。如果需要阻止用户使用前导通配符,则可以调用 .allowLeadingWildcard(..) 并使用 false 值来禁止此类查询。

A query string can use a wildcard at any position within a query by default. If there is a need to prevent users from using leading wildcards, .allowLeadingWildcard(..) can be called with a false value to disallow such queries.

示例 259. 匹配查询字符串:禁止前置通配符

. Example 259. Matching a query string: disallowing leading wildcards

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "robo?" )
                .allowLeadingWildcard( false ) )
        .fetchHits( 20 );

请注意,此选项不仅会影响整个查询字符串,还会影响该查询字符串中的各个子句。例如,在查询字符串 robot ?etective 中,通配符 ? 不是前导字符,但是此查询会被分解为 robot 和 ?etective 两个子句,而在第二个子句中,? 通配符就成了前导字符。

Note that this option affects not just the whole query string, but individual clauses in that query string. For example, in the query string robot ?etective the wildcard ? is not a leading character, but this query is broken down into two clauses, robot and ?etective, where in the second clause the ? wildcard becomes a leading character.

Enabling position increments

位置增量(position increments)默认情况下处于启用状态,这允许短语查询考虑由停用词过滤器删除的停用词。可以按如下所示禁用位置增量,这会改变短语查询的行为:假设文档中有短语 book at the shelve,并且停用词过滤器删除了 at 和 the,那么在禁用位置增量的情况下,短语查询 "book shelve" 将不会匹配该文档。

Position increments are enabled by default, allowing phrase queries to take into account stop words removed by a stopwords filter. Position increments can be disabled as shown below, leading to a change in the behaviour of phrase queries: assuming that the document contains the phrase book at the shelve and the stop words filter removes both at and the, then with position increments disabled, the phrase query "book shelve" will not match such a document.

示例 260. 匹配查询字符串:禁用位置增量

. Example 260. Matching a query string: disabling position increments

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "\"crime robots\"" )
                .enablePositionIncrements( false ) )
        .fetchHits( 20 );
Rewrite method

重写方法决定了后端的查询解析器如何重写和给多词查询评分。

The rewrite method determines how the backend’s query parser rewrites and scores multi-term queries.

要更改默认 CONSTANT_SCORE 重写方法,可以在 .rewriteMethod(RewriteMethod)/rewriteMethod(RewriteMethod, int) 中传递允许使用的 RewriteMethod 枚举值。

To change the default CONSTANT_SCORE rewrite method, one of the allowed RewriteMethod enum values can be passed to .rewriteMethod(RewriteMethod)/rewriteMethod(RewriteMethod, int).

请注意,即使默认重写方法称为 CONSTANT_SCORE,但这并不意味着已匹配文档的最终评分将在所有结果中保持不变,它更多地与查询解析如何在内部工作有关。要为结果实现恒定的评分,请参阅 query string 谓词的 this documentation section

Note that even though the default rewrite method is called CONSTANT_SCORE, this does not mean that the final score of matched documents will be constant across all results; it is more about how query parsing works internally. To achieve a constant score for results, see this documentation section on the query string predicate.

本指南不会详细介绍不同的重写方法。要了解更多关于它们的信息,请参阅你的后端的指南 ( Elasticsearch/ OpenSearch/ Lucene)。

We will not go into details about different rewrite methods in this guide. To learn more about them, please refer to your backend’s guide (Elasticsearch/OpenSearch/Lucene).

示例 261. 匹配查询字符串:重写方法

. Example 261. Matching a query string: rewrite method

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching(
                        // some complex query string
                )
                .rewriteMethod( RewriteMethod.CONSTANT_SCORE_BOOLEAN ) )
        .fetchHits( 20 );
minimumShouldMatch: fine-tuning how many should clauses are required to match

从查询字符串解析出的查询可能是一个带有 should 子句的布尔查询。控制必须匹配多少个 should 子句才能将文档视为匹配,可能会很有帮助。

The query parsed from the query string may result in a boolean query with should clauses. It may be helpful to control how many should clauses must match for a document to be considered a match.

示例 262. 利用 minimumShouldMatch 精调 should 条款匹配要求

. Example 262. Fine-tuning should clauses matching requirements with minimumShouldMatch

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString().field( "title" )
                .matching( "crime robot investigate automatic detective" )
                .minimumShouldMatchNumber( 2 ) )
        .fetchHits( 20 );

这类似于 booleansimple query string 谓词 minimumShouldMatch 选项。

This is similar to the minimumShouldMatch option of the boolean and simple query string predicates.

Targeting multiple fields

谓词还可以选择针对多个字段。在这种情况下,只要任意一个给定字段匹配,谓词就会匹配该文档。

Optionally, the predicate can target multiple fields. In that case, the predicate will match documents for which any of the given fields matches.
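
The original guide does not show a multi-field example for this predicate, so here is a minimal sketch, assuming the title and description fields used in the other examples:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString()
                .fields( "title", "description" ) // the document matches if either field matches
                .matching( "robots +investigation" ) )
        .fetchHits( 20 );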

Field types and expected format of field values

此谓词适用于大多数 supported field types,但 GeoPointvector field 类型除外。

This predicate is applicable to most of the supported field types except GeoPoint and vector field types.

在查询字符串中使用的字符串字面量的格式是特定于后端的。使用 Lucene 后端时,这些字面量的格式应与 Property types with built-in value bridges 中定义的解析逻辑兼容;对于带有自定义桥接器的字段,则必须定义其格式(must be defined)。至于 Elasticsearch 后端,请参阅 Field types supported by the Elasticsearch backend。

The format of string literals used in the query string is backend-specific. With the Lucene backend, the format of these literals should be compatible with the parsing logic defined in Property types with built-in value bridges, and for fields with custom bridges it must be defined. As for the Elasticsearch backend, see Field types supported by the Elasticsearch backend.

请记住,并非所有查询结构都可以应用于非字符串字段,例如创建 regexp 查询、使用通配符/ 间距/ 模糊性都将不起作用。

Keep in mind that not all query constructs can be applied to non-string fields, e.g. creating regexp queries, using wildcards/slop/fuzziness will not work.

Other options
  1. The score of a queryString predicate is variable by default, but can be made constant with .constantScore().

  2. The score of a queryString predicate can be boosted, either on a per-field basis with a call to .boost(…​) just after .field(…​)/.fields(…​) or for the whole predicate with a call to .boost(…​) after .matching(…​).

  3. The queryString predicate uses the search analyzer of targeted fields to analyze searched text by default, but this can be overridden.
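
As a quick illustration of these options (a sketch, not taken from the original guide), a per-field boost and a predicate-level boost could be combined like this, reusing the title and description fields from earlier examples:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.queryString()
                .field( "title" ).boost( 2.0f ) // title matches weigh twice as much
                .field( "description" )
                .matching( "robots investigation" )
                .boost( 1.5f ) ) // boost applied to the whole predicate
        .fetchHits( 20 );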

15.2.21. named: call a predicate defined in the mapping

可以调用一个 named 谓词,即在映射中定义的谓词,并将其包含在查询中。

A named predicate, i.e. a predicate defined in the mapping, can be called and included in a query.

下面是一个示例,用于调用 Defining named predicates 一节示例中定义的命名谓词。

Below is an example that calls the named predicate from the example of section Defining named predicates.

示例 263. 调用命名谓词

. Example 263. Calling a named predicate

List<ItemStock> hits = searchSession.search( ItemStock.class )
        .where( f -> f.named( "skuId.skuIdMatch" ) (1)
                .param( "pattern", "*.WI2012" ) ) (2)
        .fetchHits( 20 );

15.2.22. withParameters: create predicates accessing query parameters

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

withParameters 谓词允许使用查询参数(query parameters)构建谓词。当需要使用相同谓词但不同的输入值执行查询时,或者当作为查询参数传递的相同输入值在查询的多个部分(例如谓词、投影、排序、聚合)中使用时,此谓词可能有所帮助。

The withParameters predicate allows building predicates using query parameters. This predicate can be helpful when there is a need to execute a query with the same predicate but different input values, or when the same input value, passed as a query parameter, is used in multiple parts of a query, e.g. in a predicate, projection, sort, aggregation.

这种类型的谓词需要一个函数,该函数接受查询参数并返回谓词。此函数将在查询构建时被调用。

This type of predicate requires a function that accepts query parameters and returns a predicate. That function will get called at query building time.

示例 264. 在查询参数中创建谓词

. Example 264. Creating a predicate with query parameters

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.withParameters( params -> f.bool() (1)
                .should( f.match().field( "title" )
                        .matching( params.get( "title-param", String.class ) ) ) (2)
                .filter( f.match().field( "genre" )
                        .matching( params.get( "genre-param", Genre.class ) ) ) (3)
        ) )
        .param( "title-param", "robot" ) (4)
        .param( "genre-param", Genre.CRIME_FICTION )
        .fetchHits( 20 );

15.2.23. Backend-specific extensions

通过在构建查询时调用 .extension(…​),可以访问后端特定的谓词。

By calling .extension(…​) while building a query, it is possible to access backend-specific predicates.

顾名思义,特定于后端的谓词无法从一种后端技术移植到另一种后端技术。

As their name suggests, backend-specific predicates are not portable from one backend technology to the other.

Lucene: fromLuceneQuery

.fromLuceneQuery(…​) 将本机 Lucene Query 转换为 Hibernate Search 谓词。

.fromLuceneQuery(…​) turns a native Lucene Query into a Hibernate Search predicate.

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

示例 265. 匹配本机 org.apache.lucene.search.Query

. Example 265. Matching a native org.apache.lucene.search.Query

List<Book> hits = searchSession.search( Book.class )
        .extension( LuceneExtension.get() ) (1)
        .where( f -> f.fromLuceneQuery( (2)
                new RegexpQuery( new Term( "description", "neighbor|neighbour" ) )
        ) )
        .fetchHits( 20 );
Elasticsearch: fromJson

.fromJson(…​) 将表示 Elasticsearch 查询的 JSON 转换为 Hibernate Search 谓词。

.fromJson(…​) turns JSON representing an Elasticsearch query into a Hibernate Search predicate.

此功能要求在应用程序代码中直接操作 JSON。

This feature requires directly manipulating JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

示例 266. 匹配以 JsonObject 形式提供的本机 Elasticsearch JSON 查询

. Example 266. Matching a native Elasticsearch JSON query provided as a JsonObject

JsonObject jsonObject =
/* ... */; (1)
List<Book> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) (2)
        .where( f -> f.fromJson( jsonObject ) ) (3)
        .fetchHits( 20 );
示例 267. 匹配以 JSON 格式字符串提供的本机 Elasticsearch JSON 查询

. Example 267. Matching a native Elasticsearch JSON query provided as a JSON-formatted string

List<Book> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) (1)
        .where( f -> f.fromJson( "{" (2)
                + "    \"regexp\": {"
                + "        \"description\": \"neighbor|neighbour\""
                + "    }"
                + "}" ) )
        .fetchHits( 20 );

15.2.24. Options common to multiple predicate types

Targeting multiple fields in one predicate

一些谓词提供了在同一谓词中针对多个字段执行匹配的功能。

Some predicates offer the ability to target multiple fields in the same predicate.

在这种情况下,谓词将匹配给定字段中 any 匹配的文档。

In that case, the predicate will match documents for which any of the given fields matches.

下面是 match predicate 的一个示例。

Below is an example with the match predicate.

示例 268. 在多个字段中匹配一个值

. Example 268. Matching a value in any of multiple fields

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" ).field( "description" )
                .matching( "robot" ) )
        .fetchHits( 20 );
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .fields( "title", "description" )
                .matching( "robot" ) )
        .fetchHits( 20 );

可以分别提升每个字段的分数;请参见 Boosting the score of a predicate。

It is possible to boost the score of each field separately; see Boosting the score of a predicate.

Tuning the score

每个谓词在与文档匹配时都会产生一个得分。对给定谓词而言,文档越相关,得分越高。

Each predicate yields a score if it matched the document. The more relevant a document for a given predicate, the higher the score.

当按分数排序(默认设置)时,可以利用该分数在结果列表的顶部获得更相关的命中。

That score can be used when sorting by score (which is the default) to get more relevant hits at the top of the result list.

以下是一些调整分数的方法,以便充分发挥相关性排序的作用。

Below are a few ways to tune the score, and thus to get the most of the relevance sort.

Overriding analysis

在某些情况下,可能需要使用不同的分析器来分析搜索的文本,而不是用于分析已编制索引的文本的分析器。

In some cases it might be necessary to use a different analyzer to analyze searched text than the one used to analyze indexed text.

可以通过调用 .analyzer(…​) 并传递要使用的分析器的名称来实现此目的。

This can be achieved by calling .analyzer(…​) and passing the name of the analyzer to use.

下面是 match predicate 的一个示例。

Below is an example with the match predicate.

示例 272. 匹配一个值,并使用不同的分析器对其进行分析

. Example 272. Matching a value, analyzing it with a different analyzer

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title_autocomplete" )
                .matching( "robo" )
                .analyzer( "autocomplete_query" ) )
        .fetchHits( 20 );

如果需要完全禁用搜索文本的分析,请调用 .skipAnalysis()

If you need to disable analysis of searched text completely, call .skipAnalysis().

示例 273. 不分析就匹配一个值

. Example 273. Matching a value without analyzing it

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match()
                .field( "title" )
                .matching( "robot" )
                .skipAnalysis() )
        .fetchHits( 20 );

15.3. Sort DSL

15.3.1. Basics

默认情况下,查询结果按匹配分数(相关性)排序。在构建搜索查询时,可以配置其他排序,包括按字段值排序:

By default, query results are sorted by matching score (relevance). Other sorts, including the sort by field value, can be configured when building the search query:

示例 274. 使用自定义排序

. Example 274. Using custom sorts

SearchSession searchSession = /* ... */ (1)

List<Book> result = searchSession.search( Book.class ) (2)
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "pageCount" ).desc() (3)
                .then().field( "title_sort" ) )
        .fetchHits( 20 ); (4)

或者,如果您不想使用 lambdas:

Alternatively, if you don’t want to use lambdas:

示例 275. 使用自定义排序 — 面向对象的语法

. Example 275. Using custom sorts — object-based syntax

SearchSession searchSession = /* ... */

SearchScope<Book> scope = searchSession.scope( Book.class );

List<Book> result = searchSession.search( scope )
        .where( scope.predicate().matchAll().toPredicate() )
        .sort( scope.sort()
                .field( "pageCount" ).desc()
                .then().field( "title_sort" )
                .toSort() )
        .fetchHits( 20 );

要根据给定字段的值使用排序,需要在映射中将字段标记为 sortable

In order to use sorts based on the value of a given field, you need to mark the field as sortable in the mapping.

特别是对于全文字段(多字文本字段),这是不可能的;请参阅 here ,了解解释和一些解决方案。

This is not possible for full-text fields (multi-word text fields), in particular; see here for an explanation and some solutions.
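
As a reminder of what such a mapping can look like, here is a minimal sketch (the field, analyzer and normalizer names are assumptions, not taken from the original guide): a full-text field for matching plus a separate sortable keyword field on the same property:

@Indexed
public class Book {
    // ... (entity mapping annotations omitted)
    @FullTextField(analyzer = "english") // for full-text predicates; not sortable
    @KeywordField(name = "title_sort", normalizer = "lowercase", sortable = Sortable.YES) // dedicated sortable field
    private String title;
    // ...
}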

排序 DSL 提供了更多的排序类型,以及每种排序类型的多个选项。要了解有关 field 排序和其他所有排序类型,请参阅以下部分。

The sort DSL offers more sort types, and multiple options for each type of sort. To learn more about the field sort, and all the other types of sort, refer to the following sections.

15.3.2. score: sort by matching score (relevance)

score 对每个文档的分数进行排序:

score sorts on the score of each document:

  1. in descending order (the default), documents with a higher score appear first in the list of hits.

  2. in ascending order, documents with a lower score appear first in the list of hits.

分数是针对每个查询分别计算的,但笼统地说,你可以认为更高的分数意味着匹配了更多的谓词,或者匹配得更好。因此,给定文档的分数表示该文档与特定查询的相关程度。

The score is computed differently for each query, but roughly speaking you can consider that a higher score means that more predicates were matched, or they were matched better. Thus, the score of a given document is a representation of how relevant that document is to a particular query.

要充分利用按分数排序,你需要 assign weight to your predicates by boosting some of them

To get the most out of a sort by score, you will need to assign weight to your predicates by boosting some of them.
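
As a quick illustration (a sketch, not taken from the original guide), boosting one predicate relative to another makes title matches weigh more than description matches in the resulting score:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .should( f.match().field( "title" )
                        .matching( "robot" )
                        .boost( 2.0f ) ) // title matches contribute twice as much to the score
                .should( f.match().field( "description" )
                        .matching( "robot" ) ) )
        .fetchHits( 20 ); // hits are sorted by score by default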

高级用户可能还想通过指定不同的 Similarity 来更改评分公式。

Advanced users may even want to change the scoring formula by specifying a different Similarity.

按分数排序是默认设置,因此通常不需要明确要求按分数排序,但以下是如何执行此操作的示例。

Sorting by score is the default, so it’s generally not necessary to ask for a sort by score explicitly, but below is an example of how you can do it.

示例 276. 按相关性排序

. Example 276. Sorting by relevance

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "title" )
                .matching( "robot dawn" ) )
        .sort( f -> f.score() )
        .fetchHits( 20 );
Options
  1. You can sort by ascending score by changing the sort order. However, this means the least relevant hits will appear first, which is completely pointless. This option is only provided for completeness.

15.3.3. indexOrder: sort according to the order of documents on storage

indexOrder 按文档在内部存储中的位置对文档进行排序。

indexOrder sorts on the position of documents on internal storage.

此排序不可预测,但非常高效。当性能比命中顺序更重要时,请使用它。

This sort is not predictable, but is the most efficient. Use it when performance matters more than the order of hits.

示例 277. 根据存储中的文档顺序排序

. Example 277. Sorting according to the order of documents on storage

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.indexOrder() )
        .fetchHits( 20 );

15.3.4. field: sort by field values

field 对每个文档的给定字段值进行排序。

field sorts on the value of a given field for each document.

要根据给定字段的值使用排序,需要在映射中将字段标记为 sortable

In order to use sorts based on the value of a given field, you need to mark the field as sortable in the mapping.

特别是对于全文字段(多字文本字段),这是不可能的;请参阅 here ,了解解释和一些解决方案。

This is not possible for full-text fields (multi-word text fields), in particular; see here for an explanation and some solutions.

GeoPoint 字段的值不能直接比较,因此 field 排序不能用于这些字段。

The values of GeoPoint fields cannot be compared directly and thus the field sort cannot be used on those fields.

请参考 distance sort 以了解这些字段。

Refer to the distance sort for these fields.

排序顺序定义如下:

The sort order is defined as follows:

  1. in ascending order (the default), documents with a lower value appear first in the list of hits.

  2. in descending order, documents with a higher value appear first in the list of hits.

对于文本字段,“较低”表示“按字母顺序靠前”。

For text fields, "lower" means "lower in the alphabetical order".

Syntax
示例 278. 按字段值排序

. Example 278. Sorting by field values

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "title_sort" ) )
        .fetchHits( 20 );
Options
  1. The sort order is ascending by default, but can be controlled explicitly with .asc()/.desc().

  2. The behavior on missing values can be controlled explicitly with .missing().

  3. The behavior on multivalued fields can be controlled explicitly with .mode(…​).

  4. For fields in nested objects, all nested objects are considered by default, but that can be controlled explicitly with .filter(…​).

15.3.5. distance: sort by distance to a point

distance 按给定中心到每个文档给定字段的地理点值之间的距离进行排序。

distance sorts on the distance from a given center to the geo-point value of a given field for each document.

  1. in ascending order (the default), documents with a lower distance appear first in the list of hits.

  2. in descending order, documents with a higher distance appear first in the list of hits.

Prerequisites

为了能在给定字段上使用 distance 排序,你需要在映射中将该字段标记为 sortable。

In order for the distance sort to be available on a given field, you need to mark the field as sortable in the mapping.

Syntax
Example 279. 按与某个点的距离排序

. Example 279. Sorting by distance to a point

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.distance( "placeOfBirth", center ) )
        .fetchHits( 20 );
Options
  1. The sort order is ascending by default, but can be controlled explicitly with .asc()/.desc().

  2. The behavior on multivalued fields can be controlled explicitly with .mode(…​).

  3. For fields in nested objects, all nested objects are considered by default, but that can be controlled explicitly with .filter(…​).

15.3.6. withParameters: create sorts using query parameters

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

withParameters 排序允许使用查询参数(query parameters)构建排序。

The withParameters sort allows building sorts using query parameters.

此类型的排序需要一个接受查询参数并返回排序的函数。该函数将在查询构建时被调用。

This type of sort requires a function that accepts query parameters and returns a sort. That function will get called at query building time.

Example 280. 使用查询参数创建排序

. Example 280. Creating a sort with query parameters

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.withParameters( params -> f (1)
                .distance( "placeOfBirth", params.get( "center", GeoPoint.class ) ) ) ) (2)
        .param( "center", center ) (3)
        .fetchHits( 20 );

15.3.7. composite: combine sorts

composite 会依次应用多个排序。在应用不完整的排序时,这很有用。

composite applies multiple sorts one after the other. It is useful when applying incomplete sorts.

Example 281. 使用 composite() 按多个复合排序排序

. Example 281. Sorting by multiple composed sorts using composite()

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.composite() (1)
                .add( f.field( "genre_sort" ) ) (2)
                .add( f.field( "title_sort" ) ) ) (3)
        .fetchHits( 20 ); (4)

或者,您只需在第一个排序后调用 .then(),即可将一个排序附加到另一个排序之后:

Alternatively, you can append a sort to another simply by calling .then() after the first sort:

Example 282. 使用 then() 按多个复合排序排序

. Example 282. Sorting by multiple composed sorts using then()

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "genre_sort" )
                .then().field( "title_sort" ) )
        .fetchHits( 20 );
Adding sorts dynamically with the lambda syntax

可以在 lambda 表达式内定义 composite 排序。当需要(例如根据用户输入)动态地向 composite 排序添加内部排序时,此功能特别有用。

It is possible to define the composite sort inside a lambda expression. This is especially useful when inner sorts need to be added dynamically to the composite sort, for example based on user input.

Example 283. 使用 lambda 语法轻松动态组合排序

. Example 283. Easily composing sorts dynamically with the lambda syntax

MySearchParameters searchParameters = getSearchParameters(); (1)
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.composite( b -> { (2)
            for ( MySort mySort : searchParameters.getSorts() ) { (3)
                switch ( mySort.getType() ) {
                    case GENRE:
                        b.add( f.field( "genre_sort" ).order( mySort.getOrder() ) );
                        break;
                    case TITLE:
                        b.add( f.field( "title_sort" ).order( mySort.getOrder() ) );
                        break;
                    case PAGE_COUNT:
                        b.add( f.field( "pageCount" ).order( mySort.getOrder() ) );
                        break;
                }
            }
        } ) )
        .fetchHits( 20 ); (4)
Stabilizing a sort

如果你的第一个排序(例如按字段值排序)导致许多文档并列(例如许多文档具有相同的字段值),你可能需要追加一个任意排序来稳定你的排序:确保在执行相同查询时,搜索命中始终按照相同的顺序排列。

If your first sort (e.g. by field value) results in a tie for many documents (e.g. many documents have the same field value), you may want to append an arbitrary sort just to stabilize your sort: to make sure the search hits will always be in the same order if you execute the same query.

在大多数情况下,稳定排序的快速简便的解决方案是更改映射,在你的实体 ID 上添加一个 sortable 字段,并在不稳定的排序之后追加一个按 ID 进行的 field 排序:

In most cases, a quick and easy solution for stabilizing a sort is to change your mapping to add a sortable field on your entity ID, and to append a field sort by id to your unstable sort:

Example 284. 稳定排序

. Example 284. Stabilizing a sort

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "genre_sort" ).then().field( "id_sort" ) )
        .fetchHits( 20 );

15.3.8. Backend-specific extensions

在构建查询时,通过调用 .extension(…​) 可以访问后端特定的排序。

By calling .extension(…​) while building a query, it is possible to access backend-specific sorts.

顾名思义,特定后端排序不可从一种后端技术移植到另一种。

As their name suggests, backend-specific sorts are not portable from one backend technology to the other.

Lucene: fromLuceneSort

.fromLuceneSort(…​) 将本机 Lucene Sort 转换为 Hibernate Search 排序。

.fromLuceneSort(…​) turns a native Lucene Sort into a Hibernate Search sort.

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Example 285. 按一个本机 org.apache.lucene.search.Sort 排序

. Example 285. Sorting by a native org.apache.lucene.search.Sort

List<Book> hits = searchSession.search( Book.class )
        .extension( LuceneExtension.get() )
        .where( f -> f.matchAll() )
        .sort( f -> f.fromLuceneSort(
                new Sort(
                        new SortedSetSortField( "genre_sort", false ),
                        new SortedSetSortField( "title_sort", false )
                )
        ) )
        .fetchHits( 20 );
Lucene: fromLuceneSortField

.fromLuceneSortField(…​) 将本机 Lucene SortField 转换为 Hibernate Search 排序。

.fromLuceneSortField(…​) turns a native Lucene SortField into a Hibernate Search sort.

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Example 286. 按一个本机 org.apache.lucene.search.SortField 排序

. Example 286. Sorting by a native org.apache.lucene.search.SortField

List<Book> hits = searchSession.search( Book.class )
        .extension( LuceneExtension.get() )
        .where( f -> f.matchAll() )
        .sort( f -> f.fromLuceneSortField(
                new SortedSetSortField( "title_sort", false )
        ) )
        .fetchHits( 20 );
Elasticsearch: fromJson

.fromJson(…​) 将表示 Elasticsearch 排序的 JSON 转换为 Hibernate Search 排序。

.fromJson(…​) turns JSON representing an Elasticsearch sort into a Hibernate Search sort.

此功能要求在应用程序代码中直接操作 JSON。

This feature requires directly manipulating JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Example 287. 按一个提供为 JsonObject 的本机 Elasticsearch JSON 排序排序

. Example 287. Sorting by a native Elasticsearch JSON sort provided as a JsonObject

JsonObject jsonObject =
/* ... */;
List<Book> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() )
        .where( f -> f.matchAll() )
        .sort( f -> f.fromJson( jsonObject ) )
        .fetchHits( 20 );
Example 288. 按一个作为 JSON 格式化字符串提供的本机 Elasticsearch JSON 排序排序

. Example 288. Sorting by a native Elasticsearch JSON sort provided as a JSON-formatted string

List<Book> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() )
        .where( f -> f.matchAll() )
        .sort( f -> f.fromJson( "{"
                + "     \"title_sort\": \"asc\""
                + "}" ) )
        .fetchHits( 20 );

15.3.9. Options common to multiple sort types

Sort order

大多数排序默认使用升序,但 score 排序是一个明显的例外。

Most sorts use the ascending order by default, with the notable exception of the score sort.

顺序由以下选项明确控制:

The order can be controlled explicitly through the following options:

  1. .asc() for an ascending order.

  2. .desc() for a descending order.

  3. .order(…​) for an order defined by the given argument: SortOrder.ASC/SortOrder.DESC.

下面是 field 排序的几个示例。

Below are a few examples with the field sort.

Example 289. 使用 asc() 按字段值以显式升序顺序排序

. Example 289. Sorting by field values in explicitly ascending order with asc()

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "title_sort" ).asc() )
        .fetchHits( 20 );
Example 290. 使用 desc() 按字段值以显式降序顺序排序

. Example 290. Sorting by field values in explicitly descending order with desc()

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "title_sort" ).desc() )
        .fetchHits( 20 );
Example 291. 使用 order(…​) 按字段值以显式降序顺序排序

. Example 291. Sorting by field values in explicitly descending order with order(…​)

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "title_sort" ).order( SortOrder.DESC ) )
        .fetchHits( 20 );
Missing values

默认情况下:

By default:

  1. For sorts by field values, the documents that do not have any value for a sort field will appear in the last position.

  2. For sorts by distance to a point, the documents that do not have any value for a sort field will be treated as if their distance from the given point was infinite.

可以通过 .missing() 选项明确控制缺失值的处理行为:

The behavior for missing values can be controlled explicitly through the .missing() option:

  1. .missing().first() puts documents with no value in first position (regardless of the sort order).

  2. .missing().last() puts documents with no value in last position (regardless of the sort order).

  3. .missing().lowest() interprets missing values as the lowest value: it puts documents with no value in the first position when using ascending order or in the last position when using descending order.

  4. .missing().highest() interprets missing values as the highest value: it puts documents with no value in the last position when using ascending order or in the first position when using descending order.

  5. .missing().use(…​) uses the given value as a default for documents with no value.

使用 Lucene 后端时,按字段值排序和按到某个点的距离排序都支持所有这些选项。

All these options are supported for sorts by field values and sorts by distance to a point using the Lucene backend.

在使用 Elasticsearch 后端按到某个点的距离进行排序时,由于 Elasticsearch API 的限制,仅支持以下组合:

When sorting by distance to a point using the Elasticsearch backend, due to limitations of the Elasticsearch APIs, only the following combinations are supported:

.missing().first() 使用降序排列。

.missing().first() using a descending order.

.missing().last() 使用升序排列。

.missing().last() using an ascending order.

.missing().highest() 使用升序或降序排列。

.missing().highest() using either an ascending or a descending order.

下面是 field 排序的几个示例。

Below are a few examples with the field sort.

Example 292. 按字段值进行排序,没有值的文档处于第一个位置

. Example 292. Sorting by field values, documents with no value in first position

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "pageCount" ).missing().first() )
        .fetchHits( 20 );
Example 293. 按字段值进行排序,没有值的文档处于最后一个位置

. Example 293. Sorting by field values, documents with no value in last position

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "pageCount" ).missing().last() )
        .fetchHits( 20 );
Example 294. 按使用特定默认值的字段值进行排序

. Example 294. Sorting by field values, documents with no value using a given default value

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "pageCount" ).missing().use( 300 ) )
        .fetchHits( 20 );
Sort mode for multivalued fields

具有多个排序字段值的文档也可以被排序。系统会为每个文档选取一个值,以便与其他文档进行比较。值的选取方式称为排序模式,使用 .mode(…​) 选项指定。可用的排序模式如下:

Documents that have multiple values for a sort field can be sorted too. A single value is picked for each document in order to compare it with other documents. How the value is picked is called the sort mode, specified using the .mode(…​) option. The following sort modes are available:

SortMode.MIN: Picks the lowest value for field sorts, the lowest distance for distance sorts. This is the default for ascending sorts. Supported value types: all. Unsupported value types: none.

SortMode.MAX: Picks the highest value for field sorts, the highest distance for distance sorts. This is the default for descending sorts. Supported value types: all. Unsupported value types: none.

SortMode.SUM: Computes the sum of all values for each document, and picks that sum for comparison with other documents. Supported value types: numeric fields (long, …​). Unsupported value types: text and temporal fields (String, LocalDate, …​), distance.

SortMode.AVG: Computes the arithmetic mean of all values for each document and picks that average for comparison with other documents. Supported value types: numeric and temporal fields (long, LocalDate, …​), distance. Unsupported value types: text fields (String, …​).

SortMode.MEDIAN: Computes the median of all values for each document, and picks that median for comparison with other documents. Supported value types: numeric and temporal fields (long, LocalDate, …​), distance. Unsupported value types: text fields (String, …​).

下面是 field 排序的示例。

Below is an example with the field sort.

示例 295. 使用每个文档的平均值按字段值排序

. Example 295. Sorting by field values using the average value for each document

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "books.pageCount" ).mode( SortMode.AVG ) )
        .fetchHits( 20 );
Filter for fields in nested objects

当排序字段位于嵌套对象中时,默认情况下,排序会考虑所有嵌套对象,并使用已配置的排序模式合并它们的值。

When the sort field is located in a nested object, by default all nested objects will be considered for the sort and their values will be combined using the configured sort mode.

可以使用 filter(…​) 方法之一来过滤将考虑其值进行排序的嵌套文档。

It is possible to filter the nested documents whose values will be considered for the sort using one of the filter(…​) methods.

下面是 field 排序的示例:按作者所著书籍的平均页数对作者进行排序,但只考虑“犯罪小说”类型的书籍:

Below is an example with the field sort: authors are sorted by the average page count of their books, but only books of the "crime fiction" genre are considered:

示例 296. 使用嵌套对象筛选器按字段值排序

. Example 296. Sorting by field values using a filter for nested objects

List<Author> hits = searchSession.search( Author.class )
        .where( f -> f.matchAll() )
        .sort( f -> f.field( "books.pageCount" )
                .mode( SortMode.AVG )
                .filter( pf -> pf.match().field( "books.genre" )
                        .matching( Genre.CRIME_FICTION ) ) )
        .fetchHits( 20 );

15.4. Projection DSL

15.4.1. Basics

对于某些用例,您只需要查询返回域对象中包含的一小部分数据。在这些情况下,返回托管实体再从这些实体中提取数据可能没有必要:直接从索引本身提取数据可以避免与数据库的往返。

For some use cases, you only need the query to return a small subset of the data contained in your domain object. In these cases, returning managed entities and extracting data from these entities may be overkill: extracting the data from the index itself would avoid the database round-trip.

投影可以做到这一点:它们允许查询返回比“匹配实体”更精确的内容。构建搜索查询时可以配置投影:

Projections do just that: they allow the query to return something more precise than just "the matching entities". Projections can be configured when building the search query:

示例 297. 使用投影从索引中提取数据

. Example 297. Using projections to extract data from the index

SearchSession searchSession = /* ... */ (1)

List<String> result = searchSession.search( Book.class ) (2)
        .select( f -> f.field( "title", String.class ) ) (3)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (4)

或者,如果您不想使用 lambdas:

Alternatively, if you don’t want to use lambdas:

示例 298. 使用投影从索引中提取数据 - 基于对象的语法

. Example 298. Using projections to extract data from the index — object-based syntax

SearchSession searchSession = /* ... */

SearchScope<Book> scope = searchSession.scope( Book.class );

List<String> result = searchSession.search( scope )
        .select( scope.projection().field( "title", String.class )
                .toProjection() )
        .where( scope.predicate().matchAll().toPredicate() )
        .fetchHits( 20 );

为了基于给定字段的值来使用投影,您需要将字段标记为 projectable 中的映射。

In order to use projections based on the value of a given field, you need to mark the field as projectable in the mapping.

对于 Elasticsearch backend ,这是可选项,其中所有字段默认为可投影的。

This is optional with the Elasticsearch backend, where all fields are projectable by default.
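
For reference, marking a field as projectable in the mapping can look like the following minimal sketch (the analyzer name is an assumption, not taken from the original guide):

@FullTextField(analyzer = "english", projectable = Projectable.YES) // allows field projections on this field with the Lucene backend
private String title;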

虽然 field 投影当然是最常见的,但它们并不是唯一类型的投影。其他投影允许组合包含所提取数据的自定义 bean,获取对所提取文档或对应实体的引用,或获取与搜索查询本身相关的信息(分数等)。

While field projections are certainly the most common, they are not the only type of projection. Other projections allow composing custom beans containing extracted data, getting references to the extracted documents or the corresponding entities, or getting information related to the search query itself (score, …​).

15.4.2. Projecting to a custom (annotated) type

对于更为复杂的投影,可以定义一个自定义的(带注解的)record 或类,并让 Hibernate Search 从自定义类型的构造函数参数中推断出对应的投影。

For more complex projections, it is possible to define a custom (annotated) record or class and have Hibernate Search infer the corresponding projections from the custom type’s constructor parameters.

在注释自定义投影类型时,需要注意一些约束:

There are a few constraints to keep in mind when annotating a custom projection type:

如果自定义投影类型不在与实体类型相同的 JAR 中,则 Hibernate Search 将 require additional configuration

The custom projection type must be in the same JAR as entity types, or Hibernate Search will require additional configuration.

在对值字段或对象字段进行投影时,默认情况下,投影字段的路径从构造函数参数名称中推断,但 inference will fail if constructor parameter names are not included in the Java bytecode 。 或者,可以通过 @FieldProjection(path = …​) / @ObjectProjection(path = …​) 显式提供路径,在这种情况下,Hibernate Search 不会依赖于构造函数参数名称。

When projecting on value fields or object fields, the path to the projected field is inferred from the constructor parameter name by default, but inference will fail if constructor parameter names are not included in the Java bytecode. Alternatively the path can be provided explicitly through @FieldProjection(path = …​)/@ObjectProjection(path = …​), in which case Hibernate Search won’t rely on constructor parameter names.

在对值字段进行投影时, field 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的值字段配置为 projectable

When projecting on value fields, the constraints of the field projection still apply. In particular, with the Lucene backend, value fields involved in the projection must be configured as projectable.

在对对象字段进行投影时, object 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的多值对象字段配置为 nested

When projecting on object fields, the constraints of the object projection still apply. In particular, with the Lucene backend, multi-valued object fields involved in the projection must be configured as nested.

示例 299. 使用自定义记录类型从索引中投影数据

. Example 299. Using a custom record type to project data from the index

@ProjectionConstructor (1)
public record MyBookProjection(
        @IdProjection Integer id, (2)
        String title, (3)
        List<MyBookProjection.Author> authors) { (4)
    @ProjectionConstructor (5)
    public record Author(String firstName, String lastName) {
    }
}
List<MyBookProjection> hits = searchSession.search( Book.class )
        .select( MyBookProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

自定义的非记录类也可以用 @ProjectionConstructor 添加注解,如果您由于某些原因无法使用记录(例如您仍在使用 Java 13 或更低版本),这可能会很有用。

Custom, non-record classes can also be annotated with @ProjectionConstructor, which can be useful if you cannot use records for some reason (for example because you’re still using Java 13 or below).
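
As a quick sketch of what such a class might look like (the class name is hypothetical, not taken from the original guide), the annotation can be placed on the constructor of a plain class:

public class MyBookTitleProjectionClass {

    private final String title;

    @ProjectionConstructor // the projected field is inferred from the constructor parameter name
    public MyBookTitleProjectionClass(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }
}
List<MyBookTitleProjectionClass> hits = searchSession.search( Book.class )
        .select( MyBookTitleProjectionClass.class )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );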

除了 .select(Class<?>) ,一些投影还允许使用自定义投影类型;见 the composite projectionthe object projection 。有关映射投影类型的信息,请参阅 Mapping index content to custom types (projection constructors)

Beside .select(Class<?>), some projections also allow using custom projection types; see the composite projection and the object projection. For more information about mapping projection types, see Mapping index content to custom types (projection constructors).

15.4.3. documentReference: return references to matched documents

documentReference 投影将返回对匹配文档的引用,作为 DocumentReference 对象。

The documentReference projection returns a reference to the matched document as a DocumentReference object.

Syntax
示例 300. 返回对匹配文档的引用

. Example 300. Returning references to matched documents

List<DocumentReference> hits = searchSession.search( Book.class )
        .select( f -> f.documentReference() )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
@DocumentReferenceProjection in projections to custom types

要在 projection to an annotated custom type 内部实现 documentReference 投影,请使用 @DocumentReferenceProjection 注解:

To achieve a documentReference projection inside a projection to an annotated custom type, use the @DocumentReferenceProjection annotation:

示例 301. 在投影构造函数中返回对匹配文档的引用

. Example 301. Returning references to matched documents within a projection constructor

@ProjectionConstructor (1)
public record MyBookDocRefAndTitleProjection(
        @DocumentReferenceProjection (2)
        DocumentReference ref, (3)
        String title (4)
) {
}
List<MyBookDocRefAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookDocRefAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于编程映射(programmatic mapping),请使用 DocumentReferenceProjectionBinder.create()。

For programmatic mapping, use DocumentReferenceProjectionBinder.create().

示例 302. 在投影构造函数中对 documentReference 投影进行编程映射

. Example 302. Programmatic mapping of a documentReference projection within a projection constructor

TypeMappingStep myBookDocRefAndTitleProjection =
        mapping.type( MyBookDocRefAndTitleProjection.class );
myBookDocRefAndTitleProjection.mainConstructor()
        .projectionConstructor();
myBookDocRefAndTitleProjection.mainConstructor().parameter( 0 )
        .projection( DocumentReferenceProjectionBinder.create() );

15.4.4. entityReference: return references to matched entities

entityReference 投影将返回对匹配实体的引用,作为 EntityReference 对象。

The entityReference projection returns a reference to the matched entity as an EntityReference object.

Syntax
示例 303. 返回对匹配实体的引用

. Example 303. Returning references to matched entities

List<? extends EntityReference> hits = searchSession.search( Book.class )
        .select( f -> f.entityReference() )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
@EntityReferenceProjection in projections to custom types

要在 projection to an annotated custom type 内部实现 entityReference 投影,请使用 @EntityReferenceProjection 注解:

To achieve an entityReference projection inside a projection to an annotated custom type, use the @EntityReferenceProjection annotation:

示例 304. 在投影构造函数中返回对匹配实体的引用

. Example 304. Returning references to matched entities within a projection constructor

@ProjectionConstructor (1)
public record MyBookEntityRefAndTitleProjection(
        @EntityReferenceProjection (2)
        EntityReference ref, (3)
        String title (4)
) {
}
List<MyBookEntityRefAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookEntityRefAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于编程映射(programmatic mapping),请使用 EntityReferenceProjectionBinder.create()。

For programmatic mapping, use EntityReferenceProjectionBinder.create().

示例 305. 在投影构造函数中对 entityReference 投影进行编程映射

. Example 305. Programmatic mapping of an entityReference projection within a projection constructor

TypeMappingStep myBookEntityRefAndTitleProjection =
        mapping.type( MyBookEntityRefAndTitleProjection.class );
myBookEntityRefAndTitleProjection.mainConstructor()
        .projectionConstructor();
myBookEntityRefAndTitleProjection.mainConstructor().parameter( 0 )
        .projection( EntityReferenceProjectionBinder.create() );

15.4.5. id: return identifiers of matched entities

id 投影返回匹配实体的标识符。

The id projection returns the identifier of the matched entity.

Syntax
示例 306. 返回匹配实体的 ID,并提供标识符类型。

. Example 306. Returning ids to matched entities, providing the identity type.

List<Integer> hits = searchSession.search( Book.class )
        .select( f -> f.id( Integer.class ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

如果提供的标识符类型与目标实体类型的标识符类型不匹配,将会抛出异常。另请参见 Type of projected values。

If the provided identifier type does not match the type of identifiers for targeted entity types, an exception will be thrown. See also Type of projected values.

您可以省略“标识符类型”参数,但随后您将获得类型为 Object 的投影:

You can omit the "identifier type" argument, but then you will get projections of type Object:

示例 307. 返回匹配实体的 ID,不提供标识类型。

. Example 307. Returning ids to matched entities, without providing the identity type.

List<Object> hits = searchSession.search( Book.class )
        .select( f -> f.id() )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
@IdProjection in projections to custom types

要在 projection to an annotated custom type 内实现 id 投影,请使用 @IdProjection 注解:

To achieve an id projection inside a projection to an annotated custom type, use the @IdProjection annotation:

示例 308. 在投射构造函数中,返回匹配实体的 ID

. Example 308. Returning ids to matched entities within a projection constructor

@ProjectionConstructor (1)
public record MyBookIdAndTitleProjection(
        @IdProjection (2)
        Integer id, (3)
        String title) { (4)
}
List<MyBookIdAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookIdAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于 programmatic mapping,使用 IdProjectionBinder.create()

For programmatic mapping, use IdProjectionBinder.create().

示例 309. 在投射构造函数中,按编程方式映射 id 投射

. Example 309. Programmatic mapping of an id projection within a projection constructor

TypeMappingStep myBookIdAndTitleProjectionMapping =
        mapping.type( MyBookIdAndTitleProjection.class );
myBookIdAndTitleProjectionMapping.mainConstructor()
        .projectionConstructor();
myBookIdAndTitleProjectionMapping.mainConstructor().parameter( 0 )
        .projection( IdProjectionBinder.create() );

15.4.6. entity: return matched entities

entity 投影返回与匹配文档相对应的实体。

The entity projection returns the entity corresponding to the document that matched.

实体加载方式的确切信息取决于您的映射器和配置:

How the entity is loaded exactly depends on your mapper and configuration:

  1. With the Hibernate ORM integration, returned objects are managed entities loaded by Hibernate ORM from the database. You can use them as you would use any entity returned from traditional Hibernate ORM queries.

  2. With the Standalone POJO Mapper, entities are loaded from an external datastore if configured, or (failing that) are projected from the index if the entity type declares a projection constructor. If neither loading configuration nor projection constructor are found, the entity projection will simply fail.

Syntax

示例 310. 返回从数据库加载的匹配实体

. Example 310. Returning matched entities loaded from the database

List<Book> hits = searchSession.search( Book.class )
        .select( f -> f.entity() )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
Requesting a specific entity type

在一些(极少数)情况下,创建投影的代码可能必须使用 SearchProjectionFactory<?, ?>,即不携带关于已加载实体类型的任何信息的工厂。

In some (rare) cases, the code creating the projection may have to work with a SearchProjectionFactory<?, ?>, i.e. a factory that carries no information regarding the type of loaded entities.

在这些情况下,可以请求特定类型的实体:Hibernate Search 将在创建投影时检查请求的类型是否与加载的实体类型匹配。

In those cases, it’s possible to request a specific type of entities: Hibernate Search will check when the projection is created that the requested type matches the type of loaded entities.

示例 311. 为 entity 投射请求特定的实体类型

. Example 311. Requesting a specific entity type for the entity projection

f.entity( Book.class )
@EntityProjection in projections to custom types

要在 projection to an annotated custom type 内实现 entity 投影,请使用 @EntityProjection 注释:

To achieve an entity projection inside a projection to an annotated custom type, use the @EntityProjection annotation:

示例 312. 在投射构造函数中,返回从数据库加载的匹配实体

. Example 312. Returning matched entities loaded from the database within a projection constructor

@ProjectionConstructor (1)
public record MyBookEntityAndTitleProjection(
        @EntityProjection (2)
        Book entity, (3)
        String title (4)
) {
}
List<MyBookEntityAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookEntityAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于 programmatic mapping,使用 EntityProjectionBinder.create()

For programmatic mapping, use EntityProjectionBinder.create().

示例 313. 在投射构造函数中,按编程方式映射 entity 投射

. Example 313. Programmatic mapping of an entity projection within a projection constructor

TypeMappingStep myBookEntityAndTitleProjection =
        mapping.type( MyBookEntityAndTitleProjection.class );
myBookEntityAndTitleProjection.mainConstructor()
        .projectionConstructor();
myBookEntityAndTitleProjection.mainConstructor().parameter( 0 )
        .projection( EntityProjectionBinder.create() );

15.4.7. field: return field values from matched documents

field 投影返回匹配文档的给定字段的值。

The field projection returns the value of a given field for the matched document.

为了基于给定字段的值来使用投影,您需要将字段标记为 projectable 中的映射。

In order to use projections based on the value of a given field, you need to mark the field as projectable in the mapping.

对于 Elasticsearch backend ,这是可选项,其中所有字段默认为可投影的。

This is optional with the Elasticsearch backend, where all fields are projectable by default.

Syntax

默认情况下,field 投影返回每个文档的单个值,因此以下代码将足以处理单值字段:

By default, the field projection returns a single value per document, so the code below will be enough for a single-valued field:

示例 314. 返回匹配文档的字段值

. Example 314. Returning field values from matched documents

List<Genre> hits = searchSession.search( Book.class )
        .select( f -> f.field( "genre", Genre.class ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

可以省略“字段类型”参数,但随后将获得 Object 类型的投影:

You can omit the "field type" argument, but then you will get projections of type Object:

示例 315. 返回匹配文档的字段值,不指定字段类型

. Example 315. Returning field values from matched documents, without specifying the field type

List<Object> hits = searchSession.search( Book.class )
        .select( f -> f.field( "genre" ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
Multivalued fields

要返回多个值(允许在多值字段上进行投影),请使用 .multi()。这将把投影的返回类型更改为 List<T>,其中 T 是单值投影返回的值。

To return multiple values, and thus allow projection on multivalued fields, use .multi(). This will change the return type of the projection to List<T> where T is what the single-valued projection would have returned.

示例 316. 为多值字段,返回匹配文档的字段值

. Example 316. Returning field values from matched documents, for multivalued fields

List<List<String>> hits = searchSession.search( Book.class )
        .select( f -> f.field( "authors.lastName", String.class ).multi() )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
Skipping conversion

默认情况下,由 field 投影返回的值与对应于目标字段的实体属性拥有相同的类型。

By default, the values returned by the field projection have the same type as the entity property corresponding to the target field.

例如,如果实体属性是枚举类型,对应的字段可能是 String 类型;但 field 投影返回的值仍将是枚举类型。

For example, if an entity property is of an enum type, the corresponding field may be of type String; the values returned by the field projection will be of the enum type regardless.

这通常是您所希望的,但是如果您需要绕过转换并让未转换的值返回给您(上述示例中为类型 String),您可以这样做:

This should generally be what you want, but if you ever need to bypass conversion and have unconverted values returned to you instead (of type String in the example above), you can do it this way:

示例 317. 返回匹配文档的字段值,不转换字段值

. Example 317. Returning field values from matched documents, without converting the field value

List<String> hits = searchSession.search( Book.class )
        .select( f -> f.field( "genre", String.class, ValueConvert.NO ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

请参阅 Type of projected values 以获取更多信息。

See Type of projected values for more information.

@FieldProjection in projections to custom types

要在 projection to an annotated custom type 内实现 field 投影,您可以依赖于默认 inferred projection:如果构造器参数没有注释,那么它将推断为字段投影,该字段的名称与构造器参数相同(或是一个 object projection,请参阅 here for details)。

To achieve a field projection inside a projection to an annotated custom type, you can rely on the default inferred projection: when no annotations are present on a constructor parameter, it will be inferred to a field projection to the field with the same name as the constructor parameter (or an object projection, see here for details).

要强制执行字段投影,或进一步自定义字段投影(例如显式设置字段路径),请在构造函数参数上使用 @FieldProjection 注释:

To force a field projection, or to customize the field projection further (for example to set the field path explicitly), use the @FieldProjection annotation on the constructor parameter:

示例 318. 在投射构造函数中,返回匹配文档的字段值

. Example 318. Returning field values from matched documents within a projection constructor

@ProjectionConstructor (1)
public record MyBookTitleAndAuthorNamesProjection(
        @FieldProjection (2)
        String title, (3)
        @FieldProjection(path = "authors.lastName") (4)
        List<String> authorLastNames (5)
) {
}
List<MyBookTitleAndAuthorNamesProjection> hits = searchSession.search( Book.class )
        .select( MyBookTitleAndAuthorNamesProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

该注释公开以下特性:

The annotation exposes the following attributes:

path

投射字段的路径。

The path to the projected field.

如果未设置,它将从构造函数参数名称中推断。

If not set, it is inferred from the constructor parameter name.

convert

如何对从索引检索到的值进行 convert。

How to convert values retrieved from the index.
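
A minimal, hedged sketch (the record name MyBookRawGenreProjection is hypothetical, and it assumes the convert attribute accepts the same ValueConvert enum used in the DSL example above): a projection constructor that bypasses conversion and receives the raw String stored in the index for the genre field.

@ProjectionConstructor
public record MyBookRawGenreProjection(
        @FieldProjection(convert = ValueConvert.NO) // raw index value instead of the converted Genre enum
        String genre
) {
}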

对于 programmatic mapping,使用 FieldProjectionBinder.create()

For programmatic mapping, use FieldProjectionBinder.create().

示例 319. 在投射构造函数中,按编程方式映射 field 投射

. Example 319. Programmatic mapping of a field projection within a projection constructor

TypeMappingStep myBookTitleAndAuthorNamesProjectionMapping =
        mapping.type( MyBookTitleAndAuthorNamesProjection.class );
myBookTitleAndAuthorNamesProjectionMapping.mainConstructor()
        .projectionConstructor();
myBookTitleAndAuthorNamesProjectionMapping.mainConstructor().parameter( 0 )
        .projection( FieldProjectionBinder.create() );
myBookTitleAndAuthorNamesProjectionMapping.mainConstructor().parameter( 1 )
        .projection( FieldProjectionBinder.create( "authors.lastName" ) );

15.4.8. score: return the score of matched documents

score 投影返回匹配文档的 score

The score projection returns the score of the matched document.

Syntax

示例 320. 返回匹配文档的得分

. Example 320. Returning the score of matched documents

List<Float> hits = searchSession.search( Book.class )
        .select( f -> f.score() )
        .where( f -> f.match().field( "title" )
                .matching( "robot dawn" ) )
        .fetchHits( 20 );

只有在完全相同的查询执行过程中计算出两个得分,这两个得分才能可靠地进行比较。尝试比较两次单独查询执行的得分只会导致令人困惑的结果,特别是当谓词不同或索引内容发生足够的变化以显著改变某些术语的频率时。

Two scores can only be reliably compared if they were computed during the very same query execution. Trying to compare scores from two separate query executions will only lead to confusing results, in particular if the predicates are different or if the content of the index changed enough to alter the frequency of some terms significantly.

在相关方面,向最终用户公开评分通常不是一项简单的任务。具体而言,关于将评分显示为百分比有何问题,请参阅 this article 以获取一些见解。

On a related note, exposing scores to end users is generally not an easy task. See this article for some insight into what’s wrong with displaying the score as a percentage, specifically.

@ScoreProjection in projections to custom types

要在 projection to an annotated custom type 内实现 score 投影,请使用 @ScoreProjection 注释:

To achieve a score projection inside a projection to an annotated custom type, use the @ScoreProjection annotation:

示例 321. 在投影生成器中返回匹配文档的评分

. Example 321. Returning the score of matched documents within a projection constructor

@ProjectionConstructor (1)
public record MyBookScoreAndTitleProjection(
        @ScoreProjection (2)
        float score, (3)
        String title (4)
) {
}
List<MyBookScoreAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookScoreAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于 programmatic mapping,使用 ScoreProjectionBinder.create()

For programmatic mapping, use ScoreProjectionBinder.create().

示例 322. 在投影生成器内按程序映射一个 score 投影

. Example 322. Programmatic mapping of a score projection within a projection constructor

TypeMappingStep myBookScoreAndTitleProjection =
        mapping.type( MyBookScoreAndTitleProjection.class );
myBookScoreAndTitleProjection.mainConstructor()
        .projectionConstructor();
myBookScoreAndTitleProjection.mainConstructor().parameter( 0 )
        .projection( ScoreProjectionBinder.create() );

15.4.9. distance: return the distance to a point

distance 投影返回给定点与匹配文档中给定字段的地理点值之间的距离。

The distance projection returns the distance between a given point and the geo-point value of a given field for the matched document.

为了基于给定字段的值来使用投影,您需要将字段标记为 projectable 中的映射。

In order to use projections based on the value of a given field, you need to mark the field as projectable in the mapping.

对于 Elasticsearch backend ,这是可选项,其中所有字段默认为可投影的。

This is optional with the Elasticsearch backend, where all fields are projectable by default.

Syntax

默认情况下,distance 投影返回每个文档的单个值,因此以下代码将足以处理单值字段:

By default, the distance projection returns a single value per document, so the code below will be enough for a single-valued field:

示例 323. 返回到一个点的距离

. Example 323. Returning the distance to a point

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
SearchResult<Double> result = searchSession.search( Author.class )
        .select( f -> f.distance( "placeOfBirth", center ) )
        .where( f -> f.matchAll() )
        .fetch( 20 );

默认情况下,返回的距离单位为米,但可以选择其他单位:

The returned distance is in meters by default, but you can pick a different unit:

示例 324. 返回到一个点的距离(使用给定的距离单位)

. Example 324. Returning the distance to a point with a given distance unit

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
SearchResult<Double> result = searchSession.search( Author.class )
        .select( f -> f.distance( "placeOfBirth", center )
                .unit( DistanceUnit.KILOMETERS ) )
        .where( f -> f.matchAll() )
        .fetch( 20 );
Multivalued fields

要返回多个值,从而允许在多值字段上进行投影,请使用 .multi()。这会将投影的返回类型更改为 List<Double>

To return multiple values, and thus allow projection on multivalued fields, use .multi(). This will change the return type of the projection to List<Double>.

示例 325. 返回到一个点的距离,适用于多值字段

. Example 325. Returning the distance to a point, for multivalued fields

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
SearchResult<List<Double>> result = searchSession.search( Book.class )
        .select( f -> f.distance( "authors.placeOfBirth", center ).multi() )
        .where( f -> f.matchAll() )
        .fetch( 20 );
@DistanceProjection in projections to custom types

要在 projection to an annotated custom type 内实现 distance 投影,请在构造器参数上使用 @DistanceProjection 注释:

To achieve a distance projection inside a projection to an annotated custom type, use the @DistanceProjection annotation on the constructor parameter:

示例 326. 在投影生成器内返回到一个中心点的距离,该中心点被定义为匹配文档中字段值的参数

. Example 326. Returning distance from a center point defined as a parameter to the field value in matched documents within a projection constructor

@ProjectionConstructor (1)
public record MyAuthorPlaceProjection(
        @DistanceProjection( (2)
                fromParam = "point-param", (3)
                path = "placeOfBirth") (4)
        Double distance ) { (5)
}
List<MyAuthorPlaceProjection> hits = searchSession.search( Author.class )
        .select( MyAuthorPlaceProjection.class )(1)
        .where( f -> f.matchAll() )
        .param( "point-param", GeoPoint.of( latitude, longitude ) ) (2)
        .fetchHits( 20 ); (3)

该注释公开以下特性:

The annotation exposes the following attributes:

fromParam

表示一个点的 query parameter 的名称;将计算字段值到该点的距离。

The name of a query parameter that will represent a point, from which the distance to the field value will be calculated.

这是必需属性。

This is a required attribute.

path

投射字段的路径。

The path to the projected field.

如果未设置,它将从构造函数参数名称中推断。

If not set, it is inferred from the constructor parameter name.

unit

已计算距离的单位(默认为米)。

The unit of the computed distance (default is meters).
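
As a hedged sketch (the record name MyAuthorPlaceKmProjection is hypothetical), the unit attribute can be combined with fromParam and path on the same annotation to return the distance in kilometers instead of meters:

@ProjectionConstructor
public record MyAuthorPlaceKmProjection(
        @DistanceProjection(
                fromParam = "point-param",
                path = "placeOfBirth",
                unit = DistanceUnit.KILOMETERS) // distance returned in kilometers
        Double distance ) {
}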

对于 programmatic mapping,使用 DistanceProjectionBinder.create(..)

For programmatic mapping, use DistanceProjectionBinder.create(..).

示例 327. 在投影生成器内按程序映射一个 distance 投影

. Example 327. Programmatic mapping of a distance projection within a projection constructor

TypeMappingStep myAuthorPlaceProjection =
        mapping.type( MyAuthorPlaceProjection.class );
myAuthorPlaceProjection.mainConstructor()
        .projectionConstructor();
myAuthorPlaceProjection.mainConstructor().parameter( 0 )
        .projection( DistanceProjectionBinder.create( "placeOfBirth", "point-param" ) );

15.4.10. composite: combine projections

Basics

composite 投影应用多个投影并组合其结果,结果可以是一个 List<?>,也可以是使用自定义转换器生成的单个对象。

The composite projection applies multiple projections and combines their results, either as a List<?> or as a single object generated using a custom transformer.

为了保留类型安全性,你可以提供一个自定义转换器。根据内部投影的数量,转换器可以是 Function、BiFunction 或 org.hibernate.search.util.common.function.TriFunction。它会接收内部投影返回的值,并返回一个组合这些值的对象。

To preserve type-safety, you can provide a custom transformer. The transformer can be a Function, a BiFunction, or a org.hibernate.search.util.common.function.TriFunction, depending on the number of inner projections. It will receive values returned by inner projections and return an object combining these values.

示例 328. 利用 .composite().from(…​).as(…​) 从多个投影值创建返回自定义对象

. Example 328. Returning custom objects created from multiple projected values with .composite().from(…​).as(…​)

List<MyPair<String, Genre>> hits = searchSession.search( Book.class )
        .select( f -> f.composite() (1)
                .from( f.field( "title", String.class ), (2)
                        f.field( "genre", Genre.class ) ) (3)
                .as( MyPair::new ) )(4)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (5)
Composing more than 3 inner projections

对于复杂投影,请考虑 projecting to a custom (annotated) type

For complex projections, consider projecting to a custom (annotated) type.

如果将超过 3 个投影作为参数传递给 from(…​),则转换函数必须接受 List<?> 作为参数,并且需要使用 asList(…​) 而不是 as(…​) 进行设置:

If you pass more than 3 projections as arguments to from(…​), then the transform function will have to take a List<?> as an argument, and will be set using asList(…​) instead of as(…​):

示例 329. 利用 .composite().from(…​).asList(…​) 从多个投影值创建返回自定义对象

. Example 329. Returning custom objects created from multiple projected values with .composite().from(…​).asList(…​)

List<MyTuple4<String, Genre, Integer, String>> hits = searchSession.search( Book.class )
        .select( f -> f.composite() (1)
                .from( f.field( "title", String.class ), (2)
                        f.field( "genre", Genre.class ), (3)
                        f.field( "pageCount", Integer.class ), (4)
                        f.field( "description", String.class ) ) (5)
                .asList( list -> (6)
                    new MyTuple4<>( (String) list.get( 0 ), (Genre) list.get( 1 ),
                            (Integer) list.get( 2 ), (String) list.get( 3 ) ) ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (7)
Projecting to a List<?> or Object[]

如果你不介意将内部投影的结果接收为 List<?>,你可以通过调用 asList() 而无需转换器:

If you don’t mind receiving the result of inner projections as a List<?>, you can do without the transformer by calling asList():

示例 330. 利用 .composite().add(…​).asList() 返回 List 投影值

. Example 330. Returning a List of projected values with .composite().add(…​).asList()

List<List<?>> hits = searchSession.search( Book.class )
        .select( f -> f.composite() (1)
                .from( f.field( "title", String.class ), (2)
                        f.field( "genre", Genre.class ) ) (3)
                .asList() ) (4)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (5)

同样,要作为数组获取内部投影的结果 (Object[]),您可以通过调用 asArray() 来不使用转换器:

Similarly, to get the result of inner projections as an array (Object[]), you can do without the transformer by calling asArray():

示例 331. 利用 .composite(…​).add(…​).asArray() 返回一个投影值数组

. Example 331. Returning an array of projected values with .composite(…​).add(…​).asArray()

List<Object[]> hits = searchSession.search( Book.class )
        .select( f -> f.composite() (1)
                .from( f.field( "title", String.class ), (2)
                        f.field( "genre", Genre.class ) ) (3)
                .asArray() ) (4)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (5)

或者,要以 List<?> 的形式获取结果,可以使用 .composite(…​) 的较短变体,直接将投影作为参数传递:

Alternatively, to get the result as a List<?>, you can use the shorter variant of .composite(…​) that directly takes projections as arguments:

示例 332. 利用 .composite(…​) 返回 List 投影值

. Example 332. Returning a List of projected values with .composite(…​)

List<List<?>> hits = searchSession.search( Book.class )
        .select( f -> f.composite( (1)
                f.field( "title", String.class ), (2)
                f.field( "genre", Genre.class ) (3)
        ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (4)
Projecting to a custom (annotated) type

对于更为复杂的复合投影,可以定义一个自定义的(带注释的)记录或类,并让 Hibernate Search 从自定义类型的构造函数参数推断相应的内部投影。这类似于 projection to a custom (annotated) type through .select(…​)

For more complex composite projections, it is possible to define a custom (annotated) record or class and have Hibernate Search infer the corresponding inner projections from the custom type’s constructor parameters. This is similar to the projection to a custom (annotated) type through .select(…​).

在注释自定义投影类型时,需要注意一些约束:

There are a few constraints to keep in mind when annotating a custom projection type:

如果自定义投影类型不在与实体类型相同的 JAR 中,则 Hibernate Search 将 require additional configuration

The custom projection type must be in the same JAR as entity types, or Hibernate Search will require additional configuration.

在对值字段或对象字段进行投影时,默认情况下,投影字段的路径从构造函数参数名称中推断,但 inference will fail if constructor parameter names are not included in the Java bytecode 。 或者,可以通过 @FieldProjection(path = …​) / @ObjectProjection(path = …​) 显式提供路径,在这种情况下,Hibernate Search 不会依赖于构造函数参数名称。

When projecting on value fields or object fields, the path to the projected field is inferred from the constructor parameter name by default, but inference will fail if constructor parameter names are not included in the Java bytecode. Alternatively the path can be provided explicitly through @FieldProjection(path = …​)/@ObjectProjection(path = …​), in which case Hibernate Search won’t rely on constructor parameter names.

在对值字段进行投影时, field 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的值字段配置为 projectable

When projecting on value fields, the constraints of the field projection still apply. In particular, with the Lucene backend, value fields involved in the projection must be configured as projectable.

在对对象字段进行投影时, object 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的多值对象字段配置为 nested

When projecting on object fields, the constraints of the object projection still apply. In particular, with the Lucene backend, multi-valued object fields involved in the projection must be configured as nested.

示例 333. 使用自定义记录类型从索引投影数据

. Example 333. Using a custom record type to project data from the index

@ProjectionConstructor (1)
public record MyBookProjection(
        @IdProjection Integer id, (2)
        String title, (3)
        List<MyBookProjection.Author> authors) { (4)
    @ProjectionConstructor (5)
    public record Author(String firstName, String lastName) {
    }
}
List<MyBookProjection> hits = searchSession.search( Book.class )
        .select( f -> f.composite() (1)
                .as( MyBookProjection.class ) )(2)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (3)

自定义的非记录类也可以用 @ProjectionConstructor 添加注解,如果您由于某些原因无法使用记录(例如您仍在使用 Java 13 或更低版本),这可能会很有用。

Custom, non-record classes can also be annotated with @ProjectionConstructor, which can be useful if you cannot use records for some reason (for example because you’re still using Java 13 or below).

@CompositeProjection in projections to custom types

要在 projection to an annotated custom type 内部实现 composite 投影,请在构造函数参数上使用 @CompositeProjection 注解:

To achieve a composite projection inside a projection to an annotated custom type, use the @CompositeProjection annotation on the constructor parameter:

示例 334. 在投影生成器内从多个投影创建返回自定义对象

. Example 334. Returning custom objects created from multiple projections within a projection constructor

@ProjectionConstructor (1)
public record MyBookMiscInfoAndTitleProjection(
        @CompositeProjection (2)
        MiscInfo miscInfo, (3)
        String title (4)
) {

    @ProjectionConstructor (3)
    public record MiscInfo(
            Genre genre,
            Integer pageCount
    ) {
    }
}
List<MyBookMiscInfoAndTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookMiscInfoAndTitleProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

对于 programmatic mapping,使用 CompositeProjectionBinder.create()

For programmatic mapping, use CompositeProjectionBinder.create().

示例 335. 在投影生成器内按程序映射一个 composite 投影

. Example 335. Programmatic mapping of a composite projection within a projection constructor

TypeMappingStep myBookMiscInfoAndTitleProjection =
        mapping.type( MyBookMiscInfoAndTitleProjection.class );
myBookMiscInfoAndTitleProjection.mainConstructor()
        .projectionConstructor();
myBookMiscInfoAndTitleProjection.mainConstructor().parameter( 0 )
        .projection( CompositeProjectionBinder.create() );
TypeMappingStep miscInfoProjection =
        mapping.type( MyBookMiscInfoAndTitleProjection.MiscInfo.class );
miscInfoProjection.mainConstructor().projectionConstructor();
Deprecated variants

本节中详细介绍的功能已 deprecated:应避免使用它们,转而使用未弃用的替代方法。

Features detailed in this section are deprecated: they should be avoided in favor of non-deprecated alternatives.

通常 compatibility policy 适用,这意味着这些功能预计至少在 Hibernate Search 的下一个主要版本之前仍然可用。在此之后,它们可能会以向后不兼容的方式进行更改,甚至被移除。

The usual compatibility policy applies, meaning the features are expected to remain available at least until the next major version of Hibernate Search. Beyond that, they may be altered in a backward-incompatible way — or even removed.

不建议使用已弃用的功能。

Usage of deprecated features is not recommended.

可以在 SearchProjectionFactory 上使用一些同时接受函数和投影列表的 .composite(…​) 方法,但它们已被弃用。

A few .composite(…​) methods accepting both a function and a list of projections are available on SearchProjectionFactory, but they are deprecated.

示例 336. composite 的已弃用变体

. Example 336. Deprecated variant of composite

List<MyPair<String, Genre>> hits = searchSession.search( Book.class )
        .select( f -> f.composite( (1)
                MyPair::new, (2)
                f.field( "title", String.class ), (3)
                f.field( "genre", Genre.class ) (4)
        ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (5)

15.4.11. object: return one value per object in an object field

object 投影对给定对象字段中的每个对象产生一个投影值,该值通过应用多个内部投影并组合其结果(作为 List<?> 或作为使用自定义转换器生成的单个对象)生成。

The object projection yields one projected value for each object in a given object field, the value being generated by applying multiple inner projections and combining their results either as a List<?> or as a single object generated using a custom transformer.

object 投影可能看起来与 composite projection 非常相似,并且它通过 Search DSL 的定义方式也确实类似。

The object projection may seem very similar to the composite projection, and its definition via the Search DSL is indeed similar.

但是,有两个主要区别:

However, there are two key differences:

当在单值对象字段上投影时,如果对象在索引建立时为空,则 object 投影将产生 null

The object projection will yield null when projecting on a single-valued object field if the object was null when indexing.

当在多重值对象字段上投影时,如果在索引建立时有多个对象,则 object 投影将产生多个值。

The object projection will yield multiple values when projecting on a multivalued object field if there were multiple objects when indexing.

对于 Lucene backend ,对象投影有一些限制:

With the Lucene backend, the object projection has a few limitations:

它仅可用于单值对象字段(无论其 structure 如何),或具有 NESTED structure 的多值对象字段。

It is only available for single-valued object fields regardless of their structure, or multi-valued object fields with a NESTED structure.

它不会对多值对象字段产生 null 对象。Lucene 后端不编制 null 对象的索引,因此无法在搜索时找到它们。

It will never yield null objects for multi-valued object fields. The Lucene backend does not index null objects, and thus cannot find them when searching.

这些限制不适用于 Elasticsearch backend

These limitations do not apply to the Elasticsearch backend.

Syntax

为了保留类型安全性,你可以提供一个自定义转换器。根据内部投影的数量,转换器可以是 Function、BiFunction 或 org.hibernate.search.util.common.function.TriFunction。它会接收内部投影返回的值,并返回一个组合这些值的对象。

To preserve type-safety, you can provide a custom transformer. The transformer can be a Function, a BiFunction, or a org.hibernate.search.util.common.function.TriFunction, depending on the number of inner projections. It will receive values returned by inner projections and return an object combining these values.

示例 337. 返回通过具有 .object(…​).from(…​).as(…​) 的对象字段创建的自定义对象

. Example 337. Returning custom objects created from an object field with .object(…​).from(…​).as(…​)

List<List<MyAuthorName>> hits = searchSession.search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .from( f.field( "authors.firstName", String.class ), (2)
                        f.field( "authors.lastName", String.class ) ) (3)
                .as( MyAuthorName::new ) (4)
                .multi() ) (5)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (6)
Composing more than 3 inner projections

对于复杂投影,请考虑 projecting to a custom (annotated) type

For complex projections, consider projecting to a custom (annotated) type.

如果将超过 3 个投影作为参数传递,则转换函数必须接受 List<?> 作为参数,并且需要使用 asList(…​) 而不是 as(…​) 进行设置:

If you pass more than 3 projections as arguments, then the transform function will have to take a List<?> as an argument, and will be set using asList(…​) instead of as(…​):

示例 338. 返回通过具有 .object(…​).from(…​).asList(…​) 的对象字段创建的自定义对象

. Example 338. Returning custom objects created from an object field with .object(…​).from(…​).asList(…​)

GeoPoint center = GeoPoint.of( 53.970000, 32.150000 );
List<List<MyAuthorNameAndBirthDateAndPlaceOfBirthDistance>> hits = searchSession
        .search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .from( f.field( "authors.firstName", String.class ), (2)
                        f.field( "authors.lastName", String.class ), (3)
                        f.field( "authors.birthDate", LocalDate.class ), (4)
                        f.distance( "authors.placeOfBirth", center ) (5)
                                .unit( DistanceUnit.KILOMETERS ) )
                .asList( list -> (6)
                        new MyAuthorNameAndBirthDateAndPlaceOfBirthDistance(
                                (String) list.get( 0 ), (String) list.get( 1 ),
                                (LocalDate) list.get( 2 ), (Double) list.get( 3 ) ) )
                .multi() ) (7)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (8)

同样,可以使用 asArray(…​) 让转换函数接收 Object[] 参数而不是 List<?>。

Similarly, asArray(…​) can be used to have the transformer receive an Object[] argument instead of a List<?>.

示例 339. 返回通过具有 .object(…​).from(…​).asArray(…​) 的对象字段创建的自定义对象

. Example 339. Returning custom objects created from an object field with .object(…​).from(…​).asArray(…​)

GeoPoint center = GeoPoint.of( 53.970000, 32.150000 );
List<List<MyAuthorNameAndBirthDateAndPlaceOfBirthDistance>> hits = searchSession
        .search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .from( f.field( "authors.firstName", String.class ), (2)
                        f.field( "authors.lastName", String.class ), (3)
                        f.field( "authors.birthDate", LocalDate.class ), (4)
                        f.distance( "authors.placeOfBirth", center ) (5)
                                .unit( DistanceUnit.KILOMETERS ) )
                .asArray( array -> (6)
                        new MyAuthorNameAndBirthDateAndPlaceOfBirthDistance(
                                (String) array[0], (String) array[1],
                                (LocalDate) array[2], (Double) array[3] ) )
                .multi() ) (7)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (8)
Projecting to a List<?> or Object[]

如果你不介意将内部投影的结果接收为 List<?>,你可以通过调用 asList() 而无需转换器:

If you don’t mind receiving the result of inner projections as a List<?>, you can do without the transformer by calling asList():

示例 340. 返回使用 .object(…​).add(…​).asList() 的投影值 List

. Example 340. Returning a List of projected values with .object(…​).add(…​).asList()

List<List<List<?>>> hits = searchSession.search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .from( f.field( "authors.firstName", String.class ), (2)
                        f.field( "authors.lastName", String.class ) ) (3)
                .asList() (4)
                .multi() ) (5)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (6)

同样,要作为数组获取内部投影的结果 (Object[]),您可以通过调用 asArray() 来不使用转换器:

Similarly, to get the result of inner projections as an array (Object[]), you can do without the transformer by calling asArray():

示例 341. 返回使用 .object(…​).add(…​).asArray() 的投影值的数组

. Example 341. Returning an array of projected values with .object(…​).add(…​).asArray()

List<List<Object[]>> hits = searchSession.search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .from( f.field( "authors.firstName", String.class ), (2)
                        f.field( "authors.lastName", String.class ) ) (3)
                .asArray() (4)
                .multi() ) (5)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (6)
Projecting to a custom (annotated) type

对于更复杂的对象投影,可以定义自定义(带注释的)记录或类,并让 Hibernate Search 从自定义类型的构造函数参数中推断出相应的内部投影。这类似于 projection to a custom (annotated) type through .select(…​)

For more complex object projections, it is possible to define a custom (annotated) record or class and have Hibernate Search infer the corresponding inner projections from the custom type’s constructor parameters. This is similar to the projection to a custom (annotated) type through .select(…​).

在注释自定义投影类型时,需要注意一些约束:

There are a few constraints to keep in mind when annotating a custom projection type:

如果自定义投影类型不在与实体类型相同的 JAR 中,则 Hibernate Search 将 require additional configuration

The custom projection type must be in the same JAR as entity types, or Hibernate Search will require additional configuration.

在对值字段或对象字段进行投影时,默认情况下,投影字段的路径从构造函数参数名称中推断,但 inference will fail if constructor parameter names are not included in the Java bytecode 。 或者,可以通过 @FieldProjection(path = …​) / @ObjectProjection(path = …​) 显式提供路径,在这种情况下,Hibernate Search 不会依赖于构造函数参数名称。

When projecting on value fields or object fields, the path to the projected field is inferred from the constructor parameter name by default, but inference will fail if constructor parameter names are not included in the Java bytecode. Alternatively the path can be provided explicitly through @FieldProjection(path = …​)/@ObjectProjection(path = …​), in which case Hibernate Search won’t rely on constructor parameter names.

在对值字段进行投影时, field 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的值字段配置为 projectable

When projecting on value fields, the constraints of the field projection still apply. In particular, with the Lucene backend, value fields involved in the projection must be configured as projectable.

在对对象字段进行投影时, object 投影的约束仍然适用。 特别是,对于 Lucene backend ,必须将涉及投影的多值对象字段配置为 nested

When projecting on object fields, the constraints of the object projection still apply. In particular, with the Lucene backend, multi-valued object fields involved in the projection must be configured as nested.

示例 342. 使用自定义记录类型投影通过对象字段创建的数据

. Example 342. Using a custom record type to project data created from an object field

@ProjectionConstructor (1)
public record MyAuthorProjection(String firstName, String lastName) { (2)
}
List<List<MyAuthorProjection>> hits = searchSession.search( Book.class )
        .select( f -> f.object( "authors" ) (1)
                .as( MyAuthorProjection.class ) (2)
                .multi() ) (3)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (4)

自定义的非记录类也可以用 @ProjectionConstructor 添加注解,如果您由于某些原因无法使用记录(例如您仍在使用 Java 13 或更低版本),这可能会很有用。

Custom, non-record classes can also be annotated with @ProjectionConstructor, which can be useful if you cannot use records for some reason (for example because you’re still using Java 13 or below).

@ObjectProjection in projections to custom types

要在 projection to an annotated custom type 内部实现 object 投影,可以依赖默认值 inferred projection:当构造函数参数上不存在注释时,它将被推断为对与构造函数参数同名字段的对象投影(或 field projection,请参见 here for details)。

To achieve an object projection inside a projection to an annotated custom type, you can rely on the default inferred projection: when no annotations are present on a constructor parameter, it will be inferred to an object projection to the field with the same name as the constructor parameter (or a field projection, see here for details).

要强制使用对象投影,或进一步自定义对象投影(例如明确设置字段路径),请对构造函数参数使用 @ObjectProjection 注释:

To force an object projection, or to customize the object projection further (for example to set the field path explicitly), use the @ObjectProjection annotation on the constructor parameter:

示例 343. 在投影构造函数中返回通过对象字段创建的自定义对象

. Example 343. Returning custom objects created from an object field within a projection constructor

@ProjectionConstructor (1)
public record MyBookTitleAndAuthorsProjection(
        @ObjectProjection (2)
        List<MyAuthorProjection> authors, (3)
        @ObjectProjection(path = "mainAuthor") (4)
        MyAuthorProjection theMainAuthor, (5)
        String title (6)
) {
}
List<MyBookTitleAndAuthorsProjection> hits = searchSession.search( Book.class )
        .select( MyBookTitleAndAuthorsProjection.class )(1)
        .where( f -> f.matchAll() )
        .fetchHits( 20 ); (2)

该注释公开以下特性:

The annotation exposes the following attributes:

path

The path to the projected field.

如果未设置,它将从构造函数参数名称中推断。

If not set, it is inferred from the constructor parameter name.

includePaths

excludePaths

includeDepth

这些筛选属性在下文 "@ObjectProjection filters to exclude nested projections and break @ObjectProjection cycles" 一节中详细说明。

These filtering attributes are detailed in the section "@ObjectProjection filters to exclude nested projections and break @ObjectProjection cycles" below.

对于 programmatic mapping,使用 ObjectProjectionBinder.create()

For programmatic mapping, use ObjectProjectionBinder.create().

示例 344. object 投影在投影构造函数中的编程映射

. Example 344. Programmatic mapping of an object projection within a projection constructor

TypeMappingStep myBookTitleAndAuthorsProjection =
        mapping.type( MyBookTitleAndAuthorsProjection.class );
myBookTitleAndAuthorsProjection.mainConstructor()
        .projectionConstructor();
myBookTitleAndAuthorsProjection.mainConstructor().parameter( 0 )
        .projection( ObjectProjectionBinder.create() );
myBookTitleAndAuthorsProjection.mainConstructor().parameter( 1 )
        .projection( ObjectProjectionBinder.create( "mainAuthor" ) );
@ObjectProjection filters to exclude nested projections and break @ObjectProjection cycles

默认情况下,@ObjectProjection 和 inferred object projections 将递归地包含被投影类型的投影构造函数中遇到的每个投影。

By default, @ObjectProjection and inferred object projections will include every projection encountered in the projection constructor of the projected type, recursively.

对于简单的用例,这将非常适用,但对于更复杂的模型可能会导致问题:

This will work just fine for simpler use cases, but may lead to problems for more complex models:

  1. If the projection constructor of the projected type declares many nested projections, only some of which are actually useful to the "surrounding" type, the extra projections will decrease search performance needlessly.

  2. If there is a cycle of @ObjectProjection (e.g. A includes a nested object projection b of type B, which includes a nested projection a of type A) the root projection type will end up with an infinite amount of fields (a.b.someField, a.b.a.b.someField, a.b.a.b.a.b.someField, …​), which Hibernate Search will detect and reject with an exception.

为解决这些问题,可以应用过滤器,以便仅包括实际上有用的那些嵌套投影。在运行时,已排除字段上的投影的值将被设置为 null,或对于多值投影,则为一个空列表。

To address these problems, it is possible to apply filters to only include those nested projections that are actually useful. Projections on excluded fields, at runtime, will have their value set to null, or an empty list for multivalued projections.

@ObjectProjection 中可用过滤属性有:

Available filtering attributes on @ObjectProjection are:

includePaths

要包含的嵌套索引字段路径,即将从索引中实际检索相应嵌套投影的字段路径。

The paths of nested index fields to be included, i.e. for which the corresponding nested projections will actually be retrieved from the index.

提供的路径必须相对于投影对象字段,即它们不能包括其 path

Provided paths must be relative to the projected object field, i.e. they must not include its path.

这优先于 includeDepth(见下文)。

This takes precedence over includeDepth (see below).

不能与 excludePaths 在同一 @ObjectProjection 中结合使用。

Cannot be used in combination with excludePaths in the same @ObjectProjection.

excludePaths

不得嵌入的索引嵌入式元素的索引字段的路径。

The paths of index fields from the indexed-embedded element that must not be embedded.

提供的路径必须相对于投影对象字段,即它们不能包括其 path

Provided paths must be relative to the projected object field, i.e. they must not include its path.

这优先于 includeDepth(见下文)。

This takes precedence over includeDepth (see below).

不能与 includePaths 在同一 @ObjectProjection 中结合使用。

Cannot be used in combination with includePaths in the same @ObjectProjection.

includeDepth

默认情况下其所有字段都会被包含的索引嵌入(indexed-embedded)级别数。

The number of levels of indexed-embedded that will have all their fields included by default.

includeDepth 是将包含所有嵌套字段/对象投影并实际上从索引中检索的对象投影的级别数。

includeDepth is the number of levels of object projections that will have all their nested field/object projections included by default and actually be retrieved from the index.

在此深度以内(含该深度),即使没有通过 includePaths 明确包括这些字段,对象投影也将连同其嵌套(非对象)字段投影一起包括在内,除非通过 excludePaths 明确排除这些字段:

Up to and including that depth, object projections will be included along with their nested (non-object) field projections, even if these fields are not included explicitly through includePaths, unless these fields are excluded explicitly through excludePaths:

includeDepth=0 意味着此对象投影的字段未包含,嵌套索引嵌入元素的任何字段也不包含,除非明确通过 includePaths 包含这些字段。

includeDepth=0 means fields of this object projection are not included, nor is any field of nested indexed-embedded elements, unless these fields are included explicitly through includePaths.

includeDepth=1 意味着此对象投影的字段已包含,除非明确通过 excludePaths 排除这些字段,但不是嵌套对象投影的字段(此 @ObjectProjection 中的 @ObjectProjection ),除非明确通过 includePaths 包含这些字段。

includeDepth=1 means fields of this object projection are included, unless these fields are excluded explicitly through excludePaths, but not fields of nested object projections (@ObjectProjection within this @ObjectProjection), unless these fields are included explicitly through includePaths.

includeDepth=2 表示包含此对象投影的字段和立即嵌套的对象投影的字段(此 @ObjectProjection 中的 @ObjectProjection ),除非通过 excludePaths 显式排除这些字段,但不包含超出此范围的嵌套对象投影的字段(此 @ObjectProjection 中的 @ObjectProjection 中的 @ObjectProjection ),除非通过 includePaths 显式将这些字段包括在内。

includeDepth=2 means fields of this object projection and fields of immediately nested object projections (@ObjectProjection within this @ObjectProjection) are included, unless these fields are explicitly excluded through excludePaths, but not fields of nested object projections beyond that (@ObjectProjection within an @ObjectProjection within this @ObjectProjection), unless these fields are included explicitly through includePaths.

依此类推。

And so on.

默认值取决于 includePaths 属性的值:

The default value depends on the value of the includePaths attribute:

如果 includePaths 为空,includeDepth 默认为无穷大(包括每个级别的所有字段)。

if includePaths is empty, includeDepth defaults to infinity (include all fields at every level).

如果 includePaths 不为空, includeDepth 默认为 0 (仅包含显式包含的字段)。

if includePaths is not empty, includeDepth defaults to 0 (only include fields included explicitly).

Mixing includePaths and excludePaths at different nesting levels

一般来说,可以在嵌套 @ObjectProjection 的不同级别使用 includePaths 和 excludePaths。这样做时请记住,每个级别的筛选器只能引用可达路径,即筛选器不能引用已被嵌套 @ObjectProjection(隐式或显式)排除的路径。

In general, it is possible to use includePaths and excludePaths at different levels of nested @ObjectProjection. When doing so, keep in mind that the filter at each level can only reference reachable paths, i.e. a filter cannot reference a path that was excluded by a nested @ObjectProjection (implicitly or explicitly).

下面有三个示例:一个仅利用 includePaths,一个利用 excludePaths,一个利用 includePathsincludeDepth

Below are three examples: one leveraging includePaths only, one leveraging excludePaths, and one leveraging includePaths and includeDepth.

所有三个示例都基于以下映射实体,该实体依赖于 @IndexedEmbedded 及其提供的 very similar filters

All three examples are based on the following mapped entity, which relies on @IndexedEmbedded and the very similar filters it provides:

@Entity
@Indexed
public class Human {

    @Id
    private Integer id;

    @FullTextField(analyzer = "name", projectable = Projectable.YES)
    private String name;

    @FullTextField(analyzer = "name", projectable = Projectable.YES)
    private String nickname;

    @ManyToMany
    @IndexedEmbedded(includeDepth = 5, structure = ObjectStructure.NESTED)
    private List<Human> parents = new ArrayList<>();

    @ManyToMany(mappedBy = "parents")
    private List<Human> children = new ArrayList<>();

    public Human() {
    }

    // Getters and setters
    // ...

}
示例 345. 使用 includePaths 筛选嵌套投影

. Example 345. Filtering nested projections with includePaths

@ProjectionConstructor
public record HumanProjection(
        @FieldProjection
        String name,
        @FieldProjection
        String nickname,
        @ObjectProjection(includePaths = { "name", "nickname", "parents.name" })
        List<HumanProjection> parents
) {
}
示例 346. 使用 excludePaths 筛选嵌套投影

. Example 346. Filtering nested projections with excludePaths

@ProjectionConstructor
public record HumanProjection(
        @FieldProjection
        String name,
        @FieldProjection
        String nickname,
        @ObjectProjection(excludePaths = { "parents.nickname", "parents.parents" })
        List<HumanProjection> parents
) {
}
示例 347. 使用 includePathsincludeDepth 筛选嵌套投影

. Example 347. Filtering nested projections with includePaths and includeDepth

@ProjectionConstructor
public record HumanProjection(
        @FieldProjection
        String name,
        @FieldProjection
        String nickname,
        @ObjectProjection(includeDepth = 2, includePaths = { "parents.parents.name" })
        List<HumanProjection> parents
) {
}

15.4.12. constant: return a provided constant

constant 投影针对每个文档返回相同的值,该值在定义投影时提供。

The constant projection returns the same value for every single document, the value being provided when defining the projection.

这仅在某些边缘情况下有用,用户希望在每个单次匹配的表示中包括一些更广泛的语境。在这种情况下, constant 值很可能将与 composite projectionobject projection 一起使用。

This is only useful in some edge cases where one wants to include some broader context in the representation of every single hit. In this case, the constant value will most likely be used together with a composite projection or an object projection.

Syntax
示例 348. 为每个匹配文档返回一个常量值

. Example 348. Returning a constant value for every single matched document

Instant searchRequestTimestamp = Instant.now();
List<MyPair<Integer, Instant>> hits = searchSession.search( Book.class )
        .select( f -> f.composite()
                .from( f.id( Integer.class ), f.constant( searchRequestTimestamp ) )
                .as( MyPair::new ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
In projections to custom types

没有内置注释可以在 projection to an annotated custom type 内使用 constant 投影。

There is no built-in annotation to use the constant projection inside a projection to an annotated custom type.

您可以在需要时 create your own annotation,由 custom projection binder 备份。

You can create your own annotation if you need one, backed by a custom projection binder.

15.4.13. highlight: return highlighted field values from matched documents

highlight 投影从导致查询匹配的匹配文档的全文本字段中返回片段。

The highlight projection returns fragments from full-text fields of matched documents that caused a query match.

为了使用给定字段的高亮投影,需要在映射中提供 highlighters supported by the field 的列表。

In order to use highlight projections of a given field, you need to provide the list of highlighters supported by the field in the mapping.

在某些情况下, highlightable 默认值可能已经启用了对高亮的支持。有关更多详细信息,请参阅 how the DEFAULT highlightable value behaves

The highlightable default may already enable the support of highlighting in some cases. For more details see how the DEFAULT highlightable value behaves.

Syntax

默认情况下,highlight 投影返回每个高亮字段的字符串值列表,无论该字段是单值还是多值,因为字段值中可以有多个高亮术语,并且根据高亮显示配置,这会导致其中包含多个带高亮术语的文本片段:

The highlight projection, by default, returns a list of string values per highlighted field, no matter if the field is a single or a multivalued one, since there can be multiple highlighted terms in a field value and depending on the highlighter configuration that can result in multiple text fragments with highlighted terms in them:

示例 349. 返回匹配文档的高亮结果

. Example 349. Returning highlights for matched documents

List<List<String>> hits = searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) )
        .where( f -> f.match().field( "title" ).matching( "detective" ) )
        .fetchHits( 20 );
[
    ["The Automatic <em>Detective</em>"], (1)
    ["Dirk Gently's Holistic <em>Detective</em> Agency"], (2)
    [
      "The Paris <em>Detective</em>",
      "<em>Detective</em> Luc Moncrief series"
    ], (3)
]

在某些情况下,当我们知道只会返回一个高亮片段时,强制高亮投影生成单个 String 而不是 List<String> 可能会很有帮助。仅当 number of fragments 明确设置为 1 时才有可能。

In some scenarios, when we know that there is going to be only one highlighted fragment returned, it may be helpful to force the highlight projection to produce a single String rather than a List<String>. This is only possible when the number of fragments is explicitly set to 1.

示例 350. 强制使用单值高亮投影

. Example 350. Forcing a single-valued highlight projection

List<String> hits = searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ).single() ) (1)
        .where( f -> f.match().field( "title" ).matching( "detective" ) )
        .highlighter( f -> f.unified()
                .numberOfFragments( 1 ) ) (2)
        .fetchHits( 20 );
Multivalued fields

多值字段的每个值都将被高亮显示。请参阅 how highlighter can be configured 以调整返回的结果的行为和结构。

Each value of a multivalued field is highlighted. See how highlighter can be configured to adjust the behaviour and structure of the returned results.

目前,高亮 nested object 内的字段是 not supported ,尝试这样做将导致异常。高亮 flattened object 内的字段将正常工作。

At the moment, highlighting a field within a nested object is not supported and attempting to do so will lead to an exception. Highlighting a field within a flattened object will work correctly.

不支持在 object projection 内放置高亮投影。

Placing a highlight projection inside an object projection is not supported.

示例 351. 返回匹配文档中扁平化对象的高亮结果

. Example 351. Returning highlights for flattened objects from matched documents

List<List<String>> hits = searchSession.search( Book.class )
        .select( f -> f.highlight( "flattenedAuthors.lastName" ) )
        .where( f -> f.match().field( "flattenedAuthors.lastName" ).matching( "martinez" ) )
        .fetchHits( 20 );
Highlighting options

可以通过高亮器选项进行微调以改变高亮的输出。

A highlighter can be fine-tuned through its options to change the highlighted output.

示例 352. 配置默认高亮工具

. Example 352. Configuring the default highlighter

List<List<String>> hits = searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) ) (1)
        .where( f -> f.match().field( "title" ).matching( "detective" ) )
        .highlighter( f -> f.unified().tag( "<b>", "</b>" ) ) (2)
        .fetchHits( 20 );

此外,如果高亮了多个字段,并且它们需要不同的高亮器选项,则可以使用命名的高亮器来覆盖默认高亮器。

Additionally, if multiple fields are highlighted, and they need different highlighter options then a named highlighter can be used to override the default one.

示例 353. 配置默认高亮工具和命名高亮工具

. Example 353. Configuring default and named highlighters

List<List<?>> hits = searchSession.search( Book.class )
        .select( f -> f.composite().from(
                f.highlight( "title" ),
                f.highlight( "description" ).highlighter( "description-highlighter" ) (1)
        ).asList() )
        .where( f -> f.match().field( "title" ).matching( "detective" ) )
        .highlighter( f -> f.unified().tag( "<b>", "</b>" ) ) (2)
        .highlighter(
                "description-highlighter",
                f -> f.unified().tag( "<span>", "</span>" )
        ) (3)
        .fetchHits( 20 );

有关高亮显示器配置的更多信息,请参阅 Highlight DSL

See Highlight DSL for more information on highlighter configuration.

@HighlightProjection in projections to custom types

要在 projection to an annotated custom type 内实现 highlight 投影,请在构造器参数上使用 @HighlightProjection 注释:

To achieve a highlight projection inside a projection to an annotated custom type, use the @HighlightProjection annotation on the constructor parameter:

示例 354. 在投影构造函数中返回匹配文档的高亮结果。多个高亮片段

. Example 354. Returning highlights for matched documents within a projection constructor. Multiple highlighted fragments

@ProjectionConstructor (1)
public record MyBookTitleAndHighlightedDescriptionProjection(
        @HighlightProjection (2)
        List<String> description, (3)
        String title (4)
) {
}
List<MyBookTitleAndHighlightedDescriptionProjection> hits = searchSession.search( Book.class )
        .select( MyBookTitleAndHighlightedDescriptionProjection.class )(1)
        .where( f -> f.match().field( "description" ).matching( "self-aware" ) )
        .fetchHits( 20 ); (2)
示例 355. 在投影构造函数中返回匹配文档的高亮结果。单个高亮片段

. Example 355. Returning highlights for matched documents within a projection constructor. Single highlighted fragment

@ProjectionConstructor (1)
public record MyBookHighlightedTitleProjection(
        @HighlightProjection (2)
        String title, (3)
        String description
) {
}
List<MyBookHighlightedTitleProjection> hits = searchSession.search( Book.class )
        .select( MyBookHighlightedTitleProjection.class )(1)
        .where( f -> f.match().field( "title" ).matching( "robot" ) )
        .highlighter( f -> f.unified().numberOfFragments( 1 ) ) (2)
        .fetchHits( 20 ); (3)

该注释公开以下特性:

The annotation exposes the following attributes:

path

突出显示字段的路径。

The path to the highlighted field.

如果未设置,它将从构造函数参数名称中推断。

If not set, it is inferred from the constructor parameter name.

highlighter

查询中配置的高亮器的名称;可用于对每个高亮投影 apply different options。

The name of a highlighter configured in the query; useful to apply different options to each highlight projection.

如果未设置,则投影将使用查询中配置的默认高亮显示功能。

If not set, the projection will use the default highlighter as configured in the query.
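
A minimal, hedged sketch (the record name MyBookCustomHighlightProjection is hypothetical) combining the highlighter attribute with a named highlighter configured on the query, reusing the fields from earlier examples:

@ProjectionConstructor
public record MyBookCustomHighlightProjection(
        @HighlightProjection(highlighter = "description-highlighter") // refers to the named highlighter configured below
        List<String> description
) {
}
List<MyBookCustomHighlightProjection> hits = searchSession.search( Book.class )
        .select( MyBookCustomHighlightProjection.class )
        .where( f -> f.match().field( "description" ).matching( "self-aware" ) )
        .highlighter( "description-highlighter",
                f -> f.unified().tag( "<b>", "</b>" ) ) // named highlighter used by the projection
        .fetchHits( 20 );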

对于 programmatic mapping,使用 HighlightProjectionBinder.create()。

For programmatic mapping, use HighlightProjectionBinder.create().

示例 356. 在投影构造函数中对 highlight 投影进行编程映射

. Example 356. Programmatic mapping of a highlight projection within a projection constructor

TypeMappingStep myBookIdAndHighlightedTitleProjection =
        mapping.type( MyBookTitleAndHighlightedDescriptionProjection.class );
myBookIdAndHighlightedTitleProjection.mainConstructor()
        .projectionConstructor();
myBookIdAndHighlightedTitleProjection.mainConstructor().parameter( 0 )
        .projection( HighlightProjectionBinder.create() );
Highlight limitations

目前,Hibernate Search 对于 highlight projections 可以包括的位置有以下限制,并且在这些情况下尝试应用高亮投影会导致抛出异常,具体如下:

For now, Hibernate Search has limitations on where highlight projections can be included, and trying to apply highlight projections in these scenarios will lead to an exception being thrown, in particular:

  1. Such projection cannot be a part of an object projection.

示例 357. 在 .object(..) 投影中非法使用 .highlight(..) 投影

. Example 357. Illegal use of .highlight(..) projection within an .object(..) projection

List<List<?>> hits = searchSession.search( Book.class )
        .select( f -> f.object( "authors" )
                .from( f.highlight( "authors.firstName" ),
                        f.highlight( "authors.lastName" ) ).asList() )
        .where( f -> f.match().field( "authors.firstName" ).matching( "Art*" ) )
        .fetchHits( 20 );

这样做将导致异常。

Doing so will lead to an exception.

  1. Fields of an object with a nested structure cannot be highlighted under any circumstances.

示例 358. 对嵌套对象中的字段非法使用 .highlight(..) 投影

. Example 358. Illegal use of .highlight(..) projection on a field within a nested object

List<?> hits = searchSession.search( Book.class )
        .select( f -> f.highlight( "authors.firstName" ) )
        .where( f -> f.match().field( "authors.firstName" ).matching( "Art*" ) )
        .fetchHits( 20 );

假设 authors 映射为 nested 结构,例如:

Assuming that authors are mapped as a nested structure, e.g.:

@IndexedEmbedded(structure = ObjectStructure.NESTED)
private List<Author> authors = new ArrayList<>();

尝试应用此类投影将导致抛出异常。

Trying to apply such a projection will result in an exception being thrown.

这些限制应通过 HSEARCH-4841 解决。

These limitations should be addressed by HSEARCH-4841.

15.4.14. withParameters: create projections using query parameters

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

withParameters 投影允许使用 query parameters 构建投影。

The withParameters projection allows building projections using query parameters.

此类投影需要一个接受查询参数并返回投影的函数。该函数将在查询构建时被调用。

This type of projection requires a function that accepts query parameters and returns a projection. That function will get called at query building time.

Syntax

withParameters 投影返回类型取决于 .withParameters(..) 中配置的投影类型:

The withParameters projection return type depends on the projection type configured within the .withParameters(..):

示例 359. 使用查询参数创建投影

. Example 359. Creating a projection with query parameters

GeoPoint center = GeoPoint.of( 47.506060, 2.473916 );
SearchResult<Double> result = searchSession.search( Author.class )
        .select( f -> f.withParameters( params -> f (1)
                .distance( "placeOfBirth", params.get( "center", GeoPoint.class ) ) ) ) (2)
        .where( f -> f.matchAll() )
        .param( "center", center ) (3)
        .fetch( 20 );

15.4.15. Backend-specific extensions

通过在构建查询时调用 .extension(…​),可以访问特定于后端的投影。

By calling .extension(…​) while building a query, it is possible to access backend-specific projections.

顾名思义,特定于后端的投影不能从一种后端技术移植到另一种技术。

As their name suggests, backend-specific projections are not portable from one backend technology to the other.

Lucene: document

.document() 投影将匹配的文档作为本机 Lucene Document 返回。

The .document() projection returns the matched document as a native Lucene Document.
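
A hedged sketch of selecting this projection through the Lucene extension (assuming the Lucene backend is in use):

List<Document> hits = searchSession.search( Book.class )
        .extension( LuceneExtension.get() ) // switch to the Lucene-specific DSL
        .select( f -> f.document() ) // project each matched document as a native Lucene Document
        .where( f -> f.matchAll() )
        .fetchHits( 20 );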

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Lucene: documentTree

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

.documentTree() 投影将匹配的文档作为包含本机 Lucene Document 及相应嵌套树节点的树返回。

The .documentTree() projection returns the matched document as a tree containing native Lucene Document and corresponding nested tree nodes.

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Lucene: explanation

.explanation() 投影将匹配的 explanation 作为本机 Lucene Explanation 返回。

The .explanation() projection returns an explanation of the match as a native Lucene Explanation.
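
A hedged sketch of requesting this projection through the Lucene extension (for debugging purposes only, as noted below):

List<Explanation> hits = searchSession.search( Book.class )
        .extension( LuceneExtension.get() ) // switch to the Lucene-specific DSL
        .select( f -> f.explanation() ) // native Lucene Explanation of why each document matched
        .where( f -> f.match().field( "title" ).matching( "robot" ) )
        .fetchHits( 20 );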

无论使用哪种 API,解释在性能方面相当昂贵:仅将其用于调试目的。

Regardless of the API used, explanations are rather costly performance-wise: only use them for debugging purposes.

此特性意味着应用程序代码直接依赖 Lucene API。

This feature implies that application code relies on Lucene APIs directly.

即使是针对 bug 修复(微)版本,升级 Hibernate Search 也可能需要升级 Lucene,而这可能会导致 Lucene 中出现破坏性的 API 更改。

An upgrade of Hibernate Search, even for a bugfix (micro) release, may require an upgrade of Lucene, which may lead to breaking API changes in Lucene.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Elasticsearch: source

.source() 投影将文档在 Elasticsearch 中的索引 JSON 作为一个 JsonObject 返回。

The .source() projection returns the JSON of the document as it was indexed in Elasticsearch, as a JsonObject.
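
A hedged sketch of selecting this projection through the Elasticsearch extension (assuming the Elasticsearch backend is in use):

List<JsonObject> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) // switch to the Elasticsearch-specific DSL
        .select( f -> f.source() ) // project the _source JSON of each hit
        .where( f -> f.matchAll() )
        .fetchHits( 20 );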

此功能要求在应用程序代码中直接操作 JSON。

This feature requires directly manipulating JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是一个 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Elasticsearch: explanation

.explanation() 投影将匹配的 explanation 作为 JsonObject 返回。

The .explanation() projection returns an explanation of the match as a JsonObject.
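
A hedged sketch, analogous to the Lucene variant above, using the Elasticsearch extension:

List<JsonObject> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) // switch to the Elasticsearch-specific DSL
        .select( f -> f.explanation() ) // JSON explanation of why each document matched
        .where( f -> f.match().field( "title" ).matching( "robot" ) )
        .fetchHits( 20 );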

无论使用哪种 API,解释在性能方面相当昂贵:仅将其用于调试目的。

Regardless of the API used, explanations are rather costly performance-wise: only use them for debugging purposes.

此功能要求在应用程序代码中直接操作 JSON。

This feature requires directly manipulating JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是一个 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

Elasticsearch: jsonHit

.jsonHit() 投影将 Elasticsearch 对命中的确切 JSON 作为一个 JsonObject 返回。

The .jsonHit() projection returns the exact JSON returned by Elasticsearch for the hit, as a JsonObject.

当通过 customizing the request’s JSON 为每次命中请求附加数据时,这特别有用。

This is particularly useful when customizing the request’s JSON to ask for additional data within each hit.
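
A hedged sketch of selecting this projection through the Elasticsearch extension:

List<JsonObject> hits = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() ) // switch to the Elasticsearch-specific DSL
        .select( f -> f.jsonHit() ) // the raw hit JSON exactly as returned by Elasticsearch
        .where( f -> f.matchAll() )
        .fetchHits( 20 );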

此功能要求在应用程序代码中直接操作 JSON。

This feature requires directly manipulating JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是一个 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

15.5. Highlight DSL

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

15.5.1. Basics

高亮是一种投影,它返回匹配文档中导致查询匹配的全文字段片段。导致匹配的特定词项会用一对开始和结束标签“高亮”出来。这可以帮助用户在结果页面上快速找到他们搜索的信息。

Highlighting is a projection that returns fragments from full-text fields of matched documents that caused a query match. Specific terms that caused the match are "highlighted" with a pair of opening and closing tags. It can help a user to quickly identify the information they were searching for on a results page.

高亮投影仅适用于属性配置允许高亮的 full-text fields:

Highlight projections are only available for full-text fields with an attribute configuration that allows it:

示例 366. 为突出显示配置字段

. Example 366. Configuring fields for highlighting

@Entity(name = Book.NAME)
@Indexed
public class Book {

    public static final String NAME = "Book";

    @Id
    private Integer id;

    @FullTextField(analyzer = "english") (1)
    private String author;

    @FullTextField(analyzer = "english",
            highlightable = { Highlightable.PLAIN, Highlightable.UNIFIED }) (2)
    private String title;

    @FullTextField(analyzer = "english",
            highlightable = Highlightable.ANY) (3)
    @Column(length = 10000)
    private String description;

    @FullTextField(analyzer = "english",
            projectable = Projectable.YES,
            termVector = TermVector.WITH_POSITIONS_OFFSETS) (4)
    @Column(length = 10000)
    @ElementCollection
    private List<String> text;

    @GenericField (5)
    @Column(length = 10000)
    @ElementCollection
    private List<String> keywords;


}
示例 367. 使用高亮投影

. Example 367. Using a highlight projection

SearchSession searchSession = /* ... */ (1)

List<List<String>> result = searchSession.search( Book.class ) (2)
        .select( f -> f.highlight( "title" ) ) (3)
        .where( f -> f.match().field( "title" ).matching( "mystery" ) ) (4)
        .fetchHits( 20 ); (5)
[
    ["The Boscombe Valley <em>Mystery</em>"], (1)
    [
      "A Caribbean <em>Mystery</em>",
      "Miss Marple: A Caribbean <em>Mystery</em> by Agatha Christie"
    ], (2)
    ["A <em>Mystery</em> of <em>Mysteries</em>: The Death and Life of Edgar Allan Poe"] (3)
]

高亮投影与 field projections 一样,也可以与其他投影类型以及其他高亮投影组合使用:

Highlight projections, just like field projections, can also be used in a combination with other projection types as well as with other highlight projections:

示例 368. 使用复合高亮投影

. Example 368. Using a composite highlight projection

List<List<?>> result = searchSession.search( Book.class ) (1)
        .select( f -> f.composite().from(
                f.id(), (2)
                f.field( "title", String.class ), (3)
                f.highlight( "description" ) (4)
        ).asList() )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) ) (5)
        .fetchHits( 20 ); (6)

可以配置高亮器行为。请参见各种可用的 configuration options。高亮器定义在查询的 where 子句之后提供:

A highlighter behavior can be configured. See various available configuration options. A highlighter definition is provided after a where clause of a query:

示例 369. 配置默认高亮程序

. Example 369. Configuring a default highlighter

List<List<?>> result = searchSession.search( Book.class )
        .select( f -> f.composite().from(
                f.highlight( "title" ),
                f.highlight( "description" )
        ).asList() )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) ) (1)
        .highlighter( f -> f.plain().noMatchSize( 100 ) ) (2)
        .fetchHits( 20 ); (3)

15.5.2. Highlighter type

在配置高亮器之前,您需要选择其类型。选择高亮器类型是高亮器定义中的第一步:

Before a highlighter can be configured, you need to pick its type. Picking the highlighter type is the first step in a highlighter definition:

示例 370. 指定纯高亮程序类型

. Example 370. Specifying the plain highlighter type

searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) )
        .highlighter( f -> f.plain() /* ... */ ) (1)
        .fetchHits( 20 );
示例 371. 指定统一高亮程序类型

. Example 371. Specifying the unified highlighter type

searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) )
        .highlighter( f -> f.unified() /* ... */ ) (1)
        .fetchHits( 20 );
示例 372. 指定快速向量高亮程序类型

. Example 372. Specifying the fast vector highlighter type

searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) )
        .highlighter( f -> f.fastVector() /* ... */ ) (1)
        .fetchHits( 20 );

在高亮器类型方面有三个选项可供选择:

There are three options to choose from when it comes to the highlighter type:

Plain

对于针对少数文档中的单一字段的简单查询,纯高亮程序可能很有用。此高亮程序使用标准的 Lucene 高亮程序。它读取突出显示字段的字符串值,然后从中创建一个小型内存索引,并应用查询逻辑来执行突出显示。

The plain highlighter can be useful for simple queries targeting a single field on a small number of documents. This highlighter uses a standard Lucene highlighter. It reads the string value of a highlighted field, then creates a small in-memory index from it and applies query logic to perform the highlighting.

Unified

默认使用统一高亮程序,它不一定依赖于重新分析文本,因为它可以从 postings 或词向量(term vectors)中获取偏移量。

The unified highlighter is used by default and does not necessarily rely on re-analyzing the text, as it can get the offsets either from postings or from term vectors.

此高亮器使用断点迭代器(默认情况下将文本分解成句子)将文本分解成稍后评分的段落。它能更好地支持更复杂的查询。由于它可以处理预构建的数据,因此与普通高亮器相比,在文档量较大的情况下它具有更好的性能。

This highlighter uses a break iterator (breaks the text into sentences by default) to break the text into later scored passages. It better supports more complex queries. Since it can work with prebuilt data, it performs better in case of a larger amount of documents compared to the plain highlighter.

Fast vector

除了使用与统一高亮程序类似的分隔迭代器外,快速向量高亮程序还可以使用边界字符来控制突出显示的摘录。

The fast vector highlighter, in addition to using a break iterator similar to the unified highlighter, can also use boundary characters to control the highlighted snippet.

这是唯一可以为高亮片段分配不同权重的高亮器,因此它可以用不同的标签包装片段来体现片段得分的差异。有关标签的更多信息,请参见 the corresponding section。

This is the only highlighter that can assign different weights to highlighted fragments, allowing it to show fragment score differences by wrapping them with different tags. For more on tags, see the corresponding section.

快速矢量高亮器也可以高亮整个匹配的短语。在其他高亮器类型中使用 phrase predicates 将导致一个短语中的每个单词被单独高亮。

The fast vector highlighter is also the one which can highlight entire matched phrases. Using phrase predicates with other highlighter types will lead to each word in a phrase being highlighted separately.
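
下面是一个最小示意(非本文档的官方示例;假设 description 字段如上文示例 366 那样允许高亮,短语 "great scandal" 仅为示意):配合 phrase 谓词使用快速向量高亮器时,整个匹配短语会被包在同一对标签中。

Below is a minimal sketch (not one of the official examples; it assumes the description field allows highlighting as configured in Example 366 above, and the phrase "great scandal" is made up): with a phrase predicate and the fast vector highlighter, the whole matched phrase is wrapped in a single pair of tags.

List<List<String>> phraseHighlights = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.phrase().field( "description" )
                .matching( "great scandal" ) ) // hypothetical phrase, for illustration only
        .highlighter( f -> f.fastVector() ) // the whole matched phrase ends up in a single tag pair
        .fetchHits( 20 );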

15.5.3. Named highlighters

有时我们可能希望对不同字段应用不同的高亮器。我们已经看到了 highlighter can be configured。该示例中的高亮器称为默认高亮器。搜索查询还允许配置命名高亮器。命名高亮器具有与默认高亮器相同的配置能力;如果配置了默认高亮器,命名高亮器会覆盖默认高亮器设置的选项。如果为查询配置了默认高亮器,则同一查询上配置的每个命名高亮器都必须与默认高亮器属于同一类型。仅当未配置默认高亮器时,才允许在同一查询中混用不同的高亮器类型。

Sometimes we might want to apply different highlighters to various fields. We have already seen that a highlighter can be configured. The highlighter from that example is called the default highlighter. Search queries also allow configuring named highlighters. A named highlighter has the same configuration capabilities as the default one; it overrides the options set by the default highlighter, if one was configured. If a default highlighter was configured for a query, then every named highlighter configured on the same query must be of the same type as the default one. Mixing various highlighter types within the same query is only allowed when no default highlighter was configured.

当高亮投影通过作为 highlight projection definition 一部分链式调用的可选 highlighter(..) 方法传入一个命名高亮器时,该高亮器就会应用于这个字段投影。命名高亮器可以在同一查询中重复使用,即同一个命名高亮器的名称可以传递给多个高亮投影。

When a highlight projection has a named highlighter passed to an optional highlighter(..) call chained as a part of the highlight projection definition, then that particular highlighter will be applied to that field projection. Named highlighters can be reused within a query, i.e. the same name of a named highlighter can be passed to multiple highlight projections.

示例 373. 配置默认高亮和命名高亮

. Example 373. Configuring both default and named highlighters

List<List<?>> result = searchSession.search( Book.class )
        .select( f -> f.composite().from(
                f.highlight( "title" ), (1)
                f.highlight( "description" ).highlighter( "customized-plain-highlighter" ) (2)
        ).asList() )
        .where( f -> f.match().fields( "title", "description" ).matching( "scandal" ) )
        .highlighter( f -> f.plain().tag( "<b>", "</b>" ) ) (3)
        .highlighter( "customized-plain-highlighter", f -> f.plain().noMatchSize( 100 ) ) (4)
        .fetchHits( 20 ); (5)

命名高亮标记器的名称不能是 null 或空字符串。如果使用了此类值,将会抛出异常。

The name of a named highlighter cannot be null or an empty string. An exception will be thrown if such values are used.

15.5.4. Tags

默认情况下,高亮文本用一对 <em>/</em> 标签包裹。可以提供自定义标签对来更改此行为。通常,标签是一对 HTML 标签,但它们可以是一对任何字符序列。

By default, the highlighted text is wrapped with a pair of <em>/</em> tags. A custom pair of tags can be provided to change this behaviour. Usually, tags are a pair of HTML tags, but they can be a pair of any character sequences.

示例 374. 设置自定义标签

. Example 374. Setting custom tags

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) )
        .where( f -> f.match().fields( "title" ).matching( "scandal" ) )
        .highlighter( f -> f.unified().tag( "<strong>", "</strong>" ) ) (1)
        .fetchHits( 20 );

可以处理多个标签的快速矢量高亮器有几个接受标签集合的附加方法。

The fast vector highlighter, which can handle multiple tags, has a few additional methods that accept a collection of tags.

示例 375. 设置多个自定义标签

. Example 375. Setting multiple custom tags

result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "scandal" ) )
        .highlighter( f -> f.fastVector()
                .tags( (1)
                        Arrays.asList( "<em class=\"class1\">", "<em class=\"class2\">" ),
                        "</em>"
                ) )
        .fetchHits( 20 );
result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "scandal" ) )
        .highlighter( f -> f.fastVector()
                .tags( (2)
                        Arrays.asList( "<em>", "<strong>" ),
                        Arrays.asList( "</em>", "</strong>" )
                ) )
        .fetchHits( 20 );

此外,快速矢量高亮器可以选择启用标签模式并将其设置为 HighlighterTagSchema.STYLED 以使用预定义的标签集。

Additionally, a fast vector highlighter has the option to enable a tag schema and set it to HighlighterTagSchema.STYLED to use a predefined set of tags.

示例 376. 设置样式标记架构

. Example 376. Setting a styled tags schema

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "scandal" ) )
        .highlighter( f -> f.fastVector()
                .tagSchema( HighlighterTagSchema.STYLED ) (1)
        )
        .fetchHits( 20 );

使用样式化标签模式只是将标签定义为:

Using a styled tags schema is just a shortcut to defining tags as:

示例 377. 设置标签,就像在使用样式标记架构一样

. Example 377. Setting tags as if the styled tags schema is used

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "scandal" ) )
        .highlighter( f -> f.fastVector()
                .tags( Arrays.asList(
                        "<em class=\"hlt1\">",
                        "<em class=\"hlt2\">",
                        "<em class=\"hlt3\">",
                        "<em class=\"hlt4\">",
                        "<em class=\"hlt5\">",
                        "<em class=\"hlt6\">",
                        "<em class=\"hlt7\">",
                        "<em class=\"hlt8\">",
                        "<em class=\"hlt9\">",
                        "<em class=\"hlt10\">"
                ), "</em>" ) (1)
        )
        .fetchHits( 20 );

在同一个高亮器定义中,多次调用不同的标签配置方法(tag(..)/tags(..)/tagSchema(..))或多次调用同一方法,并不会将它们合并;只会应用最后一次调用设置的标签。

Calling different tags configuration methods (tag(..)/tags(..)/tagSchema(..)) or the same one multiple times within the same highlighter definition will not combine them. Tags set by the last call will be applied.

15.5.5. Encoder

对存储 HTML 的字段进行高亮时,可以对高亮的片段应用编码。将 HTML 编码器应用于高亮器会对文本进行编码,以便将其包含在 HTML 文档中:它会将 HTML 元字符(如 <)替换为对应的实体(如 &lt;),但不会转义高亮标签。默认情况下,使用 HighlighterEncoder.DEFAULT 编码器,它保持文本原样。

Encoding can be applied to the highlighted snippets when highlighting fields that store HTML. Applying an HTML encoder to a highlighter will encode the text for inclusion into an HTML document: it will replace HTML meta-characters such as < with their entity equivalent such as &lt;; however, it will not escape the highlighting tags. By default, the HighlighterEncoder.DEFAULT encoder is used, which keeps the text as is.

示例 378. 设置 HTML 编码器

. Example 378. Setting the HTML encoder

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "title" ) )
        .where( f -> f.match().fields( "title" ).matching( "scandal" ) )
        .highlighter( f -> f.unified().encoder( HighlighterEncoder.HTML ) ) (1)
        .fetchHits( 20 );

15.5.6. No match size

遇到更复杂的查询或对多个字段执行高亮时,可能出现查询匹配了文档、但某个被高亮的字段并未对该匹配做出贡献的情况。这会导致该文档在该字段上的高亮结果为空列表。“no match size”选项让您即使在字段未参与文档匹配、没有任何内容可高亮时,仍能获得一些返回的文本。

In case of more complex queries or when highlighting is performed for multiple fields, it might lead to a situation where the query matched a document, but a particular highlighted field did not contribute to that match. This will lead to an empty list of highlights for that particular document and that field. No match size option allows you to still get some text returned even if the field didn’t contribute to the document match, and there’s nothing to be highlighted in it.

此属性设置的数字定义了从字段开头开始要包含的字符数。根据高亮器类型的不同,返回的文本数量可能不会精确等于配置的值,因为高亮器通常会尽量避免在单词/句子中间截断文本(取决于其配置)。默认情况下,此选项设置为 0,只有存在需要高亮的内容时才会返回文本。

The number set by this property defines the number of characters to be included starting at the beginning of a field. Depending on the highlighter type, the amount of text returned might not precisely match the configured value since highlighters usually try not to break the text in the middle of a word/sentence, depending on their configuration. By default, this option is set to 0 and text will only be returned if there’s something to highlight.

示例 379. 设置 no match size

. Example 379. Setting the no match size

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.bool()
                .must( f.match().fields( "title" ).matching( "scandal" ) ) (1)
                .should( f.match().fields( "description" ).matching( "scandal" ) ) (2)
        )
        .highlighter( f -> f.fastVector().noMatchSize( 100 ) ) (3)
        .fetchHits( 20 );

Lucene backend 的统一高亮器对此选项的支持有限。它无法限制返回文本的数量,该选项更像是一个启用/禁用此功能的布尔开关。如果此类型高亮器的该选项未设置或设置为 0,则在未找到匹配时不返回任何文本;否则,只要将该选项设置为正整数,无论具体值是多少,都会返回全部文本。

The unified highlighter from the Lucene backend has limited support for this option. It cannot limit the amount of the returned text and works more like a boolean flag to enable/disable the feature. If a highlighter of this type has the option not set or set to 0, then no text is returned when there was no match found. Otherwise, if the option for a highlighter of this type was set to a positive integer, all text is returned, no matter the actual value.

15.5.7. Fragment size and number of fragments

片段大小设置包含在每个高亮片段中的文本数量,默认是 100 个字符。

The fragment size sets the amount of text included in each highlighted fragment, by default 100 characters.

这不是一个“硬”限制,因为高亮显示器通常尝试不要在单词中间断开片段。此外,其他功能,例如 boundary scanning,可能会导致在片段前后包含更多文本。

This is not a "hard" limit, since highlighters usually try not to break the fragment in the middle of a word. Additionally, other features such as boundary scanning may lead to more text before and after the fragment being included as well.

片段数(number of fragments)配置设置结果高亮列表中包含的字符串的最大数量。默认情况下,片段数限制为 5。

A number of fragments configuration sets the maximum number of strings included in the resulting highlighted list. By default, the number of fragments is limited to 5.

使用大文本字段高亮显示时,这些选项的组合可能很有用。

A combination of these options can be helpful when highlighting large text fields.

示例 380. 设置片段大小和片段数

. Example 380. Setting the fragment size and the number of fragments

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "king" )
        )
        .highlighter( f -> f.fastVector()
                .fragmentSize( 50 ) (1)
                .numberOfFragments( 2 ) (2)
        )
        .fetchHits( 20 );

在 Elasticsearch 后端,所有高亮器类型都支持这些选项。对于 Lucene 后端,片段数同样受所有高亮器类型支持,而只有普通和快速向量高亮器支持片段大小。

These options are supported by all highlighter types on the Elasticsearch backend. As for the Lucene backend — the number of fragments is also supported by all highlighter types, while only plain and fast-vector highlighters support fragment size.

15.5.8. Order

默认情况下,高亮片段按文本中出现的顺序返回。通过启用按得分排序选项,最相关的片段将返回在列表顶部。

By default, highlighted fragments are returned in the order of occurrence in the text. By enabling the order by score option most relevant fragments will be returned at the top of the list.

示例 381. 按得分对片段排序

. Example 381. Ordering fragments by score

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.bool() (1)
                .should( f.match().fields( "description" ).matching( "king" ) )
                .should( f.match().fields( "description" ).matching( "souvenir" ).boost( 10.0f ) )
        )
        .highlighter( f -> f.fastVector().orderByScore( true ) ) (2)
        .fetchHits( 20 );

15.5.9. Fragmenter

默认情况下,普通高亮器将文本分成大小相同的片段,但会尝试避免打断要高亮的短语。这是 HighlighterFragmenter.SPAN 分段器的行为。或者,可以将分段器设置为 HighlighterFragmenter.SIMPLE,它只是简单地将文本分成大小相同的片段。

By default, the plain highlighter breaks up text into same-sized fragments but tries to avoid breaking up a phrase to be highlighted. This is the behaviour of the HighlighterFragmenter.SPAN fragmenter. Alternatively, fragmenter can be set to HighlighterFragmenter.SIMPLE that simply breaks up the text into same-sized fragments.

示例 382. 设置片段分配器

. Example 382. Setting the fragmenter

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "souvenir" ) )
        .highlighter( f -> f.plain().fragmenter( HighlighterFragmenter.SIMPLE ) ) (1)
        .fetchHits( 20 );

此选项仅受普通高亮标记器支持。

This option is supported only by the plain highlighter.

15.5.10. Boundary scanner

统一和快速向量高亮显示器使用边界扫描仪创建高亮显示的片段:它们尝试通过在这些片段前后扫描文本以查找单词/句子的边界来扩展高亮显示的片段。

Unified and fast vector highlighters use boundary scanners to create highlighted fragments: they try to expand highlighted fragments by scanning text before and after those fragments for word/sentence boundaries.

可选的语言环境参数可以用来指定如何搜索句子和单词的边界。句子边界扫描仪是统一高亮显示器的默认选项。

An optional locale parameter can be supplied to specify how to search for sentence and word boundaries. A sentence boundary scanner is a default option for the unified highlighter.

有两种方法可以将边界扫描仪配置提供给高亮显示器。

There are two ways to supply boundary scanner configuration to a highlighter.

示例 383. 使用 DSL 设置边界扫描仪

. Example 383. Setting the boundary scanner with DSL

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "king" ) )
        .highlighter( f -> f.fastVector()
                .boundaryScanner() (1)
                        .word() (2)
                        .locale( Locale.ENGLISH ) (3)
                        .end() (4)
                /* ... */ (5)
        )
        .fetchHits( 20 );
示例 384. 使用 lambda 设置边界扫描仪

. Example 384. Setting the boundary scanner using lambda

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "king" ) )
        .highlighter( f -> f.fastVector()
                .boundaryScanner(
                        bs -> bs.word() (1)
                )
                /* ... */ (2)
        )
        .fetchHits( 20 );

或者,快速向量高亮器可以使用字符边界扫描器,它依赖另外两个配置:边界字符和边界最大扫描距离。使用字符边界扫描器时,在以高亮文本为中心形成高亮片段之后,高亮器会在当前片段的左右两侧查找任意已配置边界字符的首次出现。此查找最多只进行由“边界最大扫描”选项配置的字符数。如果未找到边界字符,则除了已高亮的短语及按高亮器片段大小选项确定的周围文本之外,不会再包含其他文本。

Alternatively, a fast vector highlighter can use a character boundary scanner, which relies on two other configurations: boundary characters and boundary max scan. When a character boundary scanner is used, after a highlighted fragment is formed with the highlighted text centered, the highlighter checks for the first occurrence of any configured boundary character to the left and to the right of the currently created fragment. This lookup happens only for a maximum number of characters configured by the boundary max scan option. If no boundary characters are found, no additional text will be included besides the already highlighted phrase with surrounding text based on the fragment size option set for the highlighter.

默认的边界字符列表包括 .,!? \t\n。默认边界最大扫描等于 20 个字符。

The default list of boundary characters includes .,!? \t\n. The default boundary max scan is equal to 20 characters.

字符边界扫描仪是快速向量高亮显示器的默认选项。

Character boundary scanner is a default option for the fast vector highlighters.

示例 385. 设置字符边界扫描仪

. Example 385. Setting the character boundary scanner

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "scene" ) )
        .highlighter( f -> f.fastVector()
                .boundaryScanner() (1)
                        .chars() (2)
                        .boundaryChars( "\n" ) (3)
                        .boundaryMaxScan( 1000 ) (4)
                        .end() (5)
                /* ... */ (6)
        )
        .fetchHits( 20 );

此选项受统一和快速矢量高亮标记器类型支持。

This option is supported by the unified and fast vector highlighter types.

15.5.11. Phrase limit

短语限制用于指定文档中参与高亮的匹配短语的最大数量。高亮器会遍历文本,一旦达到高亮短语的最大数量就会停止,后续出现的匹配将不再被高亮。

Phrase limit allows specifying the maximum number of matching phrases in a document for highlighting. The highlighter will go through the text and, as soon as it reaches the maximum number of highlighted phrases, it will stop, leaving any further occurrences unhighlighted.

此限制与 maximum number of fragments 不同:

This limit is different from the maximum number of fragments:

片段是由高亮投影返回的字符串,而短语是每个片段中高亮(匹配)术语的序列。一个片段可能包括多个高亮短语,一个给定的短语可能出现在多个片段中。

Fragments are the strings returned by the highlight projections, while phrases are the sequences of highlighted (matching) terms in each fragment. A fragment may include multiple highlighted phrases, and a given phrase may appear in multiple fragments.

短语限制限制的是匹配短语的出现的高亮,无论是同一短语的多次出现,还是不同短语的混合。例如,如果我们要在句子 The quick brown fox jumps over the lazy dog 中搜索 foxdog 的出现,并将短语限制设置为 1,那么我们只会高亮 fox,因为它在文本中是第一个匹配,并且达到了短语限制。

The phrase limit is about limiting the highlighting of occurrences of matched phrases, be it multiple occurrences of the same phrase or a mix of different phrases. For example, if we were to search for fox and dog occurrences in the sentence The quick brown fox jumps over the lazy dog and set the phrase limit to 1, then we’ll have only fox being highlighted since it was the first match in the text and the phrase limit was reached.

默认情况下,此短语限制等于 256

By default, this phrase limit is equal to 256.

如果字段包含许多匹配项并且总体上有很多文本,但我们不希望将每一次出现都高亮显示时,此选项会很有用。

This option can be helpful if a field contains many matches and has a lot of text overall, but we are not interested in highlighting every occurrence.

示例 386. 设置短语限制

. Example 386. Setting the phrase limit

List<List<String>> result = searchSession.search( Book.class )
        .select( f -> f.highlight( "description" ) )
        .where( f -> f.match().fields( "description" ).matching( "bank" ) )
        .highlighter( f -> f.fastVector()
                .phraseLimit( 1 ) (1)
        )
        .fetchHits( 20 );

此选项仅受快速矢量高亮标记器类型支持。

This option is supported only by the fast vector highlighter type.

15.6. Aggregation DSL

15.6.1. Basics

有时,您不仅仅需要直接列出查询结果:您还需要对结果进行分组和汇总。

Sometimes, you don’t just need to list query hits directly: you also need to group and aggregate the hits.

例如,您访问的几乎所有电子商务网站都采用某种“分面”,这是一种简单的聚合形式。在线书店的“图书搜索”网页上,在匹配的图书列表旁边,你会找到“分面”,即各个类别中匹配的文档数。这些类别可以直接从索引数据中获取,例如图书的类型(科幻小说、犯罪小说),也可以从索引的数据中稍微衍生而来,例如价格范围(“低于 5 美元”、“低于 10 美元”)。

For example, almost any e-commerce website you can visit will have some sort of "faceting", which is a simple form of aggregation. In the "book search" webpage of an online bookshop, beside the list of matching books, you will find "facets", i.e. a count of matching documents in various categories. These categories can be taken directly from the indexed data, e.g. the genre of the book (science-fiction, crime fiction, …​), but also derived from the indexed data slightly, e.g. a price range ("less than $5", "less than $10", …​).

聚合不仅允许这样做(而且,取决于后端,还可以做更多):它们允许查询返回“聚合”结果。

Aggregations allow just that (and, depending on the backend, much more): they allow the query to return "aggregated" hits.

可以在构建搜索查询时配置聚合:

Aggregations can be configured when building the search query:

示例 387. 在搜索查询中定义聚合

. Example 387. Defining an aggregation in a search query

SearchSession searchSession = /* ... */ (1)

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" ); (2)

SearchResult<Book> result = searchSession.search( Book.class ) (3)
        .where( f -> f.match().field( "title" ) (4)
                .matching( "robot" ) )
        .aggregation( countsByGenreKey, f -> f.terms() (5)
                .field( "genre", Genre.class ) )
        .fetch( 20 ); (6)

Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey ); (7)

或者,如果您不想使用 lambdas:

Alternatively, if you don’t want to use lambdas:

示例 388. 在搜索查询中定义聚合——基于对象的语法

. Example 388. Defining an aggregation in a search query — object-based syntax

SearchSession searchSession = /* ... */

SearchScope<Book> scope = searchSession.scope( Book.class );

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );

SearchResult<Book> result = searchSession.search( scope )
        .where( scope.predicate().match().field( "title" )
                .matching( "robot" )
                .toPredicate() )
        .aggregation( countsByGenreKey, scope.aggregation().terms()
                .field( "genre", Genre.class )
                .toAggregation() )
        .fetch( 20 );

Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );

为了使用基于给定字段值的聚合,你需要在映射中将该字段标记为 aggregable

In order to use aggregations based on the value of a given field, you need to mark the field as aggregable in the mapping.

特别地,这对于全文(full-text)字段是不可能的;请参阅 here 了解相关解释和一些解决方案。

This is not possible for full-text fields, in particular; see here for an explanation and some solutions.

分面通常涉及“向下钻取”概念,即能够选择一个分面,并将命中限制为仅匹配该分面的那些命中。

Faceting generally involves a concept of "drill-down", i.e. the ability to select a facet and restrict the hits to only those that match that facet.

Hibernate Search 5 用于提供一个专用 API 以启用此“向下钻取”,但在 Hibernate Search 6 中,您应该只使用适当的 predicate 创建一个新查询。

Hibernate Search 5 used to offer a dedicated API to enable this "drill-down", but in Hibernate Search 6 you should simply create a new query with the appropriate predicate.
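
例如,下面是一个最小示意(沿用上文示例中的 Book/Genre 模型;枚举常量 SCIENCE_FICTION 仅为假设):用户选择某个分面后,只需针对该分面字段追加一个谓词重新执行查询,即可实现“向下钻取”。

For example, here is a minimal sketch (reusing the Book/Genre model from the examples above; the SCIENCE_FICTION constant is an assumption): once the user selects a facet, "drill-down" is simply a new query with an additional predicate on the faceted field.

List<Book> drillDownHits = searchSession.search( Book.class )
        .where( f -> f.bool()
                .must( f.match().field( "title" )
                        .matching( "robot" ) ) // the original full-text query
                .must( f.match().field( "genre" )
                        .matching( Genre.SCIENCE_FICTION ) ) ) // the facet selected by the user (assumed enum constant)
        .fetchHits( 20 );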

聚合 DSL 提供了更多聚合类型,以及每种聚合类型的多个选项。若要了解有关 terms 聚合和所有其他聚合类型更多信息,请参考以下部分。

The aggregation DSL offers more aggregation types, and multiple options for each type of aggregation. To learn more about the terms aggregation, and all the other types of aggregations, refer to the following sections.

15.6.2. terms: group by the value of a field

terms 聚合返回给定字段的每个术语值的文档计数。

The terms aggregation returns a count of documents for each term value of a given field.

为了使用基于给定字段值的聚合,你需要在映射中将该字段标记为 aggregable

In order to use aggregations based on the value of a given field, you need to mark the field as aggregable in the mapping.

特别地,这对于全文(full-text)字段是不可能的;请参阅 here 了解相关解释和一些解决方案。

This is not possible for full-text fields, in particular; see here for an explanation and some solutions.

对于 geo 点字段,terms 聚合不可用。

The terms aggregation is not available on geo-point fields.

示例 389. 按字段值分组计数命中数

. Example 389. Counting hits grouped by the value of a field

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class ) ) (1)
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey ); (2)
Skipping conversion

默认情况下,terms 聚合返回的值与目标字段对应的实体属性的类型相同。

By default, the values returned by the terms aggregation have the same type as the entity property corresponding to the target field.

例如,如果实体属性是枚举类型,对应的索引字段可能是 String 类型;但无论如何,terms 聚合返回的值都将是该枚举类型。

For example, if an entity property is of an enum type, the corresponding field may be of type String; the values returned by the terms aggregation will be of the enum type regardless.

这通常是您所希望的,但是如果您需要绕过转换并让未转换的值返回给您(上述示例中为类型 String),您可以这样做:

This should generally be what you want, but if you ever need to bypass conversion and have unconverted values returned to you instead (of type String in the example above), you can do it this way:

示例 390. 按字段值分组计数命中数,不转换字段值

. Example 390. Counting hits grouped by the value of a field, without converting field values

AggregationKey<Map<String, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", String.class, ValueConvert.NO ) )
        .fetch( 20 );
Map<String, Long> countsByGenre = result.aggregation( countsByGenreKey );

请参阅 Type of projected values 以获取更多信息。

See Type of projected values for more information.

maxTermCount: limiting the number of returned entries

默认情况下,Hibernate Search 最多返回 100 条记录。您可以通过调用 .maxTermCount(…) 自定义该限制:

By default, Hibernate Search will return at most 100 entries. You can customize the limit by calling .maxTermCount(…​):

示例 391. 在 terms 聚合中设置返回条目的最大数目

. Example 391. Setting the maximum number of returned entries in a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .maxTermCount( 1 ) )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );
minDocumentCount: requiring at least N matching documents per term

默认情况下,仅当文档计数至少为 1 时,Hibernate Search 才会返回记录。

By default, Hibernate Search will return an entry only if the document count is at least 1.

您可以通过调用 .minDocumentCount(…​) 将阈值设置为任意值。

You can set the threshold to an arbitrary value by calling .minDocumentCount(…​).

这对于返回索引中存在的所有术语特别有用,即使没有任何包含该术语的文档与查询匹配。为此,只需调用 .minDocumentCount(0):

This is particularly useful to return all terms that exist in the index, even if no document containing the term matched the query. To that end, just call .minDocumentCount(0):

示例 392. 在 terms 聚合中包括未匹配文档中的值

. Example 392. Including values from unmatched documents in a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .minDocumentCount( 0 ) )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );

这还可以用于省略文档计数太少而无关紧要的记录:

This can also be used to omit entries with a document count that is too low to matter:

示例 393. 从 terms 聚合中排除最罕见字词

. Example 393. Excluding the rarest terms from a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .minDocumentCount( 2 ) )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );
Order of entries

默认情况下,记录按文档计数降序排列,即匹配文档最多的术语最先出现。

By default, entries are returned in descending order of document count, i.e. the terms with the most matching documents appear first.

有其他多个顺序可用。

Several other orders are available.

使用 Lucene 后端时,由于当前实现的限制,使用除默认值(按降序计数)以外的任何顺序都可能导致不正确的结果。有关更多信息,请参见 HSEARCH-3666

With the Lucene backend, due to limitations of the current implementation, using any order other than the default one (by descending count) may lead to incorrect results. See HSEARCH-3666 for more information.

您可以按术语值的升序排列记录:

You can order entries by ascending term value:

示例 394. 在 terms 聚合中按升序值排列条目

. Example 394. Ordering entries by ascending value in a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .orderByTermAscending() )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );

您可以按术语值的降序排列记录:

You can order entries by descending term value:

示例 395. 在 terms 聚合中按降序值排列条目

. Example 395. Ordering entries by descending value in a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .orderByTermDescending() )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );

最后,您可以按文档计数的升序排列记录:

Finally, you can order entries by ascending document count:

示例 396. 在 terms 聚合中按升序计数排列条目

. Example 396. Ordering entries by ascending count in a terms aggregation

AggregationKey<Map<Genre, Long>> countsByGenreKey = AggregationKey.of( "countsByGenre" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByGenreKey, f -> f.terms()
                .field( "genre", Genre.class )
                .orderByCountAscending() )
        .fetch( 20 );
Map<Genre, Long> countsByGenre = result.aggregation( countsByGenreKey );

terms 聚合中按升序计数对条目进行排序时, hit counts are approximate

When ordering entries by ascending count in a terms aggregation, hit counts are approximate.

Other options
  1. For fields in nested objects, all nested objects are considered by default, but that can be controlled explicitly with .filter(…​).

15.6.3. range: grouped by ranges of values for a field

range 聚合返回给定字段指定值范围内的文档计数。

The range aggregation returns a count of documents for given ranges of values of a given field.

为了使用基于给定字段值的聚合,你需要在映射中将该字段标记为 aggregable

In order to use aggregations based on the value of a given field, you need to mark the field as aggregable in the mapping.

特别地,这对于全文(full-text)字段是不可能的;请参阅 here 了解相关解释和一些解决方案。

This is not possible for full-text fields, in particular; see here for an explanation and some solutions.

range 聚合在 geo 点字段中不可用。

The range aggregation is not available on geo-point fields.

示例 397. 按字段值的范围分组计数命中

. Example 397. Counting hits grouped by range of values for a field

AggregationKey<Map<Range<Double>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                .field( "price", Double.class ) (1)
                .range( 0.0, 10.0 ) (2)
                .range( 10.0, 20.0 )
                .range( 20.0, null ) (3)
        )
        .fetch( 20 );
Map<Range<Double>, Long> countsByPrice = result.aggregation( countsByPriceKey );
Passing Range arguments

您无需为每个范围传递两个参数(上下界),而可以传递一个 Range 类型的参数。

Instead of passing two arguments for each range (a lower and upper bound), you can pass a single argument of type Range.

示例 398. 按字段值的范围分组计数命中 - 传递 Range 对象

. Example 398. Counting hits grouped by range of values for a field — passing Range objects

AggregationKey<Map<Range<Double>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                .field( "price", Double.class )
                .range( Range.canonical( 0.0, 10.0 ) ) (1)
                .range( Range.between( 10.0, RangeBoundInclusion.INCLUDED,
                        20.0, RangeBoundInclusion.EXCLUDED ) ) (2)
                .range( Range.atLeast( 20.0 ) ) (3)
        )
        .fetch( 20 );
Map<Range<Double>, Long> countsByPrice = result.aggregation( countsByPriceKey );

对于 Elasticsearch 后端,由于 Elasticsearch 本身存在限制,所有范围都必须包含其下限(或 null ),且排除其上限(或 null )。否则,将引发异常。

With the Elasticsearch backend, due to a limitation of Elasticsearch itself, all ranges must have their lower bound included (or null) and their upper bound excluded (or null). Otherwise, an exception will be thrown.

如果您需要排除下限,或包含上限,请改用紧邻的下一个值替换该界限。例如,对于整数, .range( 0, 100 ) 表示“0(包含)到 100(不包含)”。调用 .range( 0, 101 ) 表示“0(包含)到 100(包含)”,或调用 .range( 1, 100 ) 表示“0(不包含)到 100(不包含)”。

If you need to exclude the lower bound, or to include the upper bound, replace that bound with the immediate next value instead. For example with integers, .range( 0, 100 ) means "0 (included) to 100 (excluded)". Call .range( 0, 101 ) to mean "0 (included) to 100 (included)", or .range( 1, 100 ) to mean "0 (excluded) to 100 (excluded)".
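
下面是一个最小示意(假设存在一个名为 pageCount、标记为 aggregable 的整数字段,它并非上文示例的一部分),演示如何通过把边界替换为紧邻的下一个值来得到“包含上界”的整数范围:

Below is a minimal sketch, assuming a hypothetical aggregable integer field named pageCount (not part of the examples above), showing how shifting a bound to the immediate next value yields "upper bound included" integer ranges:

AggregationKey<Map<Range<Integer>, Long>> countsByPageCountKey = AggregationKey.of( "countsByPageCount" );
SearchResult<Book> pageCountResult = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPageCountKey, f -> f.range()
                .field( "pageCount", Integer.class ) // hypothetical aggregable integer field
                .range( 0, 101 ) // 0 (included) to 100 (included)
                .range( 101, 201 ) // 101 (included) to 200 (included)
                .range( 201, null ) // 201 (included) or more
        )
        .fetch( 20 );
Map<Range<Integer>, Long> countsByPageCount = pageCountResult.aggregation( countsByPageCountKey );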

也可以传递 Range 对象集合,如果动态定义范围(例如在 Web 界面中),这尤其有用:

It’s also possible to pass a collection of Range objects, which is especially useful if ranges are defined dynamically (e.g. in a web interface):

示例 399. 按字段值的范围分组计数命中 - 传递 Range 对象的集合

. Example 399. Counting hits grouped by range of values for a field — passing a collection of Range objects

List<Range<Double>> ranges = /* ... */;

AggregationKey<Map<Range<Double>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                .field( "price", Double.class )
                .ranges( ranges )
        )
        .fetch( 20 );
Map<Range<Double>, Long> countsByPrice = result.aggregation( countsByPriceKey );
Skipping conversion

默认情况下,range 聚合接受的范围的界限类型必须与目标字段对应的实体属性类型相同。

By default, the bounds of ranges accepted by the range aggregation must have the same type as the entity property corresponding to the target field.

例如,如果实体属性是 java.util.Date 类型,对应的索引字段可能是 java.time.Instant 类型;但无论如何,传给 range 聚合的边界值都必须是 java.util.Date 类型。

For example, if an entity property is of type java.util.Date, the corresponding field may be of type java.time.Instant; the bounds passed to the range aggregation will have to be of type java.util.Date regardless.

通常,这就是您所需要的,但是如果您需要绕过转换,而是将其返回未转换的值(在上述示例中为 java.time.Instant 类型),则可以通过以下方式实现:

This should generally be what you want, but if you ever need to bypass conversion and have unconverted values returned to you instead (of type java.time.Instant in the example above), you can do it this way:

示例 400. 按字段值的范围分组计数命中,不转换字段值

. Example 400. Counting hits grouped by range of values for a field, without converting field values

AggregationKey<Map<Range<Instant>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                // Assuming "releaseDate" is of type "java.util.Date" or "java.sql.Date"
                .field( "releaseDate", Instant.class, ValueConvert.NO )
                .range( null,
                        LocalDate.of( 1970, 1, 1 )
                                .atStartOfDay().toInstant( ZoneOffset.UTC ) )
                .range( LocalDate.of( 1970, 1, 1 )
                                .atStartOfDay().toInstant( ZoneOffset.UTC ),
                        LocalDate.of( 2000, 1, 1 )
                                .atStartOfDay().toInstant( ZoneOffset.UTC ) )
                .range( LocalDate.of( 2000, 1, 1 )
                                .atStartOfDay().toInstant( ZoneOffset.UTC ),
                        null )
        )
        .fetch( 20 );
Map<Range<Instant>, Long> countsByPrice = result.aggregation( countsByPriceKey );

有关更多信息,请参见 Type of arguments passed to the DSL

See Type of arguments passed to the DSL for more information.

Parse conversion

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

对于范围聚合,还可以使用 ValueConvert.PARSE 并将范围值作为字符串传递。默认情况下,字符串格式应与 Property types with built-in value bridges 中定义的解析逻辑兼容,或者另请参阅如何 customized with bridges

With the range aggregations, it is also possible to use the ValueConvert.PARSE and pass range values as strings. By default, the string format should be compatible with the parsing logic defined in Property types with built-in value bridges, alternatively see how it can be customized with bridges.

示例 401. 使用字符串值创建范围

. Example 401. Using string values to create ranges

AggregationKey<Map<Range<String>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                // Assuming "releaseDate" is of type "java.util.Date" or "java.sql.Date"
                .field( "releaseDate", String.class, ValueConvert.PARSE )
                .range( null,
                        "1970-01-01T00:00:00Z" )
                .range( "1970-01-01T00:00:00Z",
                        "2000-01-01T00:00:00Z" )
                .range( "2000-01-01T00:00:00Z",                                 null )
        )
        .fetch( 20 );

Map<Range<String>, Long> countsByPrice = result.aggregation( countsByPriceKey );
Other options
  1. For fields in nested objects, all nested objects are considered by default, but that can be controlled explicitly with .filter(…​).

15.6.4. withParameters: create aggregations using query parameters

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

withParameters 聚合允许使用 query parameters 构建聚合。

The withParameters aggregation allows building aggregations using query parameters.

此类型的聚合需要一个接受查询参数并返回聚合的函数。该函数将在查询构建时调用。

This type of aggregation requires a function that accepts query parameters and returns an aggregation. That function will get called at query building time.

示例 402. 使用查询参数创建聚合

. Example 402. Creating an aggregation with query parameters

AggregationKey<Map<Range<Double>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.withParameters( params -> f.range() (1)
                .field( "price", Double.class )
                .range( params.get( "bound0", Double.class ), params.get( "bound1", Double.class ) ) (2)
                .range( params.get( "bound1", Double.class ), params.get( "bound2", Double.class ) )
                .range( params.get( "bound2", Double.class ), params.get( "bound3", Double.class ) )
        ) )
        .param( "bound0", 0.0 ) (3)
        .param( "bound1", 10.0 )
        .param( "bound2", 20.0 )
        .param( "bound3", null )
        .fetch( 20 );
Map<Range<Double>, Long> countsByPrice = result.aggregation( countsByPriceKey );

15.6.5. Backend-specific extensions

通过在构建查询时调用 .extension(…​),可以访问后端特定的聚合。

By calling .extension(…​) while building a query, it is possible to access backend-specific aggregations.

顾名思义,后端特定聚合无法从一种后端技术移植到另一种后端技术。

As their name suggests, backend-specific aggregations are not portable from one backend technology to the other.

Elasticsearch: fromJson

.fromJson(…​) 将表示 Elasticsearch 聚合的 JSON 转换为 Hibernate Search 聚合。

.fromJson(…​) turns JSON representing an Elasticsearch aggregation into a Hibernate Search aggregation.

此功能要求在应用程序代码中直接操作 JSON。

This feature requires to directly manipulate JSON in application code.

此 JSON 的语法可能发生更改:

The syntax of this JSON may change:

当您将底层 Elasticsearch 集群升级到下一个版本时;

when you upgrade the underlying Elasticsearch cluster to the next version;

当您将 Hibernate Search 升级到下一个版本时,即使只是 bug 修复(微)版本也是如此。

when you upgrade Hibernate Search to the next version, even for a bugfix (micro) release.

如果出现此情况,您将需要更改应用程序代码来应对这些更改。

If this happens, you will need to change application code to deal with the changes.

示例 403. 将本机 Elasticsearch JSON 聚合定义为 JsonObject

. Example 403. Defining a native Elasticsearch JSON aggregation as a JsonObject

JsonObject jsonObject = /* ... */;
AggregationKey<JsonObject> countsByPriceHistogramKey = AggregationKey.of( "countsByPriceHistogram" );
SearchResult<Book> result = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceHistogramKey, f -> f.fromJson( jsonObject ) )
        .fetch( 20 );
JsonObject countsByPriceHistogram = result.aggregation( countsByPriceHistogramKey ); (1)
示例 404. 将本机 Elasticsearch JSON 聚合定义为 JSON 格式的字符串

. Example 404. Defining a native Elasticsearch JSON aggregation as a JSON-formatted string

AggregationKey<JsonObject> countsByPriceHistogramKey = AggregationKey.of( "countsByPriceHistogram" );
SearchResult<Book> result = searchSession.search( Book.class )
        .extension( ElasticsearchExtension.get() )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceHistogramKey, f -> f.fromJson( "{"
                + "    \"histogram\": {"
                + "        \"field\": \"price\","
                + "        \"interval\": 10"
                + "    }"
                + "}" ) )
        .fetch( 20 );
JsonObject countsByPriceHistogram = result.aggregation( countsByPriceHistogramKey ); (1)

15.6.6. Options common to multiple aggregation types

Filter for fields in nested objects

当聚合字段位于 nested object 中时,默认情况下,聚合将考虑所有嵌套对象,并且该文档将根据在任何嵌套对象中找到的每个值进行一次计数。

When the aggregation field is located in a nested object, by default all nested objects will be considered for the aggregation, and the document will be counted once for each value found in any nested object.

可以使用其中一个 filter(…​) 方法过滤将按其值考虑用于聚合的嵌套文档。

It is possible to filter the nested documents whose values will be considered for the aggregation using one of the filter(…​) methods.

以下是 range aggregation 的示例:聚合的结果是对每个价格范围的图书计数,只考虑“平装”版的价格;例如,电子书版的价格会被忽略。

Below is an example with the range aggregation: the result of the aggregation is a count of books for each price range, with only the price of "paperback" editions being taken into account; the price of e-book editions, for example, is ignored.

示例 405. 按字段值的范围分组计数命中,为嵌套对象使用过滤器

. Example 405. Counting hits grouped by range of values for a field, using a filter for nested objects

AggregationKey<Map<Range<Double>, Long>> countsByPriceKey = AggregationKey.of( "countsByPrice" );
SearchResult<Book> result = searchSession.search( Book.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByPriceKey, f -> f.range()
                .field( "editions.price", Double.class )
                .range( 0.0, 10.0 )
                .range( 10.0, 20.0 )
                .range( 20.0, null )
                .filter( pf -> pf.match().field( "editions.label" ).matching( "paperback" ) )
        )
        .fetch( 20 );
Map<Range<Double>, Long> countsByPrice = result.aggregation( countsByPriceKey );

15.7. Field types and compatibility

15.7.1. Type of arguments passed to the DSL

某些谓词(例如 match 谓词或 range 谓词)在某些地方需要类型为 Object 的参数(matching(Object)、atLeast(Object) 等)。类似地,在排序 DSL 中定义缺失值的处理方式时,也可以传递类型为 Object 的参数(missing().use(Object))。

Some predicates, such as the match predicate or the range predicate, require a parameter of type Object at some point (matching(Object), atLeast(Object), …​). Similarly, it is possible to pass an argument of type Object in the sort DSL when defining the behavior for missing values (missing().use(Object)).

这些方法实际上并非接受任意对象:传入类型错误的参数时会抛出异常。

These methods do not actually accept just any object, and will throw an exception when passed an argument of the wrong type.

通常,此参数的预期类型应该是显而易见的:例如,如果您通过映射 Integer 属性来创建字段,那么在构建谓词时将期望一个 Integer 值;如果您映射了一个 java.time.LocalDate,那么将期望一个 java.time.LocalDate,等等。

Generally the expected type of this argument should be rather obvious: for example if you created a field by mapping an Integer property, then an Integer value will be expected when building a predicate; if you mapped a java.time.LocalDate, then a java.time.LocalDate will be expected, etc.
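
下面是一个最小示意(假设 Book 上有一个映射为索引字段的 Integer 属性 pageCount,仅用于说明):

Here is a minimal sketch, assuming (for illustration only) a hypothetical Integer property pageCount mapped to an index field on Book:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.range().field( "pageCount" ) // hypothetical Integer property mapped to an index field
                .atLeast( 100 ) ) // an Integer is expected here, because the mapped property is an Integer
        .fetchHits( 20 );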

如果您开始定义和使用自定义桥接器,事情就会变得稍微复杂一些。然后,您将具有类型为 A 并映射到类型为 B 的索引字段的属性。您应该向 DSL 传递什么?为了回答这个问题,我们需要了解 DSL 转换器。

Things get a little more complex if you start defining and using custom bridges. You will then have properties of type A mapped to an index field of type B. What should you pass to the DSL? To answer that question, we need to understand DSL converters.

DSL 转换器是 Hibernate Search的一项功能,它允许 DSL 接受与索引属性的类型(而不是基础索引字段的类型)匹配的参数。

DSL converters are a feature of Hibernate Search that allows the DSL to accept arguments that match the type of the indexed property, instead of the type of the underlying index field.

每个自定义桥接器都有可能为其填充的索引字段定义 DSL 转换器。当在谓词 DSL 中提及该字段时,Hibernate Search 将使用该 DSL 转换器将传递给 DSL 的值转换为后端理解的值。

Each custom bridge has the possibility to define a DSL converter for the index fields it populates. When it does, every time that field is mentioned in the predicate DSL, Hibernate Search will use that DSL converter to convert the value passed to the DSL to a value that the backend understands.

例如,假设一个具有类型为 AuthenticationOutcomeoutcome 属性的 AuthenticationEvent 实体。此 AuthenticationOutcome 类型是一个枚举。我们编制了 AuthenticationEvent 实体及其 outcome 属性的索引,以便允许用户按其结果查找事件。

For example, let’s imagine an AuthenticationEvent entity with an outcome property of type AuthenticationOutcome. This AuthenticationOutcome type is an enum. We index the AuthenticationEvent entity and its outcome property in order to allow users to find events by their outcome.

枚举的默认桥接器将 Enum.name() 的结果放入 String 字段。但是,此默认桥接器在内部也定义了 DSL 转换器。因此,对 DSL 的任何调用都应传递 AuthenticationOutcome 实例:

The default bridge for enums puts the result of Enum.name() into a String field. However, this default bridge also defines a DSL converter under the hood. As a result, any call to the DSL will be expected to pass an AuthenticationOutcome instance:

示例 406. 透明转换 DSL 参数

. Example 406. Transparent conversion of DSL parameters

List<AuthenticationEvent> result = searchSession.search( AuthenticationEvent.class )
        .where( f -> f.match().field( "outcome" )
                .matching( AuthenticationOutcome.INVALID_PASSWORD ) )
        .fetchHits( 20 );

这是很方便的,尤其是在要求用户从一系列选项中选择结果时。但是,如果我们希望用户键入一些单词,即如果我们希望在 outcome 字段上进行全文搜索,该怎么办?那么,我们将没有 AuthenticationOutcome 实例传递给 DSL,而只有 String…​

This is handy, and especially appropriate if users are asked to select an outcome in a list of choices. But what if we want users to type in some words instead, i.e. what if we want full-text search on the outcome field? Then we will not have an AuthenticationOutcome instance to pass to the DSL, only a String…​

在这种情况下,我们需要首先为每个枚举分配一些文本。通过定义一个自定义的 ValueBridge<AuthenticationOutcome, String> 并将其应用于 outcome 属性,以索引结果的文本描述(而不是默认的 Enum#name() ),可以实现此目的。

In that case, we will first need to assign some text to each enum. This can be achieved by defining a custom ValueBridge<AuthenticationOutcome, String> and applying it to the outcome property to index a textual description of the outcome, instead of the default Enum#name().
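
下面是这样一个桥接器的最小示意(并非参考实现;其中的文本描述与应用方式仅为假设示例):

Below is a minimal sketch of such a bridge (not a reference implementation; the textual descriptions and the way it is applied are assumptions for illustration):

public class AuthenticationOutcomeBridge implements ValueBridge<AuthenticationOutcome, String> {
    @Override
    public String toIndexedValue(AuthenticationOutcome value, ValueBridgeToIndexedValueContext context) {
        if ( value == null ) {
            return null;
        }
        // Index a human-readable description instead of the default Enum#name();
        // the actual descriptions are up to the application.
        switch ( value ) {
            case INVALID_PASSWORD:
                return "Invalid password";
            default:
                return value.name().toLowerCase( Locale.ROOT ).replace( '_', ' ' );
        }
    }
}

// Applied to the property, for example (assuming an "english" analyzer is defined):
// @FullTextField(analyzer = "english",
//         valueBridge = @ValueBridgeRef(type = AuthenticationOutcomeBridge.class))
// private AuthenticationOutcome outcome;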

然后,我们需要告诉 Hibernate Search,传递给 DSL 的值不应该传递给 DSL 转换器,而是应该假定它与索引字段的类型直接匹配(在本例中为 String)。为此,可以简单地使用接受 ValueConvert 参数的 matching 方法的变体,并传递 ValueConvert.NO

Then, we will need to tell Hibernate Search that the value passed to the DSL should not be passed to the DSL converter, but should be assumed to match the type of the index field directly (in this case, String). To that end, one can simply use the variant of the matching method that accepts a ValueConvert parameter, and pass ValueConvert.NO:

示例 407. 禁用 DSL 转换器

. Example 407. Disabling the DSL converter

List<AuthenticationEvent> result = searchSession.search( AuthenticationEvent.class )
        .where( f -> f.match().field( "outcome" )
                .matching( "Invalid password", ValueConvert.NO ) )
        .fetchHits( 20 );

所有应用 DSL 转换器的方法都提供了一个接受 ValueConvert 参数的变体:matchingbetweenatLeastatMostgreaterThanlessThanrange,…​

All methods that apply DSL converters offer a variant that accepts a ValueConvert parameter: matching, between, atLeast, atMost, greaterThan, lessThan, range, …​

在某些情况下,将字符串值传递给这些 DSL 步骤可能会有所帮助。ValueConvert.PARSE 可用于解决该问题。默认情况下,字符串格式应与 Property types with built-in value bridges 中定义的解析逻辑兼容,或者另请参阅如何 customized with bridges

In some cases, it may be helpful to pass string values to these DSL steps. ValueConvert.PARSE can be used to address that. By default, the string format should be compatible with the parsing logic defined in Property types with built-in value bridges, alternatively see how it can be customized with bridges.

示例 408. 使用 PARSE DSL 转换器处理字符串参数

. Example 408. Using the PARSE DSL converter to work with string arguments

List<AuthenticationEvent> result = searchSession.search( AuthenticationEvent.class )
        .where( f -> f.match().field( "time" )
                .matching( "2002-02-20T20:02:22", ValueConvert.PARSE ) )
        .fetchHits( 20 );

DSL 转换器总是为值桥自动生成。但是,更复杂的桥需要显式配置。

A DSL converter is always automatically generated for value bridges. However, more complex bridges will require explicit configuration.

有关更多信息,请参阅 Type bridge Property bridge

See Type bridge or Property bridge for more information.

15.7.2. Type of projected values

通常,投影返回的值的类型应该是显而易见的:例如,如果您通过映射 Integer 属性来创建字段,那么在投影时将返回 Integer 值;如果您映射了一个 java.time.LocalDate,那么将返回一个 java.time.LocalDate,等等。

Generally the type of values returned by projections should be rather obvious: for example if you created a field by mapping an Integer property, then an Integer value will be returned when projecting; if you mapped a java.time.LocalDate, then a java.time.LocalDate will be returned, etc.

如果您开始定义和使用自定义桥接器,事情就会变得稍微复杂一些。然后,您将具有类型为 A 并映射到类型为 B 的索引字段的属性。投影会返回什么?为了回答这个问题,我们需要了解投影转换器。

Things get a little more complex if you start defining and using custom bridges. You will then have properties of type A mapped to an index field of type B. What will be returned by projections? To answer that question, we need to understand projection converters.

投影转换器是 Hibernate Search 的一项功能,它允许投影返回与索引属性类型匹配的值,而不是底层索引字段的类型。

Projection converters are a feature of Hibernate Search that allows the projections to return values that match the type of the indexed property, instead of the type of the underlying index field.

每个自定义桥接器都可以为它填充的索引字段定义一个投影转换器。每次对该字段进行投影时,Hibernate Search 都会使用该投影转换器来转换索引返回的投影值。

Each custom bridge has the possibility to define a projection converter for the index fields it populates. When it does, every time that field is projected on, Hibernate Search will use that projection converter to convert the projected value returned by the index.

例如,设想一个 Order 实体具有类型为 OrderStatusstatus 属性。该 OrderStatus 类型是一个枚举。我们索引 Order 实体及其 status 属性。

For example, let’s imagine an Order entity with a status property of type OrderStatus. This OrderStatus type is an enum. We index the Order entity and its status property.

枚举的默认桥接器将 Enum.name() 的结果放入 String 字段。但是,此默认桥接器还定义了一个投影转换器。因此,对 status 字段的任何投影都将返回一个 OrderStatus 实例:

The default bridge for enums puts the result of Enum.name() into a String field. However, this default bridge also defines a projection converter. As a result, any projection on the status field will return an OrderStatus instance:

示例 409. 投影的透明转换

. Example 409. Transparent conversion of projections

List<OrderStatus> result = searchSession.search( Order.class )
        .select( f -> f.field( "status", OrderStatus.class ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

这可能是您通常想要的结果。但在某些情况下,您可能希望禁用此转换,而返回索引值(即 Enum.name() 的值)。

This is probably what you want in general. But in some cases, you may want to disable this conversion and return the index value instead (i.e. the value of Enum.name()).

在这种情况下,我们需要告诉 Hibernate Search 后端返回的值不应传递给投影转换器。为此,可以简单地使用接受 ValueConvert 参数的 field 方法的变体,并传递 ValueConvert.NO

In that case, we will need to tell Hibernate Search that the value returned by the backend should not be passed to the projection converter. To that end, one can simply use the variant of the field method that accepts a ValueConvert parameter, and pass ValueConvert.NO:

示例 410. 禁用投影转换器

. Example 410. Disabling the projection converter

List<String> result = searchSession.search( Order.class )
        .select( f -> f.field( "status", String.class, ValueConvert.NO ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );

投影转换器必须在自定义桥中进行显式配置。

Projection converters must be configured explicitly in custom bridges.

有关更多信息,请参阅 Value bridge Property bridge Type bridge

See Value bridge, Property bridge or Type bridge for more information.
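
As an illustration, here is a minimal sketch of a value bridge for the OrderStatus example above that also supports projections. It assumes the ValueBridge contract described in the Value bridge section; implementing fromIndexedValue is what lets Hibernate Search convert projected index values back to OrderStatus.

public class OrderStatusBridge implements ValueBridge<OrderStatus, String> {
    @Override
    public String toIndexedValue(OrderStatus value, ValueBridgeToIndexedValueContext context) {
        // Same encoding as the default bridge for enums
        return value == null ? null : value.name();
    }

    @Override
    public OrderStatus fromIndexedValue(String value, ValueBridgeFromIndexedValueContext context) {
        // Enables the projection converter: index values are turned back into OrderStatus
        return value == null ? null : OrderStatus.valueOf( value );
    }
}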

15.7.3. Targeting multiple fields

有时,一个谓词/排序/投影会针对多个字段,这些字段可能存在相互冲突的定义:

Sometimes a predicate/sort/projection targets multiple fields, which may have conflicting definitions:

  1. when multiple field names are passed to the fields method in the predicate DSL (each field has its own definition);

  2. or when the search query targets multiple indexes (each index has its own definition of each field).

在这种情况下,目标字段的定义应相互兼容。例如,在同一个 match 谓词中同时针对一个 Integer 字段和一个 java.time.LocalDate 字段将不起作用,因为您无法向 matching(Object) 方法传递一个既是 Integer 又是 java.time.LocalDate 的非空参数。

In such cases, the definition of the targeted fields is expected to be compatible. For example, targeting an Integer field and a java.time.LocalDate field in the same match predicate will not work, because you won’t be able to pass a non-null argument to the matching(Object) method that is both an Integer and a java.time.LocalDate.

如果您正在寻找一个简单的经验法则,那就是:如果索引属性类型不同,或映射不同,则相应的字段可能不兼容。

If you are looking for a simple rule of thumb, here it is: if the indexed properties do not have the same type, or are mapped differently, the corresponding fields are probably not going to be compatible.

但是,如果您有兴趣了解详情,Hibernate Search 在这方面会更灵活一些。

However, if you’re interested in the details, Hibernate Search is a bit more flexible than that.

针对字段兼容性,有三个不同的约束:

There are three different constraints when it comes to field compatibility:

  • The fields must be "encoded" in a compatible way. This means the backend must use the same representation for the two fields, for example they are both Integer, or they are both BigDecimal with the same decimal scale, or they are both LocalDate with the same date format, etc.

  • The fields must have a compatible DSL converter (for predicates and sorts) or projection converter (for projections).

  • For full-text predicates, the fields must have a compatible analyzer.

以下部分描述了所有可能的不兼容情况,以及如何解决这些情况。

The following sections describe all the possible incompatibilities, and how to solve them.

Incompatible codec

在针对多个索引的搜索查询中,如果某个字段在每个索引中的编码方式不同,则不可对此字段应用谓词、排序或投影。

In a search query targeting multiple indexes, if a field is encoded differently in each index, you cannot apply predicates, sorts or projections on that field.

在这种情况下,您唯一的选择是更改映射以避免冲突:

In that case, your only option is to change your mapping to avoid the conflict:

  • rename the field in one index

  • OR change the field type in one index

  • OR if the problem is simply different codec parameters (date format, decimal scale, …​), align the value of these parameters in one index with the other index.

如果您选择在一个索引中重命名字段,那么您仍然可以在单个查询中将类似的谓词应用于两个字段:您将不得不为每个字段创建一个谓词,并通过 boolean junction 将它们合并。

If you choose to rename the field in one index, you will still be able to apply a similar predicate to the two fields in a single query: you will have to create one predicate per field and combine them with a boolean junction.
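
For instance, here is a minimal sketch assuming hypothetical Book and Magazine entities sharing a Publication supertype, where the conflicting field was renamed to bookTitle in one index and magazineTitle in the other (all these names are assumptions for illustration):

List<Publication> hits = searchSession.search( Arrays.asList( Book.class, Magazine.class ) )
        .where( f -> f.or()
                // One predicate per renamed field, combined with a boolean (OR) junction
                .add( f.match().field( "bookTitle" )
                        .matching( "robot" ) )
                .add( f.match().field( "magazineTitle" )
                        .matching( "robot" ) ) )
        .fetchHits( 20 );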

Incompatible DSL converters

不兼容的 DSL 转换器仅在您需要通过某些方法向 DSL 传递参数时才成问题:谓词 DSL 中的 matching(Object)/between(Object)/atLeast(Object)/greaterThan(Object)/等,排序 DSL 中的 missing().use(Object),聚合 DSL 中的 range(Object, Object),……

Incompatible DSL converters are only a problem when you need to pass an argument to the DSL in certain methods: matching(Object)/between(Object)/atLeast(Object)/greaterThan(Object)/etc. in the predicate DSL, missing().use(Object) in the sort DSL, range(Object, Object) in the aggregation DSL, …​

如果两个字段以兼容的方式进行编码(例如,两者都作为 String),但具有不同的 DSL 转换器(例如,第一个转换器从 String 转换为 String,第二个转换器从 Integer 转换为 String),您仍然可以使用这些方法,但您需要禁用 DSL 转换器,如 Type of arguments passed to the DSL 中所述:您只需将“index”值传递给 DSL(使用相同的示例,即 String)。

If two fields are encoded in a compatible way (for example both as String) but have different DSL converters (for example the first one converts from String to String, while the second one converts from Integer to String), you can still use these methods, but you will need to disable the DSL converter as explained in Type of arguments passed to the DSL: you will simply pass the "index" value to the DSL (using the same example, a String).

Incompatible projection converters

如果在针对多个索引的搜索查询中,某个字段在每个索引中都以兼容的方式进行编码(例如,两者都作为 String),但具有不同的投影转换器(例如,第一个转换器从 String 转换为 String,第二个转换器从 String 转换为 Integer),您仍然可以对此字段进行投影,但您需要禁用投影转换器,如 Type of projected values 所述:投影将返回“索引”值,即未经转换的值(使用相同的示例,即 String)。

If, in a search query targeting multiple indexes, a field is encoded in a compatible way in every index (for example as String in both) but has different projection converters (for example the first one converts from String to String, while the second one converts from String to Integer), you can still project on this field, but you will need to disable the projection converter as explained in Type of projected values: the projection will return the "index", unconverted value (using the same example, a String).

Incompatible analyzer

不兼容的分析器仅在全文谓词中存在问题:文本字段上的匹配谓词、词组谓词、简单查询字符串谓词,……

Incompatible analyzers are only a problem with full-text predicates: match predicate on a text field, phrase predicate, simple query string predicate, …​

如果两个字段的编码方式兼容(例如,都为 String),但具有不同的分析器,您仍然可以使用这些谓词,但您需要显式配置谓词,以使用 .analyzer(analyzerName) 设置搜索分析器为选定的分析器,或使用 .skipAnalysis() 完全跳过分析。

If two fields are encoded in a compatible way (for example both as String) but have different analyzers, you can still use these predicates, but you will need to explicitly configure the predicate to either set the search analyzer to an analyzer of your choosing with .analyzer(analyzerName), or skip analysis completely with .skipAnalysis().
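
For example, here is a minimal sketch that overrides the search analyzer on a match predicate; "english" is a hypothetical analyzer name assumed to be registered in your backend configuration:

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "title" )
                .matching( "robot" )
                // Use .skipAnalysis() instead to bypass analysis entirely
                .analyzer( "english" ) )
        .fetchHits( 20 );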

请参阅 Predicate DSL 以获取有关如何创建谓词以及可用选项的更多信息。

See Predicate DSL for more information about how to create predicates and about the available options.

15.8. Field paths

15.8.1. Absolute field paths

默认情况下,传递给搜索 DSL 的字段路径被解释为绝对路径,即相对于索引根的路径。

By default, field paths passed to the Search DSL are interpreted as absolute, i.e. relative to the index root.

路径的各个组成部分通过点号 (.) 分隔。

The components of the paths are separated by a dot (.).

以下示例使用 predicate DSL,但本部分中的所有信息也适用于其他搜索 DSL: sort DSLprojection DSLaggregation DSL 等。

The following examples use the predicate DSL, but all information in this section applies to other search DSLs as well: sort DSL, projection DSL, aggregation DSL, …​

示例 411. 使用绝对路径定位字段

. Example 411. Targeting a field using absolute paths

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "title" ) (1)
                .matching( "robot" ) )
        .fetchHits( 20 );
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.match().field( "writers.firstName" ) (1)
                .matching( "isaac" ) )
        .fetchHits( 20 );
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.nested( "writers" )
                .add( f.match().field( "writers.firstName" ) (1)
                        .matching( "isaac" ) )
                .add( f.match().field( "writers.lastName" )
                        .matching( "asimov" ) )
        )
        .fetchHits( 20 );

唯一的例外是注册在对象字段上的 named predicates:默认情况下,用于构建这些谓词的工厂会将字段路径解释为相对于该对象字段的路径。

The only exception is named predicates registered on object fields: the factory used to build those predicates interprets field paths as relative to that object field by default.

15.8.2. Relative field paths

以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。

Features detailed below are incubating: they are still under active development.

通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。

The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.

我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。

You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.

在某些情况下,您可能希望传递相对路径。当调用可重用的方法,将同一谓词应用于结构相同(子字段相同)的不同对象字段时,这很有用。通过在工厂上调用 withRoot(String) 方法,您可以创建一个新工厂,该工厂会将路径解释为相对于作为参数传递给该方法的对象字段的路径。

In some cases, one may want to pass relative paths instead. This can be useful when calling reusable methods that can apply the same predicate on different object fields that have same structure (same subfields). By calling the withRoot(String) method on a factory, you can create a new factory which interprets paths as relative to the object field passed as argument to the method.

示例 412. 使用相对路径指定目标字段

. Example 412. Targeting a field using relative paths

List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.or()
                .add( f.nested( "writers" )
                        .add( matchFirstAndLastName( (1)
                                f.withRoot( "writers" ), (2)
                                "bob", "kane" ) ) )
                .add( f.nested( "artists" )
                        .add( matchFirstAndLastName( (3)
                                f.withRoot( "artists" ), (4)
                                "bill", "finger" ) ) ) )
        .fetchHits( 20 );
private SearchPredicate matchFirstAndLastName(SearchPredicateFactory f,
        String firstName, String lastName) {
    return f.and(
            f.match().field( "firstName" ) (1)
                    .matching( firstName ),
            f.match().field( "lastName" )
                    .matching( lastName )
    )
            .toPredicate();
}

构建原生构造(例如 Lucene Queries)时,您需要处理绝对路径,即使工厂接受相对路径。

When building native constructs (for example Lucene Queries), you will need to deal with absolute paths, even if the factory accepts relative paths.

若要将相对路径转换为绝对路径,请使用工厂的 toAbsolutePath(String) 方法。

To convert a relative path to an absolute path, use the factory’s toAbsolutePath(String) method.