Hibernate Search Operation Guide
10. Mapping entities to indexes
10.1. Configuring the mapping
10.1.1. Annotation-based mapping
The main way to map entities to indexes is through annotations, as explained in Entity definition, Entity/index mapping and the following sections.
By default, Hibernate Search will automatically process mapping annotations for entity types, as well as nested types in those entity types, for instance embedded types.
Annotation-based mapping can be disabled by setting hibernate.search.mapping.process_annotations to false for the Hibernate ORM integration, or through AnnotationMappingConfigurationContext for any mapper: see Mapping configurer to access that context, and see the javadoc of AnnotationMappingConfigurationContext for available options.
If you disable annotation-based mapping, you will probably need to configure the mapping programmatically: see Programmatic mapping.
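As a minimal sketch, the property named above could be set as follows with the Hibernate ORM integration. Where the line goes (hibernate.properties, persistence.xml, or a framework-specific configuration file) is an assumption that depends on your setup.

```properties
# Disable processing of mapping annotations (Hibernate ORM integration only).
# The mapping must then be defined programmatically.
hibernate.search.mapping.process_annotations=false
```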
Hibernate Search will also try to find some annotated types through classpath scanning.
See Entity definition, Entity/index mapping and Mapping a property to an index field with @GenericField, @FullTextField, … to get started with annotation-based mapping.
10.1.2. Classpath scanning
Basics
Hibernate Search will automatically scan the JARs of entity types on startup, looking for types annotated with "root mapping annotations" so that it can automatically add those types to the list of types whose annotations should be processed.
Root mapping annotations are mapping annotations that serve as the entry point to a mapping, for example @ProjectionConstructor or custom root mapping annotations. Without this scanning, Hibernate Search would learn about e.g. projection constructors too late (when the projection is actually executed) and would fail due to a lack of metadata.
The scanning is backed by Jandex, a library that indexes the content of JARs.
Scanning dependencies of the application
By default, Hibernate Search will only scan the JARs containing your Hibernate ORM entities.
If you want Hibernate Search to detect types annotated with root mapping annotations in other JARs, you will first need to access an AnnotationMappingConfigurationContext.
From that context, either:
- call annotationMappingContext.add( MyType.class ) to explicitly tell Hibernate Search to process annotations on MyType, and to discover other types annotated with root mapping annotations in the JAR containing MyType.
- OR (advanced usage, incubating) call annotationMappingContext.addJandexIndex( <an IndexView instance> ) to explicitly tell Hibernate Search to look for types annotated with root mapping annotations in the given Jandex index.
Configuring scanning
Hibernate Search’s scanning may trigger the indexing of JARs through Jandex on application startup. In more complex environments, this indexing may not be able to access the classes to index, or may unnecessarily slow down startup.
Running Hibernate Search within Quarkus or WildFly avoids these problems:
- With the Quarkus framework, the scanning part of Hibernate Search’s startup is executed at build time and the indexes are provided to it automatically.
- With the WildFly application server, this part of Hibernate Search’s startup is executed in an optimized way and the indexes are provided to it automatically as well.
In other cases, depending on the application needs, the Jandex Maven Plugin can be used during the building stage of the application, so that indexes are already built and ready when the application starts.
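A build-time index could be produced with a plugin declaration along these lines. This is a sketch, not an official recommendation: it assumes the SmallRye-maintained plugin coordinates (io.smallrye:jandex-maven-plugin), and the version property is a placeholder you would define yourself.

```xml
<plugin>
  <groupId>io.smallrye</groupId>
  <artifactId>jandex-maven-plugin</artifactId>
  <!-- Placeholder property (assumption): set it to the plugin release you use. -->
  <version>${jandex-maven-plugin.version}</version>
  <executions>
    <execution>
      <id>make-index</id>
      <goals>
        <!-- Builds a Jandex index and embeds it in the JAR at build time. -->
        <goal>jandex</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```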
Alternatively, if your application does not use @ProjectionConstructor or custom root mapping annotations, you may want to disable this feature entirely or partially.
This is not recommended in general as it may lead to bootstrap failures or ignored mapping annotations because Hibernate Search will no longer be able to automatically discover types annotated with root annotations in JARs that do not have an embedded Jandex index.
Two options are available for this:
- Setting hibernate.search.mapping.discover_annotated_types_from_root_mapping_annotations to false will disable any attempt at automatic discovery, even if a Jandex index (partial or full) is available. This may help if there are no types annotated with root mapping annotations at all, or if they are listed explicitly through a mapping configurer or through an AnnotatedTypeSource.
- Setting hibernate.search.mapping.build_missing_discovered_jandex_indexes to false will disable Jandex index building on startup, but will still use any pre-built Jandex indexes available. This may help if partial automatic discovery is required, i.e. available indexes will be used for discovery, but sources that do not have an index available will be ignored unless their @ProjectionConstructor-annotated types are listed explicitly through a mapping configurer or through an AnnotatedTypeSource.
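The two options above might look like this in a properties file. You would normally pick only one of them; the file location is an assumption that depends on how your application is configured.

```properties
# Option 1: no automatic discovery at all, even when a Jandex index is available.
hibernate.search.mapping.discover_annotated_types_from_root_mapping_annotations=false

# Option 2: keep discovery, but only from pre-built Jandex indexes
# (no index is built on startup for JARs that lack one).
hibernate.search.mapping.build_missing_discovered_jandex_indexes=false
```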
10.1.3. Programmatic mapping
Most examples in this documentation use annotation-based mapping, which is generally enough for most applications. However, some applications have needs that go beyond what annotations can offer:
- a single entity type must be mapped differently for different deployments, e.g. for different customers.
- many entity types must be mapped similarly, without code duplication.
To address those needs, you can use programmatic mapping: define the mapping through code that will get executed on startup.
Programmatic mapping is configured through ProgrammaticMappingConfigurationContext: see Mapping configurer to access that context.
By default, programmatic mapping will be merged with annotation mapping (if any).
To disable annotation mapping, see Annotation-based mapping.
Programmatic mapping is declarative and exposes the exact same features as annotation-based mapping.
In order to implement more complex, "imperative" mapping, for example to combine two entity properties into a single index field, use custom bridges.
Alternatively, if you only need to repeat the same mapping for several types or properties, you can apply a custom annotation on those types or properties, and have Hibernate Search execute some programmatic mapping code when it encounters that annotation. This solution doesn’t require mapper-specific configuration.
See Custom mapping annotations for more information.
10.1.4. Mapping configurer
Hibernate ORM integration
With the Hibernate ORM integration, a custom HibernateOrmSearchMappingConfigurer can be plugged into Hibernate Search in order to configure annotation mapping (AnnotationMappingConfigurationContext), programmatic mapping (ProgrammaticMappingConfigurationContext), and more.
Plugging in a custom configurer requires two steps:
- Define a class that implements the org.hibernate.search.mapper.orm.mapping.HibernateOrmSearchMappingConfigurer interface.
- Configure Hibernate Search to use that implementation by setting the configuration property hibernate.search.mapping.configurer to a bean reference pointing to the implementation, for example class:com.mycompany.MyMappingConfigurer.
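For the second step, the property assignment could be written as below. The class name is the illustrative one used in the text above, not a class that ships with Hibernate Search.

```properties
# Bean reference to the custom mapping configurer; the "class:" prefix
# tells Hibernate Search to resolve the reference by class name.
hibernate.search.mapping.configurer=class:com.mycompany.MyMappingConfigurer
```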
Hibernate Search will call the configure method of this implementation on startup, and the configurer will be able to take advantage of a DSL to configure annotation mapping or define the programmatic mapping, for example:
Example 16. Implementing a mapping configurer with the Hibernate ORM integration
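import org.hibernate.search.mapper.orm.mapping.HibernateOrmMappingConfigurationContext;
import org.hibernate.search.mapper.orm.mapping.HibernateOrmSearchMappingConfigurer;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.ProgrammaticMappingConfigurationContext;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.TypeMappingStep;

public class MySearchMappingConfigurer implements HibernateOrmSearchMappingConfigurer {
    @Override
    public void configure(HibernateOrmMappingConfigurationContext context) {
        // Access the programmatic mapping.
        ProgrammaticMappingConfigurationContext mapping = context.programmaticMapping();
        // Access the mapping of type Book and mark it as indexed.
        TypeMappingStep bookMapping = mapping.type( Book.class );
        bookMapping.indexed();
        // Map the "title" property to a full-text field using the "english" analyzer.
        bookMapping.property( "title" )
                .fullTextField().analyzer( "english" );
    }
}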
Standalone POJO Mapper
With the Standalone POJO Mapper, a custom StandalonePojoMappingConfigurer can be plugged into Hibernate Search in order to configure annotation mapping (AnnotationMappingConfigurationContext), programmatic mapping (ProgrammaticMappingConfigurationContext), and more.
Plugging in a custom configurer requires two steps:
- Define a class that implements the org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurer interface.
- Configure Hibernate Search to use that implementation by setting the configuration property hibernate.search.mapping.configurer to a bean reference pointing to the implementation, for example class:com.mycompany.MyMappingConfigurer.
Hibernate Search will call the configure method of this implementation on startup, and the configurer will be able to take advantage of a DSL to configure annotation mapping or define the programmatic mapping, for example:
Example 17. Implementing a mapping configurer with the Standalone POJO Mapper
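import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.ProgrammaticMappingConfigurationContext;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.TypeMappingStep;
import org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurationContext;
import org.hibernate.search.mapper.pojo.standalone.mapping.StandalonePojoMappingConfigurer;

public class MySearchMappingConfigurer implements StandalonePojoMappingConfigurer {
    @Override
    public void configure(StandalonePojoMappingConfigurationContext context) {
        // Access the annotation mapping context and turn off automatic discovery.
        context.annotationMapping()
                .discoverAnnotationsFromReferencedTypes( false )
                .discoverAnnotatedTypesFromRootMappingAnnotations( false );
        // Access the programmatic mapping, then the mapping of type Book.
        ProgrammaticMappingConfigurationContext mappingContext = context.programmaticMapping();
        TypeMappingStep bookMapping = mappingContext.type( Book.class );
        // Declare Book as an entity type and mark it as indexed.
        bookMapping.searchEntity();
        bookMapping.indexed();
        // Use the "id" property as the document identifier.
        bookMapping.property( "id" )
                .documentId();
        // Map the "title" property to a full-text field using the "english" analyzer.
        bookMapping.property( "title" )
                .fullTextField().analyzer( "english" );
    }
}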
10.2. Entity definition
10.2.1. Basics
Before a type can be mapped to indexes, Hibernate Search needs to be aware of which types in the application domain model are entity types.
When indexing Hibernate ORM entities, the entity types are fully defined by Hibernate ORM (generally through Jakarta’s @Entity annotation), and no explicit definition is necessary: you can safely skip this entire section.
When using the Standalone POJO Mapper, entity types need to be defined explicitly.
10.2.2. Explicit entity definition
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
@SearchEntity and its corresponding programmatic mapping .searchEntity() are unnecessary for Hibernate ORM entities, and in fact unsupported when using the Hibernate ORM integration.
See HSEARCH-5076 to track progress on allowing the use of @SearchEntity in the Hibernate ORM integration to map non-ORM entities.
With the Standalone POJO Mapper, entity types must be marked explicitly with the @SearchEntity annotation.
Example 18. Marking a class as an entity with @SearchEntity
@SearchEntity (1)
@Indexed (2)
public class Book {
Not all types are entity types, even if they have a composite structure.
Incorrectly marking types as entity types may force you to add unnecessary complexity to your domain model, such as defining identifiers or an inverse side for "associations" to such types, neither of which will get used.
Make sure to read this section for more information on what entity types are and why they are necessary.
Subclasses do not inherit the @SearchEntity annotation.
Each subclass must be annotated with @SearchEntity as well, or it will not be considered as an entity by Hibernate Search.
However, for subclasses that are also annotated with @SearchEntity, some entity-related configuration can be inherited; see the relevant sections for details.
By default, with the Standalone POJO Mapper:
- The entity name will be equal to the class' simple name (java.lang.Class#getSimpleName).
- The entity will not be configured for loading, be it to return entities as hits in search queries or for mass indexing.
See the following sections to override these defaults.
10.2.3. Entity name
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
The entity name, distinct from the name of the corresponding class, is involved in various places, including but not limited to:
- as the default index name for @Indexed;
The entity name defaults to the class' simple name (java.lang.Class#getSimpleName).
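To make the default concrete, the sketch below only exercises the JDK method java.lang.Class#getSimpleName; it does not call Hibernate Search. The Book, Library and Shelf classes are hypothetical stand-ins for application types.

```java
// Sketch: what Class#getSimpleName returns, i.e. the value used as the default entity name.
final class EntityNameDefaults {
    // Hypothetical application types, declared as nested classes to keep the example self-contained.
    static class Book {}
    static class Library {
        static class Shelf {}
    }

    // Mirrors the documented default: the simple name, without package or enclosing class.
    static String defaultEntityName(Class<?> type) {
        return type.getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(defaultEntityName(Book.class));          // prints "Book"
        System.out.println(defaultEntityName(Library.Shelf.class)); // prints "Shelf"
    }
}
```

Because the simple name drops the package, two classes named Book in different packages would end up with the same default name, which is one reason to set an explicit entity name.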
Changing the entity name of an indexed entity may require full reindexing, in particular when using the Elasticsearch/OpenSearch backend.
See this section for more information.
With the Hibernate ORM integration, this name may be overridden through various means, but usually is through Jakarta Persistence’s @Entity annotation, i.e. with @Entity(name = …).
With the Standalone POJO Mapper, entity types are defined with @SearchEntity, and the entity name may be overridden with @SearchEntity(name = …).
@SearchEntity and its corresponding programmatic mapping .searchEntity() are unnecessary for Hibernate ORM entities, and in fact unsupported when using the Hibernate ORM integration.
See HSEARCH-5076 to track progress on allowing the use of @SearchEntity in the Hibernate ORM integration to map non-ORM entities.
Example 19. Setting a custom entity name with @SearchEntity(name = …)
@SearchEntity(name = "MyAuthorName")
@Indexed
public class Author {
10.2.4. Mass loading strategy
A "mass loading strategy" gives Hibernate Search the ability to load entities of a given type for mass indexing.
With the Hibernate ORM integration, a mass loading strategy gets configured automatically for every single Hibernate ORM entity, and no further configuration is required.
With the Standalone POJO Mapper, entity types are defined with @SearchEntity, and, in order to take advantage of mass indexing, a mass loading strategy must be applied explicitly with @SearchEntity(loadingBinder = …).
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
@SearchEntity and its corresponding programmatic mapping .searchEntity() are unnecessary for Hibernate ORM entities, and in fact unsupported when using the Hibernate ORM integration.
See HSEARCH-5076 to track progress on allowing the use of @SearchEntity in the Hibernate ORM integration to map non-ORM entities.
Example 20. Assigning a mass loading strategy with the Standalone POJO Mapper
@SearchEntity(loadingBinder = @EntityLoadingBinderRef(type = MyLoadingBinder.class)) (1)
@Indexed
public class Book {
@Singleton
public class MyLoadingBinder implements EntityLoadingBinder {
    private final MyDatastore datastore;

    // The datastore is injected by the bean container.
    @Inject
    public MyLoadingBinder(MyDatastore datastore) {
        this.datastore = datastore;
    }

    @Override
    public void bind(EntityLoadingBindingContext context) {
        // Register the mass loading strategy for Book.
        context.massLoadingStrategy(
                Book.class, // expected type of the loaded entities
                new MyMassLoadingStrategy<>( datastore, Book.class )
        );
    }
}
Below is an example of MassLoadingStrategy implementation for an imaginary datastore.
Example 21. Implementing MassLoadingStrategy
public class MyMassLoadingStrategy<E>
        implements MassLoadingStrategy<E, String> {

    // Handle to the (imaginary) datastore that entities are loaded from.
    private final MyDatastore datastore;
    private final Class<E> rootEntityType;

    public MyMassLoadingStrategy(MyDatastore datastore, Class<E> rootEntityType) {
        this.datastore = datastore;
        this.rootEntityType = rootEntityType;
    }

    @Override
    public MassIdentifierLoader createIdentifierLoader(
            LoadingTypeGroup<E> includedTypes, // the types whose identifiers must be loaded
            MassIdentifierSink<String> sink, MassLoadingOptions options) {
        // Batch size requested through the loading options.
        int batchSize = options.batchSize();
        // Java classes of the included types, used to filter datastore queries.
        Collection<Class<? extends E>> typeFilter =
                includedTypes.includedTypesMap().values();
        return new MassIdentifierLoader() {
            // One connection per loader, released in close().
            private final MyDatastoreConnection connection =
                    datastore.connect();
            private final MyDatastoreCursor<String> identifierCursor =
                    connection.scrollIdentifiers( typeFilter );

            @Override
            public void close() {
                connection.close();
            }

            @Override
            public long totalCount() {
                // Number of entities matching the type filter.
                return connection.countEntities( typeFilter );
            }

            @Override
            public void loadNext() throws InterruptedException {
                List<String> batch = identifierCursor.next( batchSize );
                if ( batch != null ) {
                    // Push the next batch of identifiers to the sink.
                    sink.accept( batch );
                }
                else {
                    // No identifiers left: signal completion.
                    sink.complete();
                }
            }
        };
    }

    @Override
    public MassEntityLoader<String> createEntityLoader(
            LoadingTypeGroup<E> includedTypes, // the types whose instances must be loaded
            MassEntitySink<E> sink, MassLoadingOptions options) {
        return new MassEntityLoader<String>() {
            // One connection per loader, released in close().
            private final MyDatastoreConnection connection =
                    datastore.connect();

            @Override
            public void close() {
                connection.close();
            }

            @Override
            public void load(List<String> identifiers)
                    throws InterruptedException {
                // Load the entities for this batch of identifiers and pass them to the sink.
                sink.accept(
                        connection.loadEntitiesById( rootEntityType, identifiers )
                );
            }
        };
    }
}
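The batching protocol followed by loadNext in the listing above (hand over one batch per call while the cursor has data, stop once it returns null) can be tried in isolation. The sketch below is plain Java with a hypothetical in-memory Cursor class standing in for the imaginary MyDatastoreCursor; it is not Hibernate Search code.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedIdentifierLoading {
    // Hypothetical cursor over identifiers. next() returns null once exhausted,
    // which is the convention the listing above relies on.
    static final class Cursor {
        private final List<String> ids;
        private int position = 0;

        Cursor(List<String> ids) {
            this.ids = ids;
        }

        List<String> next(int batchSize) {
            if (position >= ids.size()) {
                return null;
            }
            int end = Math.min(position + batchSize, ids.size());
            List<String> batch = new ArrayList<>(ids.subList(position, end));
            position = end;
            return batch;
        }
    }

    // Drains the cursor and returns the batches in the order they were produced.
    static List<List<String>> drain(Cursor cursor, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        List<String> batch;
        while ((batch = cursor.next(batchSize)) != null) {
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        Cursor cursor = new Cursor(List.of("1", "2", "3", "4", "5"));
        System.out.println(drain(cursor, 2)); // prints [[1, 2], [3, 4], [5]]
    }
}
```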
Hibernate Search will optimize loading by grouping together types that have the same MassLoadingStrategy, or different strategies that are equal according to equals()/hashCode().
When grouping types together, only one of the strategies will be called, and it will get passed a "type group" that includes all types that should be loaded.
This happens in particular when the loading binder configured on a "parent" entity type is inherited by subtypes and sets the same strategy on those subtypes.
Be careful of non-abstract (instantiable) parent classes in inheritance trees: when the "type group" passed to the createIdentifierLoader method contains a parent type (say, Animal) and none of the subtypes (neither Lion nor Zebra), then the loader really should only load identifiers of instances of the parent type, not of its subtypes (it should load identifiers of entities whose type is exactly Animal, not Lion nor Zebra).
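The difference between "exactly Animal" and "Animal or any subtype" can be sketched in plain Java. The class hierarchy and the in-memory list below are hypothetical; a real loader would express the same filter in its datastore query.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class ExactTypeFilter {
    // Hypothetical inheritance tree with an instantiable parent class.
    static class Animal {
        final String id;
        Animal(String id) { this.id = id; }
    }
    static class Lion extends Animal {
        Lion(String id) { super(id); }
    }
    static class Zebra extends Animal {
        Zebra(String id) { super(id); }
    }

    // Keeps only entities whose runtime class is exactly one of the included types.
    // An instanceof check would be wrong here: it would also match subtypes.
    static List<String> identifiersOf(List<? extends Animal> all, Set<Class<?>> includedTypes) {
        return all.stream()
                .filter(animal -> includedTypes.contains(animal.getClass()))
                .map(animal -> animal.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Animal> store = List.of(new Animal("a1"), new Lion("l1"), new Zebra("z1"));
        // The type group contains only the parent type: only "a1" is returned.
        System.out.println(identifiersOf(store, Set.of(Animal.class))); // prints [a1]
    }
}
```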
Once all types to reindex have their mass loading strategy implemented and assigned, they can be reindexed using the mass indexer:
Example 22. Mass indexing with the Standalone POJO Mapper
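// Retrieve the SearchMapping (how to obtain it is application-specific).
SearchMapping searchMapping = /* ... */
// Create a mass indexer targeting every indexed type, then start it and wait for completion.
searchMapping.scope( Object.class ).massIndexer()
        .startAndWait();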
10.2.5. Selection loading strategy
A "selection loading strategy" gives Hibernate Search the ability to load entities of a given type, so that entities loaded from an external source can be returned as hits in search queries.
With the Hibernate ORM integration, a selection loading strategy gets configured automatically for every single Hibernate ORM entity, and no further configuration is required.
With the Standalone POJO Mapper, entity types are defined with @SearchEntity, and, in order to return entities loaded from an external source in search queries, a selection loading strategy must be applied explicitly with @SearchEntity(loadingBinder = …).
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
@SearchEntity and its corresponding programmatic mapping .searchEntity() are unnecessary for Hibernate ORM entities, and in fact unsupported when using the Hibernate ORM integration.
See HSEARCH-5076 to track progress on allowing the use of @SearchEntity in the Hibernate ORM integration to map non-ORM entities.
Example 23. Assigning a selection loading strategy with the Standalone POJO Mapper
@SearchEntity(loadingBinder = @EntityLoadingBinderRef(type = MyLoadingBinder.class)) (1)
@Indexed
public class Book {
@Singleton
public class MyLoadingBinder implements EntityLoadingBinder {
    @Override
    public void bind(EntityLoadingBindingContext context) {
        // Register the selection loading strategy for Book.
        context.selectionLoadingStrategy(
                Book.class, // expected type of the loaded entities
                new MySelectionLoadingStrategy<>( Book.class )
        );
    }
}
Below is an example of SelectionLoadingStrategy implementation for an imaginary datastore.
Example 24. Implementing SelectionLoadingStrategy
public class MySelectionLoadingStrategy<E>
        implements SelectionLoadingStrategy<E> {
    private final Class<E> rootEntityType;

    public MySelectionLoadingStrategy(Class<E> rootEntityType) {
        this.rootEntityType = rootEntityType;
    }

    @Override
    public SelectionEntityLoader<E> createEntityLoader(
            LoadingTypeGroup<E> includedTypes, // the types whose instances must be loaded
            SelectionLoadingOptions options) {
        // Retrieve the connection that was passed as loading context when the session was created.
        MyDatastoreConnection connection =
                options.context( MyDatastoreConnection.class );
        return new SelectionEntityLoader<E>() {
            @Override
            public List<E> load(List<?> identifiers, Deadline deadline) {
                // Load the entities, preserving the order of the given identifiers.
                return connection.loadEntitiesByIdInSameOrder(
                        rootEntityType, identifiers );
            }
        };
    }
}
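The method name loadEntitiesByIdInSameOrder in the listing above suggests that the result must line up with the requested identifiers. The sketch below illustrates that idea in plain Java against a hypothetical in-memory store. Returning null for an identifier with no matching entity is an assumption to check against the SelectionEntityLoader javadoc.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SameOrderLoading {
    // Returns one element per requested identifier, in the same order.
    // Assumption: an identifier without a matching entity yields null at its position.
    static <E> List<E> loadInSameOrder(Map<?, E> store, List<?> identifiers) {
        List<E> result = new ArrayList<>(identifiers.size());
        for (Object id : identifiers) {
            result.add(store.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical datastore content, keyed by identifier.
        Map<String, String> store = new HashMap<>();
        store.put("1", "Book one");
        store.put("2", "Book two");
        System.out.println(loadInSameOrder(store, List.of("2", "404", "1")));
        // prints [Book two, null, Book one]
    }
}
```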
Hibernate Search will optimize loading by grouping together types that have the same SelectionLoadingStrategy, or different strategies that are equal according to equals()/hashCode().
When grouping types together, only one of the strategies will be called, and it will get passed a "type group" that includes all types that should be loaded.
This happens in particular when the loading binder configured on a "parent" entity type is inherited by subtypes and sets the same strategy on those subtypes.
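Grouping by equals()/hashCode() can be pictured with an ordinary map keyed by strategy. The sketch below is not Hibernate Search's internal code: DatastoreStrategy is a hypothetical strategy whose equality is defined by the datastore it targets, and type names are plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class StrategyGrouping {
    // Hypothetical strategy: two instances are equal when they target the same datastore.
    static final class DatastoreStrategy {
        final String datastoreName;

        DatastoreStrategy(String datastoreName) {
            this.datastoreName = datastoreName;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof DatastoreStrategy
                    && ((DatastoreStrategy) o).datastoreName.equals(datastoreName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(datastoreName);
        }
    }

    // Groups type names by strategy; types with equal strategies end up in the same group.
    static Map<DatastoreStrategy, List<String>> group(Map<String, DatastoreStrategy> strategyByType) {
        Map<DatastoreStrategy, List<String>> groups = new LinkedHashMap<>();
        strategyByType.forEach((type, strategy) ->
                groups.computeIfAbsent(strategy, s -> new ArrayList<>()).add(type));
        return groups;
    }

    public static void main(String[] args) {
        Map<String, DatastoreStrategy> byType = new LinkedHashMap<>();
        byType.put("Book", new DatastoreStrategy("main"));
        byType.put("Author", new DatastoreStrategy("main"));
        byType.put("AuditRecord", new DatastoreStrategy("archive"));
        System.out.println(group(byType).size()); // prints 2
    }
}
```

A strategy class that does not override equals()/hashCode() falls back to identity comparison, so two separately created instances would not compare equal even if they target the same datastore.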
Once all types to search for have their selection loading strategy implemented and assigned, they can be loaded as hits when querying:
Example 25. Loading entities as search query hits with the Standalone POJO Mapper
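// Retrieve the datastore and the SearchMapping (both application-specific).
MyDatastore datastore = /* ... */
SearchMapping searchMapping = /* ... */
// Open a datastore connection and a search session; both are closed at the end of the block.
try ( MyDatastoreConnection connection = datastore.connect();
        SearchSession searchSession = searchMapping.createSessionWithOptions()
                // Pass the connection as loading context, for use by the selection loading strategy.
                .loading( o -> o.context( MyDatastoreConnection.class, connection ) )
                .build() ) {
    // Search for Book entities and fetch the first 20 hits, loaded through the strategy.
    List<Book> hits = searchSession.search( Book.class )
            .where( f -> f.matchAll() )
            .fetchHits( 20 );
}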
10.2.6. Programmatic mapping
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
@SearchEntity and its corresponding programmatic mapping .searchEntity() are unnecessary for Hibernate ORM entities, and in fact unsupported when using the Hibernate ORM integration.
See HSEARCH-5076 to track progress on allowing the use of @SearchEntity in the Hibernate ORM integration to map non-ORM entities.
You can mark a type as an entity type through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
Example 26. Marking a type as an entity type with .searchEntity()
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.searchEntity();
TypeMappingStep authorMapping = mapping.type( Author.class );
authorMapping.searchEntity().name( "MyAuthorName" );
10.3. Entity/index mapping
10.3.1. Basics
In order to index an entity, it must be annotated with @Indexed.
Example 27. Marking a class for indexing with @Indexed
@Entity
@Indexed
public class Book {
Subclasses inherit the @Indexed annotation and will also be indexed by default. Each indexed subclass will have its own index, though this will be transparent when searching (all targeted indexes will be queried simultaneously).
If the fact that @Indexed is inherited is a problem for your application, you can annotate subclasses with @Indexed(enabled = false).
By default:
- The index name will be equal to the entity name, which in Hibernate ORM is set using the @Entity annotation and defaults to the simple class name.
- With the Hibernate ORM integration, the identifier of indexed documents will be generated from the entity identifier. Most types commonly used for entity identifiers are supported out of the box, but for more exotic types you may need specific configuration.
  With the Standalone POJO Mapper, the identifier of indexed documents needs to be mapped explicitly.
  See Mapping the document identifier for details.
- The index won’t have any field. Fields must be mapped to properties explicitly. See Mapping a property to an index field with @GenericField, @FullTextField, … for details.
10.3.2. Explicit index/backend
You can change the name of the index by setting @Indexed(index = …). Note that index names must be unique in a given application.
Example 28. Explicit index name with @Indexed.index
@Entity
@Indexed(index = "AuthorIndex")
public class Author {
If you defined named backends, you can map entities to a backend other than the default one. By setting @Indexed(backend = "backend2"), you inform Hibernate Search that the index for your entity must be created in the backend named "backend2". This may be useful if your model has clearly defined sub-parts with very different indexing requirements.
Example 29. Explicit backend with @Indexed.backend
@Entity
@Table(name = "\"user\"")
@Indexed(backend = "backend2")
public class User {
Entities indexed in different backends cannot be targeted by the same query. For example, with the mappings defined above, the following code will throw an exception because Author and User are indexed in different backends:
// This will fail because Author and User are indexed in different backends
searchSession.search( Arrays.asList( Author.class, User.class ) )
        .where( f -> f.matchAll() )
        .fetchHits( 20 );
10.3.3. Conditional indexing and routing
The mapping of an entity to an index is not always as straightforward as "this entity type goes to this index". For many reasons, but mainly for performance reasons, you may want to customize when and where a given entity is indexed:
- You may not want to index all entities of a given type: for example, prevent indexing of entities when their status property is set to DRAFT or ARCHIVED, because users are not supposed to search for those entities.
- You may want to route entities to a specific shard of the index: for example, route entities based on their language property, because each user has a specific language and only searches for entities in their language.
These behaviors can be implemented in Hibernate Search by assigning a routing bridge to the indexed entity type through @Indexed(routingBinder = …).
For more information about routing bridges, see Routing bridge.
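The two use cases above boil down to a per-entity decision: whether to index at all and, if so, under which routing key. The following is a minimal, library-free sketch of that decision only; it is not the Hibernate Search routing bridge API, and the Status enum (including the PUBLISHED value), the class name and the method name are all hypothetical.

```java
import java.util.Optional;

// Hypothetical illustration of the decision a routing bridge would encode.
final class RoutingDecisionSketch {

    // Assumed status values; only DRAFT and ARCHIVED come from the text above.
    enum Status { DRAFT, PUBLISHED, ARCHIVED }

    // Returns the routing key to use, or empty when the entity should not be indexed.
    static Optional<String> routingKeyFor(Status status, String language) {
        if ( status == Status.DRAFT || status == Status.ARCHIVED ) {
            return Optional.empty();
        }
        // Route by language so that a search in one language targets one shard.
        return Optional.of( language );
    }

    public static void main(String[] args) {
        System.out.println( routingKeyFor( Status.PUBLISHED, "en" ) );
        System.out.println( routingKeyFor( Status.DRAFT, "en" ) );
    }
}
```

An actual implementation would express the same branches inside a routing bridge, as described in Routing bridge.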
10.3.4. Programmatic mapping
You can also mark an entity as indexed through the programmatic mapping. Behavior and options are identical to annotation-based mapping.
Example 30. Marking a class for indexing with .indexed()
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
TypeMappingStep authorMapping = mapping.type( Author.class );
authorMapping.indexed().index( "AuthorIndex" );
TypeMappingStep userMapping = mapping.type( User.class );
userMapping.indexed().backend( "backend2" );
10.4. Mapping the document identifier
10.4.1. Basics
Index documents, much like entities, need to be assigned an identifier so that Hibernate Search can handle updates and deletion.
When indexing Hibernate ORM entities, the entity identifier is used as a document identifier by default. Provided the entity identifier has a supported type, identifier mapping will work out of the box and no explicit mapping is necessary.
When using the Standalone POJO Mapper, document identifiers need to be mapped explicitly.
10.4.2. Explicit identifier mapping
Explicit identifier mapping is required in the following cases:
- Hibernate Search doesn’t know about the entity identifier (e.g. when using the Standalone POJO Mapper).
- OR the document identifier is not the entity identifier.
- OR the entity identifier has a type that is not supported by default. This is the case for composite identifiers (Hibernate ORM’s @EmbeddedId, @IdClass), in particular.
To select a property to map to the document identifier, just apply the @DocumentId annotation to that property:
Example 31. Mapping a property to the document identifier explicitly with @DocumentId
@Entity
@Indexed
public class Book {
@Id
@GeneratedValue
private Integer id;
@NaturalId
@DocumentId
private String isbn;
public Book() {
}
// Getters and setters
// ...
}
When the property type is not supported, it is also necessary to implement a custom identifier bridge, then refer to it in the @DocumentId annotation:
Example 32. Mapping a property with unsupported type to the document identifier with @DocumentId
@Entity
@Indexed
public class Book {
@Id
@Convert(converter = ISBNAttributeConverter.class)
@DocumentId(identifierBridge = @IdentifierBridgeRef(type = ISBNIdentifierBridge.class))
private ISBN isbn;
public Book() {
}
// Getters and setters
// ...
}
10.4.3. Supported identifier property types
Below is a table listing all types with built-in identifier bridges, i.e. property types that are supported out of the box when mapping a property to a document identifier.
The table also explains the value assigned to the document identifier, i.e. the value passed to the underlying backend.
Table 3. Property types with built-in identifier bridges
Property type | Value of document identifiers | Limitations
All enum types | name() as a java.lang.String | -
java.lang.String | Unchanged | -
java.lang.Character, char | A single-character java.lang.String | -
java.lang.Byte, byte | toString() | -
java.lang.Short, short | toString() | -
java.lang.Integer, int | toString() | -
java.lang.Long, long | toString() | -
java.lang.Double, double | toString() | -
java.lang.Float, float | toString() | -
java.lang.Boolean, boolean | toString() | -
java.math.BigDecimal | toString() | -
java.math.BigInteger | toString() | -
java.net.URI | toString() | -
java.net.URL | toExternalForm() | -
java.time.Instant | Formatted according to DateTimeFormatter.ISO_INSTANT. | -
java.time.LocalDate | Formatted according to DateTimeFormatter.ISO_LOCAL_DATE. | -
java.time.LocalTime | Formatted according to DateTimeFormatter.ISO_LOCAL_TIME. | -
java.time.LocalDateTime | Formatted according to DateTimeFormatter.ISO_LOCAL_DATE_TIME. | -
java.time.OffsetDateTime | Formatted according to DateTimeFormatter.ISO_OFFSET_DATE_TIME. | -
java.time.OffsetTime | Formatted according to DateTimeFormatter.ISO_OFFSET_TIME. | -
java.time.ZonedDateTime | Formatted according to DateTimeFormatter.ISO_ZONED_DATE_TIME. | -
java.time.ZoneId | getId() | -
java.time.ZoneOffset | getId() | -
java.time.Period | Formatted according to the ISO 8601 format for a duration (e.g. P1900Y12M21D). | -
java.time.Duration | Formatted according to the ISO 8601 format for a duration, using seconds and nanoseconds only (e.g. PT1.000000123S). | -
java.time.Year | Formatted according to the ISO 8601 format for a Year (e.g. 2017 for 2017 AD, 0000 for 1 BC, -10000 for 10,001 BC, etc.). | -
java.time.YearMonth | Formatted according to the ISO 8601 format for a Year-Month (e.g. 2017-11 for November 2017). | -
java.time.MonthDay | Formatted according to the ISO 8601 format for a Month-Day (e.g. --11-06 for November 6th). | -
java.util.UUID | toString() as a java.lang.String | -
java.util.Calendar | A java.time.ZonedDateTime representing the same date/time and timezone, formatted according to DateTimeFormatter.ISO_ZONED_DATE_TIME. | See Support for legacy java.util date/time APIs.
java.util.Date | Instant.ofEpochMilli(long) as a java.time.Instant, formatted according to DateTimeFormatter.ISO_INSTANT. | See Support for legacy java.util date/time APIs.
java.sql.Timestamp | Instant.ofEpochMilli(long) as a java.time.Instant, formatted according to DateTimeFormatter.ISO_INSTANT. | See Support for legacy java.util date/time APIs.
java.sql.Date | Instant.ofEpochMilli(long) as a java.time.Instant, formatted according to DateTimeFormatter.ISO_INSTANT. | See Support for legacy java.util date/time APIs.
java.sql.Time | Instant.ofEpochMilli(long) as a java.time.Instant, formatted according to DateTimeFormatter.ISO_INSTANT. | See Support for legacy java.util date/time APIs.
GeoPoint and subtypes | Latitude as double and longitude as double, separated by a comma (e.g. 41.8919, 12.51133). | -
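To get a feel for the identifier strings listed in the table, the following sketch formats a few java.time values using only the JDK. It does not call Hibernate Search; it assumes the built-in bridges produce the same strings as the JDK formatters and toString() methods named in the table, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.MonthDay;
import java.time.Period;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: JDK-only formatting that mirrors some rows of the table.
final class IdentifierFormatSketch {

    static String ofInstant(Instant value) {
        return DateTimeFormatter.ISO_INSTANT.format( value );
    }

    static String ofLocalDate(LocalDate value) {
        return DateTimeFormatter.ISO_LOCAL_DATE.format( value );
    }

    public static void main(String[] args) {
        System.out.println( ofInstant( Instant.ofEpochMilli( 0L ) ) );  // 1970-01-01T00:00:00Z
        System.out.println( ofLocalDate( LocalDate.of( 2017, 11, 6 ) ) ); // 2017-11-06
        System.out.println( Period.of( 1900, 12, 21 ) );                // P1900Y12M21D
        System.out.println( Duration.ofSeconds( 1, 123 ) );             // PT1.000000123S
        System.out.println( YearMonth.of( 2017, 11 ) );                 // 2017-11
        System.out.println( MonthDay.of( 11, 6 ) );                     // --11-06
    }
}
```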
10.4.4. Programmatic mapping
You can also map the document identifier through the programmatic mapping. Behavior and options are identical to annotation-based mapping.
Example 33. Mapping a property to the document identifier explicitly with .documentId()
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
bookMapping.property( "isbn" ).documentId();
10.5. Mapping a property to an index field with @GenericField, @FullTextField, …
10.5.1. Basics
Properties of an entity can be mapped to an index field directly: you just need to add an annotation, configure the field through the annotation attributes, and Hibernate Search will take care of extracting the property value and populating the index field when necessary.
Mapping a property to an index field looks like this:
Example 34. Mapping properties to fields directly
@FullTextField(analyzer = "english", projectable = Projectable.YES)
@KeywordField(name = "title_sort", normalizer = "english", sortable = Sortable.YES)
private String title;
@GenericField(projectable = Projectable.YES, sortable = Sortable.YES)
private Integer pageCount;
Before you map a property, you must consider two things:
The @*Field annotation
In its simplest form, property/field mapping is achieved by applying the @GenericField annotation to a property. This annotation will work for every supported property type, but is rather limited: in particular, it does not allow full-text search. To go further, you will need to rely on different, more specific annotations, which offer specific attributes. The available annotations are described in detail in Available field annotations.
The type of the property
In order for the @*Field annotation to work correctly, the type of the mapped property must be supported by Hibernate Search. See Supported property types for a list of all types that are supported out of the box, and Mapping custom property types for indications on how to handle more complex types, be it simple containers (List<String>, Map<String, Integer>, …) or custom types.
10.5.2. Available field annotations
Various field annotations exist, each offering its own set of attributes.
This section lists the different annotations and their use. For more details about available attributes, see Field annotation attributes.
@GenericField
A good default choice that will work for every property type with built-in support.
Fields mapped using this annotation do not provide any advanced features such as full-text search: matches on a generic field are exact matches.
@FullTextField
A text field whose value is considered as multiple words. Only works for String fields.
Matches on a full-text field can be more subtle than exact matches: match fields which contain a given word, match fields regardless of case, match fields ignoring diacritics, …
Full-text fields also allow highlighting.
Full-text fields should be assigned an analyzer, referenced by its name. By default, the analyzer named default will be used. See Analysis for more details about analyzers and full-text analysis. For instructions on how to change the default analyzer, see the dedicated section in the documentation of your backend: Lucene or Elasticsearch.
Note that you can also define a search analyzer to analyze searched terms differently.
Full-text fields cannot be sorted on or aggregated. If you need to sort on, or aggregate on, the value of a property, it is recommended to use @KeywordField, with a normalizer if necessary (see below). Note that multiple fields can be added to the same property, so you can use both @FullTextField and @KeywordField if you need both full-text search and sorting: you will just need to use a distinct name for each of those two fields.
@KeywordField
A text field whose value is considered as a single keyword. Only works for String fields.
Keyword fields allow more subtle matches, similarly to full-text fields, with the limitation that keyword fields only contain one token. On the other hand, this limitation allows keyword fields to be sorted on and aggregated.
Keyword fields may be assigned a normalizer, referenced by its name. See Analysis for more details about normalizers and full-text analysis.
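The difference between a full-text field (many tokens) and a keyword field (one normalized token) can be illustrated without any search library. The sketch below is only an approximation written with the JDK: it assumes an analyzer that splits on whitespace, lowercases and strips diacritics, which is not necessarily what any analyzer or normalizer configured in your backend does. All names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-in for analysis: not the backend's actual analysis chain.
final class AnalysisSketch {

    // Keyword-style normalization: the whole value stays a single token.
    static String normalize(String value) {
        String decomposed = Normalizer.normalize( value, Normalizer.Form.NFD );
        // Drop combining marks (diacritics), then lowercase.
        return decomposed.replaceAll( "\\p{M}", "" ).toLowerCase( Locale.ROOT );
    }

    // Full-text-style analysis: split into words, then normalize each word.
    static List<String> analyze(String value) {
        return Arrays.stream( value.trim().split( "\\s+" ) )
                .map( AnalysisSketch::normalize )
                .collect( Collectors.toList() );
    }

    public static void main(String[] args) {
        System.out.println( normalize( "Les Misérables" ) ); // one token
        System.out.println( analyze( "Les Misérables" ) );   // two tokens
    }
}
```

With this model, a search for "miserables" matches one of the tokens of the full-text field, whereas the keyword field only matches when the whole normalized value is equal.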
@ScaledNumberField
A numeric field for integer or floating-point values that require a higher precision than doubles but always have roughly the same scale. Only works for either java.math.BigDecimal or java.math.BigInteger fields.
Scaled numbers are indexed as integers, typically a long (64 bits), with a fixed scale that is consistent for all values of the field across all documents. Because scaled numbers are indexed with a fixed precision, they cannot represent all BigDecimal or BigInteger values. Values that are too large to be indexed will trigger a runtime exception. Values that have trailing decimal digits will be rounded to the nearest integer.
This annotation allows setting the decimalScale attribute.
@NonStandardField
An annotation for advanced use cases where a value binder is used and that binder is expected to define an index field type that does not support any of the standard options: searchable, sortable, …
This annotation is very useful for cases when a field type native to the backend is necessary: defining the mapping directly as JSON for Elasticsearch, or manipulating IndexableField directly for Lucene.
Fields mapped using this annotation have very limited configuration options from the annotation (no searchable/sortable/etc.), but the value binder will be able to pick a non-standard field type, which generally gives much more flexibility.
@VectorField
Features detailed below are incubating: they are still under active development.
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way, or even removed, in subsequent releases.
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code that relies on them as needed.
A specific field type for vectors to be used in a vector search.
Vector fields accept values of type float[] or byte[] and require that the dimension of stored vectors be specified upfront and that the size of indexed vectors match this dimension.
Besides that, vector fields optionally allow configuring the similarity function used during search, as well as the efConstruction and m parameters used during indexing.
WARNING: Unlike other field types, vector fields disable container extraction by default. Manually setting extraction to DEFAULT will result in an exception. Only explicitly configured extractors are allowed for vector fields.
WARNING: Indexing multiple vectors within the same field is not allowed, i.e. vector fields cannot be multivalued.
10.5.3. Field annotation attributes
Various field mapping annotations exist, each offering its own set of attributes.
This section lists the different annotation attributes and their use. For more details about available annotations, see Available field annotations.
name
The name of the index field. By default, it is the same as the property name. You may want to change it, in particular when mapping a single property to multiple fields.
Value: String. The name must not contain the dot character (.). Defaults to the name of the property.
sortable
Whether the field can be sorted on, i.e. whether a specific data structure is added to the index to allow efficient sorts when querying.
Value: Sortable.YES, Sortable.NO, Sortable.DEFAULT.
This option is not available for @FullTextField. See the description of @FullTextField in Available field annotations for an explanation and some solutions.
projectable
Whether the field can be projected on, i.e. whether the field value is stored in the index to allow retrieval later when querying.
Value: Projectable.YES, Projectable.NO, Projectable.DEFAULT.
The defaults are different for the Lucene and Elasticsearch backends: with Lucene, the default is Projectable.NO, while with Elasticsearch it’s Projectable.YES.
For Elasticsearch, if either the projectable or the sortable property resolves to YES on a GeoPoint field, then this field automatically becomes both projectable and sortable, even if one of them was explicitly set to NO.
aggregable
Whether the field can be aggregated, i.e. whether the field value is stored in a specific data structure in the index to allow aggregations later when querying.
Value: Aggregable.YES, Aggregable.NO, Aggregable.DEFAULT.
This option is not available for @FullTextField. See the description of @FullTextField in Available field annotations for an explanation and some solutions.
searchable
Whether the field can be searched on, i.e. whether the field is indexed in order to allow applying predicates later when querying.
Value: Searchable.YES, Searchable.NO, Searchable.DEFAULT.
indexNullAs
The value to use as a replacement anytime the property value is null.
Disabled by default.
The replacement is defined as a String. Thus, its value has to be parsed. Look up the column Parsing method for 'indexNullAs' in Supported property types to find out the format used when parsing.
extraction
How elements to index should be extracted from the property in the case of container types (List, Optional, Map, …).
By default, for properties that have a container type, the innermost elements will be indexed. For example, for a property of type List<String>, elements of type String will be indexed.
Vector fields disable the extraction by default.
This default behavior and ways to override it are described in the section Mapping container types with container extractors.
analyzer
The analyzer to apply to field values when indexing and querying. Only available on @FullTextField.
By default, the analyzer named default will be used.
See Analysis for more details about analyzers and full-text analysis.
searchAnalyzer
An optional different analyzer, overriding the one defined with the analyzer attribute, to use only when analyzing searched terms.
If not defined, the analyzer assigned to analyzer will be used.
See Analysis for more details about analyzers and full-text analysis.
normalizer
The normalizer to apply to field values when indexing and querying. Only available on @KeywordField.
See Analysis for more details about normalizers and full-text analysis.
norms
Whether index-time scoring information for the field should be stored or not. Only available on @KeywordField and @FullTextField.
Enabling norms will improve the quality of scoring. Disabling norms will reduce the disk space used by the index.
Value: Norms.YES, Norms.NO, Norms.DEFAULT.
termVector
The term vector storing strategy. Only available on @FullTextField.
The different values of this attribute are:
Value | Definition
TermVector.YES | Store the term vectors of each document. This produces two synchronized arrays: one contains the document terms and the other contains each term’s frequency.
TermVector.NO | Do not store term vectors.
TermVector.WITH_POSITIONS | Store the term vector and token position information. This is the same as TermVector.YES, plus it contains the ordinal positions of each occurrence of a term in a document.
TermVector.WITH_OFFSETS | Store the term vector and token offset information. This is the same as TermVector.YES, plus it contains the starting and ending offset position information for the terms.
TermVector.WITH_POSITIONS_OFFSETS | Store the term vector, token position and offset information. This is a combination of YES, WITH_OFFSETS and WITH_POSITIONS.
TermVector.WITH_POSITIONS_PAYLOADS | Store the term vector, token position and token payloads. This is the same as TermVector.WITH_POSITIONS, plus it contains the payload of each occurrence of a term in a document.
TermVector.WITH_POSITIONS_OFFSETS_PAYLOADS | Store the term vector, token position, offset information and token payloads. This is the same as TermVector.WITH_POSITIONS_OFFSETS, plus it contains the payload of each occurrence of a term in a document.
Note that the highlighter types requested for the full-text field might affect the term vector storing strategy that is ultimately resolved. Since the fast vector highlighter type has specific requirements regarding the term vector storing strategy, if it is requested explicitly, or implicitly through the usage of Highlightable.ANY, it will set the strategy to TermVector.WITH_POSITIONS_OFFSETS unless a strategy was already specified. An exception will be thrown if a non-default strategy that is not compatible with the fast vector highlighter is used.
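The "two synchronized arrays" mentioned for TermVector.YES can be pictured with a small, library-free sketch: one array holds the distinct terms of a document, the other holds the frequency of the term at the same index. This is only a conceptual model and not how the backend stores term vectors internally; the class name and the choice of sorting terms alphabetically are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Conceptual model of a term vector: terms[i] occurs frequencies[i] times.
final class TermVectorSketch {

    final String[] terms;
    final int[] frequencies;

    TermVectorSketch(List<String> tokens) {
        // TreeMap keeps the terms sorted, so both arrays share one stable order.
        Map<String, Integer> counts = new TreeMap<>();
        for ( String token : tokens ) {
            counts.merge( token, 1, Integer::sum );
        }
        terms = new String[counts.size()];
        frequencies = new int[counts.size()];
        int i = 0;
        for ( Map.Entry<String, Integer> entry : counts.entrySet() ) {
            terms[i] = entry.getKey();
            frequencies[i] = entry.getValue();
            i++;
        }
    }

    public static void main(String[] args) {
        TermVectorSketch vector = new TermVectorSketch(
                Arrays.asList( "to", "be", "or", "not", "to", "be" ) );
        System.out.println( Arrays.toString( vector.terms ) );
        System.out.println( Arrays.toString( vector.frequencies ) );
    }
}
```

The WITH_POSITIONS and WITH_OFFSETS variants would add, for each term, the token positions or character offsets of its occurrences to this model.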
decimalScale
How the scale of a large number (BigInteger or BigDecimal) should be adjusted before it is indexed as a fixed-precision integer. Only available on @ScaledNumberField.
To index numbers that have significant digits after the decimal point, set the decimalScale to the number of digits you need indexed. The decimal point will be shifted that many times to the right before indexing, preserving that many digits from the decimal part. To index very large numbers that cannot fit in a long, set the decimalScale to a negative value. The decimal point will be shifted that many times to the left before indexing, dropping all digits from the decimal part.
decimalScale with strictly positive values is allowed only for BigDecimal, since BigInteger values have no decimal digits.
Note that the shifting of the decimal point is completely transparent and will not affect how you use the search DSL: you are expected to provide "normal" BigDecimal or BigInteger values, and Hibernate Search will apply the decimalScale and rounding transparently.
As a result of the rounding, search predicates and sorts will only be as precise as what the decimalScale allows.
Note that rounding does not affect projections, which will return the original value without any loss of precision.
A typical use case is monetary amounts, with a decimal scale of 2 because only two digits are generally needed beyond the decimal point.
With the Hibernate ORM integration, a default decimalScale is taken automatically from the scale value of the relevant SQL @Column, using the Hibernate ORM metadata. The value can be overridden explicitly using the decimalScale attribute.
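The shifting and rounding described above can be reproduced with plain BigDecimal arithmetic. The sketch below is a JDK-only approximation, not Hibernate Search code: the HALF_UP rounding mode is an assumption (the text only says "nearest integer"), and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical model of how a scaled number could be turned into an indexed long.
final class DecimalScaleSketch {

    // Shifts the decimal point by decimalScale digits, rounds, and returns the integer.
    // Throws ArithmeticException when the result does not fit in a long.
    static long toIndexedLong(BigDecimal value, int decimalScale) {
        return value.setScale( decimalScale, RoundingMode.HALF_UP )
                .unscaledValue()
                .longValueExact();
    }

    public static void main(String[] args) {
        // decimalScale = 2 keeps two decimal digits: 12.345 is indexed as 1235.
        System.out.println( toIndexedLong( new BigDecimal( "12.345" ), 2 ) );
        // decimalScale = -3 drops the last three integer digits: 123456789 becomes 123457.
        System.out.println( toIndexedLong( new BigDecimal( "123456789" ), -3 ) );
    }
}
```

The example also shows why predicates and sorts are only as precise as the scale allows: with a scale of 2, the values 12.345 and 12.35 map to the same indexed integer.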
highlightable
Whether the field can be highlighted and, if so, which highlighter types can be applied to it, i.e. whether the field value is indexed/stored in a specific format to allow highlighting later when querying. Only available on @FullTextField.
While picking one highlighter type should be enough for most cases, this attribute can accept multiple, non-contradicting values. Please refer to the highlighter types section to see which highlighter to select. Available values are:
Value | Definition
Highlightable.NO | Do not allow highlighting on the field.
Highlightable.ANY | Allow any highlighter type to be applied for highlighting the field.
Highlightable.PLAIN | Allow the plain highlighter type to be applied for highlighting the field.
Highlightable.UNIFIED | Allow the unified highlighter type to be applied for highlighting the field.
Highlightable.FAST_VECTOR | Allow the fast vector highlighter type to be applied for highlighting the field. This highlighter type requires the term vector storage strategy to be set to WITH_POSITIONS_OFFSETS or WITH_POSITIONS_OFFSETS_PAYLOADS.
Highlightable.DEFAULT | Use the backend-specific default, which depends on the overall field configuration. Elasticsearch’s default value is [Highlightable.PLAIN, Highlightable.UNIFIED]. Lucene’s default value depends on the projectable value configured for the field: if the field is projectable, then the [PLAIN, UNIFIED] highlighters are supported; otherwise, highlighting is not supported (Highlightable.NO). Additionally, if the term vector storing strategy is set to WITH_POSITIONS_OFFSETS or WITH_POSITIONS_OFFSETS_PAYLOADS, both backends will support the FAST_VECTOR highlighter, provided they already support the other two ([PLAIN, UNIFIED]).
dimension
以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。
Features detailed below are incubating: they are still under active development.
通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.
我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
The size of the stored vectors. This is a required field. This size should match the size of the vectors produced by the model used to convert the data into a vector representation. It is expected to be a positive integer value. The maximum accepted value is backend-specific. For the Lucene backend, the dimension must be in the range [1, 4096]. For the Elasticsearch backend, the range depends on the distribution. See the Elasticsearch/OpenSearch documentation to learn about the vector types of these distributions.
Only available on @VectorField.
vectorSimilarity
以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。
Features detailed below are incubating: they are still under active development.
通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.
我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
Defines how vector similarity is calculated during a vector search.
Only available on @VectorField.
Value | Definition
VectorSimilarity.L2 | An L2 (Euclidean) norm, which is a sensible default for most scenarios. The distance between vectors x and y is calculated as \(d(x,y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}\) and the score function is \(s = \frac{1}{1+d^2}\).
VectorSimilarity.DOT_PRODUCT | Inner product (dot product in particular). The distance between vectors x and y is calculated as \(d(x,y) = \sum_{i=1}^{n} x_i \cdot y_i\) and the score function is \(s = \frac{1}{1+d}\). To use this similarity efficiently, both index and search vectors must be normalized; otherwise search may produce poor results. Floating point vectors must be normalized to be of unit length, while byte vectors should simply all have the same norm.
VectorSimilarity.COSINE | Cosine similarity. The distance between vectors x and y is calculated as \(d(x,y) = 1 - \frac{\sum_{i=1}^{n} x_i \cdot y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \sqrt{\sum_{i=1}^{n} y_i^2}}\) and the score function is \(s = \frac{1}{1+d}\).
VectorSimilarity.MAX_INNER_PRODUCT | Similar to the dot product similarity, but does not require vector normalization. The distance between vectors x and y is calculated as \(d(x,y) = \sum_{i=1}^{n} x_i \cdot y_i\) and the score function is \(s = \begin{cases} \frac{1}{1-d} & \text{if } d < 0\\ d+1 & \text{otherwise} \end{cases}\).
VectorSimilarity.DEFAULT | Use the backend-specific default. For the Lucene backend an L2 similarity is used.
How the vector similarity matches a backend-specific value
Hibernate Search Value | Lucene Backend | Elasticsearch Backend | Elasticsearch Backend (OpenSearch distribution)
DEFAULT | EUCLIDEAN | Elasticsearch default | OpenSearch default
L2 | EUCLIDEAN | l2_norm | l2
DOT_PRODUCT | DOT_PRODUCT | dot_product | Currently not supported by OpenSearch; results in an exception.
COSINE | COSINE | cosine | cosinesimil
MAX_INNER_PRODUCT | MAXIMUM_INNER_PRODUCT | max_inner_product | Currently not supported by OpenSearch; results in an exception.
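The score functions listed for L2 and MAX_INNER_PRODUCT can be checked by hand with a few lines of plain Java. The following is only a sketch of the formulas as stated in this section; the class and method names are hypothetical, and this is not the code the backends actually run.

```java
final class VectorScores {

    // Euclidean distance: d(x, y) = sqrt(sum((x_i - y_i)^2)).
    static double l2Distance(float[] x, float[] y) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // L2 score as given in the table: s = 1 / (1 + d^2).
    static double l2Score(float[] x, float[] y) {
        double d = l2Distance(x, y);
        return 1.0 / (1.0 + d * d);
    }

    // Inner product: d(x, y) = sum(x_i * y_i).
    static double innerProduct(float[] x, float[] y) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) {
            sum += (double) x[i] * y[i];
        }
        return sum;
    }

    // MAX_INNER_PRODUCT score as given in the table:
    // s = 1 / (1 - d) if d < 0, d + 1 otherwise.
    static double maxInnerProductScore(float[] x, float[] y) {
        double d = innerProduct(x, y);
        return d < 0 ? 1.0 / (1.0 - d) : d + 1.0;
    }

    public static void main(String[] args) {
        float[] x = { 1f, 2f, 3f };
        float[] y = { 1f, 2f, 5f };
        System.out.println("L2 score: " + l2Score(x, y));
        System.out.println("Max inner product score: " + maxInnerProductScore(x, y));
    }
}
```

Both functions map a distance to a strictly positive score, so results can be ranked even when the inner product is negative.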
efConstruction
以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。
Features detailed below are incubating: they are still under active development.
通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.
我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
efConstruction is the size of the dynamic list used during k-NN graph creation. It affects how vectors are stored. Higher values lead to a more accurate graph but slower indexing.
The default value is backend-specific.
Only available on @VectorField.
m
以下列出的特性尚处于 incubating 阶段:它们仍在积极开发中。
Features detailed below are incubating: they are still under active development.
通常 compatibility policy 不适用:孵化元素(例如类型、方法、配置属性等)的契约在后续版本中可能会以向后不兼容的方式更改,甚至可能被移除。
The usual compatibility policy does not apply: the contract of incubating elements (e.g. types, methods, configuration properties, etc.) may be altered in a backward-incompatible way — or even removed — in subsequent releases.
我们建议您使用孵化特性,以便开发团队可以收集反馈并对其进行改进,但在需要时您应做好更新依赖于这些特性的代码的准备。
You are encouraged to use incubating features so the development team can get feedback and improve them, but you should be prepared to update code which relies on them as needed.
The number of neighbors each node will be connected to in the HNSW (Hierarchical Navigable Small World) graph. Modifying this value will have an impact on memory consumption. It is recommended to keep this value between 2 and 100.
The default value is backend-specific.
Only available on @VectorField.
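To show how the vector attributes above fit together, here is a mapping fragment. It is a sketch only: the Book class, the coverEmbedding property and the values 384, 200 and 16 are hypothetical, chosen for illustration, and the rest of the entity mapping (@Entity, @Indexed, identifier) is omitted.

```java
import org.hibernate.search.engine.backend.types.VectorSimilarity;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.VectorField;

public class Book {

    // dimension is required and must match the size of the vectors produced by the model.
    // vectorSimilarity, efConstruction and m are optional; when omitted, backend-specific defaults apply.
    @VectorField(dimension = 384,
            vectorSimilarity = VectorSimilarity.L2,
            efConstruction = 200,
            m = 16)
    private float[] coverEmbedding;

    // Getters and setters
    // ...
}
```

Because these attributes are incubating, a mapping like this one may need to be updated when upgrading Hibernate Search.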
10.5.4. Supported property types
下表列出了内置值桥接器的所有类型,即在将属性映射到索引字段时开箱即用支持的属性类型。
Below is a table listing all types with built-in value bridges, i.e. property types that are supported out of the box when mapping a property to an index field.
该表还解释了分配给索引字段的值,即传递给底层后端以进行索引的值。
The table also explains the value assigned to the index field, i.e. the value passed to the underlying backend for indexing.
For information about the underlying indexing and storage used by the backend, see Lucene field types or Elasticsearch field types, depending on your backend.
Table 4. Property types with built-in value bridges
Property type | Value of index field (if different) | Limitations | Parsing method for 'indexNullAs'/terms in query string predicates
All enum types | name() as a java.lang.String | - | Enum.valueOf(String)
java.lang.String | - | - | -
java.lang.Character, char | A single-character java.lang.String | - | Accepts any single-character java.lang.String
java.lang.Byte, byte | - | - | Byte.parseByte(String)
java.lang.Short, short | - | - | Short.parseShort(String)
java.lang.Integer, int | - | - | Integer.parseInt(String)
java.lang.Long, long | - | - | Long.parseLong(String)
java.lang.Double, double | - | - | Double.parseDouble(String)
java.lang.Float, float | - | - | Float.parseFloat(String)
java.lang.Boolean, boolean | - | - | Accepts the strings true or false, ignoring case
java.math.BigDecimal | - | - | new BigDecimal(String)
java.math.BigInteger | - | - | new BigInteger(String)
java.net.URI | toString() as a java.lang.String | - | new URI(String)
java.net.URL | toExternalForm() as a java.lang.String | - | new URL(String)
java.time.Instant | - | Possibly lower range/resolution | Instant.parse(String)
java.time.LocalDate | - | Possibly lower range/resolution | LocalDate.parse(String)
java.time.LocalTime | - | Possibly lower range/resolution | LocalTime.parse(String)
java.time.LocalDateTime | - | Possibly lower range/resolution | LocalDateTime.parse(String)
java.time.OffsetDateTime | - | Possibly lower range/resolution | OffsetDateTime.parse(String)
java.time.OffsetTime | - | Possibly lower range/resolution | OffsetTime.parse(String)
java.time.ZonedDateTime | - | Possibly lower range/resolution | ZonedDateTime.parse(String)
java.time.ZoneId | getId() as a java.lang.String | - | ZoneId.of(String)
java.time.ZoneOffset | getTotalSeconds() as a java.lang.Integer | - | ZoneOffset.of(String)
java.time.Period | A formatted java.lang.String: <years on 11 characters><months on 11 characters><days on 11 characters> | - | Period.parse(String)
java.time.Duration | toNanos() as a java.lang.Long | Possibly lower range/resolution | Duration.parse(String)
java.time.Year | - | Possibly lower range/resolution | Year.parse(String)
java.time.YearMonth | - | Possibly lower range/resolution | YearMonth.parse(String)
java.time.MonthDay | - | - | MonthDay.parse(String)
java.util.UUID | toString() as a java.lang.String | - | UUID.fromString(String)
java.util.Calendar | A java.time.ZonedDateTime representing the same date/time and timezone. | See Support for legacy java.util date/time APIs. | ZonedDateTime.parse(String)
java.util.Date | Instant.ofEpochMilli(long) as a java.time.Instant. | See Support for legacy java.util date/time APIs. | Instant.parse(String)
java.sql.Timestamp | Instant.ofEpochMilli(long) as a java.time.Instant. | See Support for legacy java.util date/time APIs. | Instant.parse(String)
java.sql.Date | Instant.ofEpochMilli(long) as a java.time.Instant. | See Support for legacy java.util date/time APIs. | Instant.parse(String)
java.sql.Time | Instant.ofEpochMilli(long) as a java.time.Instant. | See Support for legacy java.util date/time APIs. | Instant.parse(String)
GeoPoint and subtypes | - | - | Latitude as double and longitude as double, separated by a comma. Ex: 41.8919, 12.51133.
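A few rows of the table can be reproduced with the JDK alone, which helps when choosing an indexNullAs string. The sketch below parses a string with the method named in the last column and then derives the index value named in the second column. The class and method names are hypothetical, and the sign and zero-padding layout used for Period is an assumption: the table only states that each component takes 11 characters.

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZoneOffset;
import java.util.Locale;

final class IndexValueSketch {

    // ZoneOffset: parsed with ZoneOffset.of(String), indexed as getTotalSeconds().
    static int zoneOffsetValue(String text) {
        return ZoneOffset.of(text).getTotalSeconds();
    }

    // Duration: parsed with Duration.parse(String), indexed as toNanos().
    static long durationValue(String text) {
        return Duration.parse(text).toNanos();
    }

    // Period: parsed with Period.parse(String), indexed as a fixed-width string.
    // Assumption: each component is rendered with a sign and zero-padded to 11 characters.
    static String periodValue(String text) {
        Period period = Period.parse(text);
        return String.format(Locale.ROOT, "%+011d%+011d%+011d",
                period.getYears(), period.getMonths(), period.getDays());
    }

    public static void main(String[] args) {
        System.out.println(zoneOffsetValue("+01:00"));
        System.out.println(durationValue("PT1.5S"));
        System.out.println(periodValue("P1Y2M3D"));
    }
}
```

A fixed-width representation such as the one assumed here keeps every Period value the same length, which is what the "on 11 characters" wording in the table suggests.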
Range and resolution of date/time fields
With a few exceptions, most date and time values are passed as-is to the backend; e.g. a LocalDateTime property would be passed as a LocalDateTime to the backend.
Internally, however, the Lucene and Elasticsearch backends use a different representation of date/time types. As a result, date and time fields stored in the index may have a smaller range and resolution than the corresponding Java type.
The documentation of each backend provides more information: see Lucene field types or Elasticsearch field types, depending on your backend.
10.5.5. Support for legacy java.util date/time APIs
不建议使用 java.util.Calendar 、 java.util.Date 、 java.sql.Timestamp 、 java.sql.Date 、 java.sql.Time 等旧版日期/时间类型,因为它们有很多怪异之处且存在不足之处。一般来说,应该优先考虑在 Java 8 中引入的 java.time 包。
Using legacy date/time types such as java.util.Calendar, java.util.Date, java.sql.Timestamp, java.sql.Date, java.sql.Time is not recommended, due to their numerous quirks and shortcomings. The java.time package introduced in Java 8 should generally be preferred.
话虽如此,集成约束可能会迫使您依赖于旧版日期/时间 API,这就是为什么 Hibernate Search 仍会尽力支持它们。
That being said, integration constraints may force you to rely on the legacy date/time APIs, which is why Hibernate Search still attempts to support them on a best effort basis.
由于 Hibernate Search 使用 java.time API 在内部表示日期/时间,因此在可以索引过时的日期/时间类型之前需要对其进行转换。Hibernate Search 保持简单:java.util.Date,java.util.Calendar 等将使用其时间值(自纪元以来的毫秒数)进行转换,假定它在 Java 8 API 中表示相同的日期/时间。在 java.util.Calendar 的情况下,时区信息将保留用于投影。
Since Hibernate Search uses the java.time APIs to represent date/time internally, the legacy date/time types need to be converted before they can be indexed. Hibernate Search keeps things simple: java.util.Date, java.util.Calendar, etc. will be converted using their time-value (number of milliseconds since the epoch), which will be assumed to represent the same date/time in Java 8 APIs. In the case of java.util.Calendar, timezone information will be preserved for projections.
对于 1900 年后的所有日期,这将按预期的那样工作。
For all dates after 1900, this will work exactly as expected.
在 1900 年之前,通过 Hibernate Search API 索引和搜索也会像预期的那样工作,但是如果您需要以本机方式访问索引,例如通过对 Elasticsearch 服务器进行直接 HTTP 调用,您会注意到索引值略有“偏差”。这是由于 java.time 和旧版日期/时间 API 的实现不同,导致在解释时间值(自历元以来的毫秒数)时出现细微差异。
Before 1900, indexing and searching through Hibernate Search APIs will also work as expected, but if you need to access the index natively, for example through direct HTTP calls to an Elasticsearch server, you will notice that the indexed values are slightly "off". This is caused by differences in the implementation of java.time and legacy date/time APIs which lead to slight differences in the interpretation of time-values (number of milliseconds since the epoch).
“偏移量”是一致的:在构建谓词时它们也将发生,并且在投影时将向相反方向发生。结果,仅依赖于 Hibernate Search API 的应用程序将看不到差异。然而,原生访问索引时会看到差异。
The "drifts" are consistent: they will also happen when building a predicate, and they will happen in the opposite direction when projecting. As a result, the differences will not be visible from an application relying on the Hibernate Search APIs exclusively. They will, however, be visible when accessing indexes natively.
绝大多数用例中,这不会带来问题。如果此行为对于您的应用程序不可接受,然后应该考虑实现自定义 value bridges 并指示 Hibernate Search 默认为 java.util.Date, java.util.Calendar 等使用它们:请参阅 Assigning default bridges with the bridge resolver。
For the large majority of use cases, this will not be a problem. If this behavior is not acceptable for your application, you should look into implementing custom value bridges and instructing Hibernate Search to use them by default for java.util.Date, java.util.Calendar, etc.: see Assigning default bridges with the bridge resolver.
Technically, conversions are difficult because the java.time APIs and the legacy date/time APIs do not have the same internal calendar.
In particular:
- java.time assumes a "Local Mean Time" before 1900, while legacy date/time APIs do not support it (JDK-6281408). As a result, time values (number of milliseconds since the epoch) reported by the two APIs will be different for dates before 1900.
- java.time uses a proleptic Gregorian calendar before October 15, 1582, meaning it acts as if the Gregorian calendar, along with its system of leap years, had always existed. Legacy date/time APIs, on the other hand, use the Julian calendar before that date (by default), meaning the leap years are not exactly the same. As a result, some dates that are deemed valid by one API will be deemed invalid by the other, for example February 29, 1500.
Those are the two main problems, but there may be others.
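The calendar mismatch described above can be observed with the JDK alone. The following sketch checks whether a given date is accepted by java.time and by the legacy java.util.GregorianCalendar in non-lenient mode; the class and method names are hypothetical and the code does not involve Hibernate Search.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.GregorianCalendar;

final class CalendarMismatch {

    // True if java.time (proleptic Gregorian calendar) accepts the date.
    static boolean validInJavaTime(int year, int month, int day) {
        try {
            LocalDate.of(year, month, day);
            return true;
        } catch (DateTimeException e) {
            return false;
        }
    }

    // True if the legacy calendar (Julian before October 15, 1582 by default) accepts the date.
    static boolean validInLegacyCalendar(int year, int month, int day) {
        GregorianCalendar calendar = new GregorianCalendar();
        calendar.clear();
        calendar.setLenient(false);
        calendar.set(year, month - 1, day); // legacy months are zero-based
        try {
            calendar.getTime(); // forces validation of the fields set above
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("1500-02-29 valid in java.time: " + validInJavaTime(1500, 2, 29));
        System.out.println("1500-02-29 valid in legacy API: " + validInLegacyCalendar(1500, 2, 29));
    }
}
```

Year 1500 is a leap year under Julian rules (divisible by 4) but not under Gregorian rules (divisible by 100 and not by 400), which is why the two APIs disagree on February 29, 1500.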
10.5.6. Mapping custom property types
Even types that are not supported out of the box can be mapped. There are various solutions, some simple and some more powerful, but they all come down to extracting data from the unsupported type and converting it to types that are supported by the backend.
There are two cases to distinguish:
- If the unsupported type is simply a container (List<String>) or multiple nested containers (Map<Integer, List<String>>) whose elements have a supported type, then what you need is a container extractor. See Mapping container types with container extractors for more information.
- Otherwise, you will have to rely on a custom component, called a bridge, to extract data from your type. See Binding and bridges for more information on custom bridges.
10.5.7. Programmatic mapping
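You can also map properties of an entity to index fields directly through programmatic mapping. Behavior and options are identical to annotation-based mapping.
Example 35. Mapping properties to fields directly with .genericField(), .fullTextField(), …
import org.hibernate.search.engine.backend.types.Projectable;
import org.hibernate.search.engine.backend.types.Sortable;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.TypeMappingStep;

// "mapping" is assumed to be the programmatic mapping context obtained in a mapping configurer.
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
// Map "title" to a full-text field and to a keyword field used for sorting.
bookMapping.property( "title" )
        .fullTextField()
                .analyzer( "english" ).projectable( Projectable.YES )
        .keywordField( "title_sort" )
                .normalizer( "english" ).sortable( Sortable.YES );
// Map "pageCount" to a generic field that is both projectable and sortable.
bookMapping.property( "pageCount" )
        .genericField().projectable( Projectable.YES ).sortable( Sortable.YES );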
10.6. Mapping associated elements with @IndexedEmbedded
10.6.1. Basics
仅使用 @Indexed 与 @*Field 注解相结合即可对实体及其直接属性进行索引,这种方式简单易用。实际模型将包含多个对象类型,相互之间持有引用,如以下示例中的 authors 关联。
Using only @Indexed combined with @*Field annotations allows indexing an entity and its direct properties, which is nice but simplistic. A real-world model will include multiple object types holding references to one another, like the authors association in the example below.
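Example 36. A multi-entity model with associations
// jakarta.persistence is assumed here; older versions of Hibernate ORM use javax.persistence.
import java.util.ArrayList;
import java.util.List;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToMany;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;

@Entity
@Indexed // (1) The Book entity is indexed.
public class Book {
    @Id
    private Integer id;

    @FullTextField(analyzer = "english") // (2) The title is mapped to an index field.
    private String title;

    @ManyToMany
    private List<Author> authors = new ArrayList<>(); // (3) The association to Author is not indexed yet.

    public Book() {
    }

    // Getters and setters
    // ...
}

@Entity
public class Author {
    @Id
    private Integer id;

    private String name;

    @ManyToMany(mappedBy = "authors")
    private List<Book> books = new ArrayList<>();

    public Author() {
    }

    // Getters and setters
    // ...
}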
在搜索书籍时,用户可能需要按作者姓名进行搜索。在高性能索引领域,跨索引联接代价高昂,通常不是一种选择。解决此类用例的最佳方式通常是复制数据:在索引 Book 时,只需将所有作者的姓名复制到 Book 文档中。
When searching for a book, users will likely need to search by author name. In the world of high-performance indexes, cross-index joins are costly and usually not an option. The best way to address such use cases is generally to copy data: when indexing a Book, just copy the name of all its authors into the Book document.
这就是 @IndexedEmbedded 的作用:它指示 Hibernate Search 将关联对象的字段 embed 到主对象中。在以下示例中,它将指示 Hibernate Search 将 Author 中定义的 name 字段嵌入到 Book 中,创建 authors.name 字段。
That’s what @IndexedEmbedded does: it instructs Hibernate Search to embed the fields of an associated object into the main object. In the example below, it will instruct Hibernate Search to embed the name field defined in Author into Book, creating the field authors.name.
@IndexedEmbedded can be used on Hibernate ORM’s @Embedded properties as well as associations (@OneToOne, @OneToMany, @ManyToMany, …).
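Example 37. Using @IndexedEmbedded to index associated elements
// jakarta.persistence is assumed here; older versions of Hibernate ORM use javax.persistence.
import java.util.ArrayList;
import java.util.List;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToMany;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IndexedEmbedded;

@Entity
@Indexed
public class Book {
    @Id
    private Integer id;

    @FullTextField(analyzer = "english")
    private String title;

    @ManyToMany
    @IndexedEmbedded // (1) Embed the index fields of Author into the Book document.
    private List<Author> authors = new ArrayList<>();

    public Book() {
    }

    // Getters and setters
    // ...
}

@Entity
public class Author {
    @Id
    private Integer id;

    @FullTextField(analyzer = "name") // (2) Embedded into Book as the field authors.name.
    private String name;

    @ManyToMany(mappedBy = "authors")
    private List<Book> books = new ArrayList<>();

    public Author() {
    }

    // Getters and setters
    // ...
}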
Document identifiers are not index fields. Consequently, they will be ignored by @IndexedEmbedded.
To embed another entity’s identifier with @IndexedEmbedded, map that identifier to a field explicitly using @GenericField or another @*Field annotation.
当 @IndexedEmbedded 应用于关联(即引用实体的属性(如上例))时,关联必须是双向的。否则,Hibernate Search 会在启动时抛出异常。
When @IndexedEmbedded is applied to an association, i.e. to a property that refers to entities (like the example above), the association must be bidirectional. Otherwise, Hibernate Search will throw an exception on startup.
请参阅 Reindexing when embedded elements change 以了解此限制背后的原因以及规避它的方法。
See Reindexing when embedded elements change for the reasons behind this restriction and ways to circumvent it.
可以在多个级别上嵌套索引嵌入;例如,你可以决定对作者的出生地的索引嵌入,以便可以专门搜索由俄罗斯作者编写的书籍:
Index-embedding can be nested on multiple levels; for example you can decide to index-embed the place of birth of authors, to be able to search for books written by Russian authors exclusively:
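Example 38. Nesting multiple @IndexedEmbedded
// jakarta.persistence is assumed here; older versions of Hibernate ORM use javax.persistence.
import java.util.ArrayList;
import java.util.List;
import jakarta.persistence.Embeddable;
import jakarta.persistence.Embedded;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToMany;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IndexedEmbedded;

@Entity
@Indexed
public class Book {
    @Id
    private Integer id;

    @FullTextField(analyzer = "english")
    private String title;

    @ManyToMany
    @IndexedEmbedded // (1) Embed the index fields of Author into the Book document.
    private List<Author> authors = new ArrayList<>();

    public Book() {
    }

    // Getters and setters
    // ...
}

@Entity
public class Author {
    @Id
    private Integer id;

    @FullTextField(analyzer = "name") // (2) Embedded into Book as authors.name.
    private String name;

    @Embedded
    @IndexedEmbedded // (3) Embed the index fields of Address into Author, and transitively into Book.
    private Address placeOfBirth;

    @ManyToMany(mappedBy = "authors")
    private List<Book> books = new ArrayList<>();

    public Author() {
    }

    // Getters and setters
    // ...
}

@Embeddable
public class Address {
    @FullTextField(analyzer = "name") // (4) Embedded into Book as authors.placeOfBirth.country.
    private String country;

    private String city;

    private String street;

    public Address() {
    }

    // Getters and setters
    // ...
}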
默认情况下, @IndexedEmbedded 将嵌套在索引嵌入类型中遇到的其他 @IndexedEmbedded 中,递归且无任何限制,这可能会导致无限递归。
By default, @IndexedEmbedded will nest other @IndexedEmbedded encountered in the indexed-embedded type recursively, without any sort of limit, which can cause infinite recursion.
To address this, see Filtering embedded fields and breaking @IndexedEmbedded cycles.
10.6.2. @IndexedEmbedded and null values
当 @IndexedEmbedded 针对的属性包含 null 元素时,这些元素根本不会被索引。
When properties targeted by an @IndexedEmbedded contain null elements, these elements are simply not indexed.
Unlike the field annotations described in Mapping a property to an index field with @GenericField, @FullTextField, …, @IndexedEmbedded has no indexNullAs feature to index a specific value for null objects. However, you can take advantage of the exists predicate in search queries to look for documents where a given @IndexedEmbedded has or does not have a value: simply pass the name of the object field to the exists predicate, for example authors in the example above.
10.6.3. @IndexedEmbedded on container types
当 @IndexedEmbedded 针对的属性具有容器类型(List、Optional、Map、…)时,将嵌入最内层元素。例如,对于类型为 List<MyEntity> 的属性,将嵌入类型为 MyEntity 的元素。
When properties targeted by an @IndexedEmbedded have a container type (List, Optional, Map, …), the innermost elements will be embedded. For example for a property of type List<MyEntity>, elements of type MyEntity will be embedded.
此默认行为和如何覆盖它的方法在部分 Mapping container types with container extractors中进行了说明。
This default behavior and ways to override it are described in the section Mapping container types with container extractors.
10.6.4. Setting the object field name with name
默认情况下,@IndexedEmbedded 将创建与带注解的属性同名的对象字段,并将嵌入字段添加到该对象字段。因此,如果将 @IndexedEmbedded 应用于 Book 实体中的名为 authors 的属性,则在索引 Book 时,作者的 name 索引字段将被复制到 authors.name 索引字段。
By default, @IndexedEmbedded will create an object field with the same name as the annotated property, and will add embedded fields to that object field. So if @IndexedEmbedded is applied to a property named authors in a Book entity, the index field name of the authors will be copied to the index field authors.name when Book is indexed.
可以通过设置 name 属性来更改对象字段的名称;例如在上述示例中使用 @IndexedEmbedded(name = "allAuthors") 将导致作者的名称被复制到 allAuthors.name 索引字段,而不是 authors.name。
It is possible to change the name of the object field by setting the name attribute; for example using @IndexedEmbedded(name = "allAuthors") in the example above will result in the name of authors being copied to the index field allAuthors.name instead of authors.name.
The name must not contain the dot character (.).
10.6.5. Setting the field name prefix with prefix
@IndexedEmbedded 中的 prefix 属性已被废弃,并最终将被删除。请改用 name 。
The prefix attribute in @IndexedEmbedded is deprecated and will ultimately be removed. Use name instead.
By default, @IndexedEmbedded will prepend the name of embedded fields with the name of the property it is applied to followed by a dot. So if @IndexedEmbedded is applied to a property named authors in a Book entity, the name field of the authors will be copied to the authors.name field when Book is indexed.
可以通过设置 prefix 属性来更改此前缀,例如 @IndexedEmbedded(prefix = "author.")(不要忘记结尾的点!)。
It is possible to change this prefix by setting the prefix attribute, for example @IndexedEmbedded(prefix = "author.") (do not forget the trailing dot!).
The prefix should generally be a sequence of non-dot characters ending with a single dot, for example my_Property. (note the trailing dot).
将前缀更改为结尾不包含任何点的字符串( my_Property ),或包含点但不在末尾的字符串( my.Property. ),将导致复杂、未记录的旧版行为。自行承担风险。
Changing the prefix to a string that does not include any dot at the end (my_Property), or that includes a dot anywhere but at the very end (my.Property.), will lead to complex, undocumented, legacy behavior. Do this at your own risk.
特别是,前缀不以点结尾将导致 some APIs exposed to custom bridges 出现不正确行为:接受字段名称的 addValue / addObject 方法。
In particular, a prefix that does not end with a dot will lead to incorrect behavior in some APIs exposed to custom bridges: the addValue/addObject methods that accept a field name.
10.6.6. Casting the target of @IndexedEmbedded with targetType
默认情况下,自动使用反射检测已索引嵌入值类型,如有必要,还会考虑 container extraction;例如,将检测 @IndexedEmbedded List<MyEntity> 的值为类型 MyEntity。将从值的类型及其超类型的映射中推断要嵌入的字段;在示例中,将会考虑在 MyEntity 及其超类上显示的 @GenericField 注释,但这将忽略在其子类中定义的注释。
By default, the type of indexed-embedded values is detected automatically using reflection, taking into account container extraction if relevant; for example @IndexedEmbedded List<MyEntity> will be detected as having values of type MyEntity. Fields to be embedded will be inferred from the mapping of the value type and its supertypes; in the example, @GenericField annotations present on MyEntity and its superclasses will be taken into account, but annotations defined in its subclasses will be ignored.
如果由于某种原因,模式未公开属性的正确类型(例如,原始 List 或 List<MyEntityInterface> 而不是 List<MyEntityImpl>),可以通过在 @IndexedEmbedded 中设置 targetType 属性来定义预期的值类型。在引导期间,Hibernate Search 随后将基于给定的目标类型解析要嵌入的字段,并在运行时将值强制转换为给定的目标类型。
If for some reason a schema does not expose the correct type for a property (e.g. a raw List, or List<MyEntityInterface> instead of List<MyEntityImpl>) it is possible to define the expected type of values by setting the targetType attribute in @IndexedEmbedded. On bootstrap, Hibernate Search will then resolve fields to be embedded based on the given target type, and at runtime it will cast values to the given target type.
将已编入索引的嵌入式值强制转换为指定类型的失败将传播出去并导致索引失败。
Failures to cast indexed-embedded values to the designated type will be propagated and lead to indexing failure.
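The runtime cast described above behaves much like Class.cast in plain Java. The sketch below is only an analogy, not Hibernate Search code: all type and method names are hypothetical, and it simply shows how a value that does not match the target type makes the whole operation fail.

```java
import java.util.ArrayList;
import java.util.List;

final class TargetTypeCastSketch {

    interface MyEntityInterface {
    }

    static final class MyEntityImpl implements MyEntityInterface {
    }

    static final class OtherImpl implements MyEntityInterface {
    }

    // Casts every value to the target type; a mismatch throws ClassCastException,
    // which propagates to the caller, just as a failed cast propagates to indexing.
    static <T> List<T> castAll(List<?> values, Class<T> targetType) {
        List<T> result = new ArrayList<>();
        for (Object value : values) {
            result.add(targetType.cast(value));
        }
        return result;
    }

    public static void main(String[] args) {
        List<MyEntityInterface> values = List.of(new MyEntityImpl(), new MyEntityImpl());
        System.out.println(castAll(values, MyEntityImpl.class).size());
    }
}
```

In mapping terms, the analogous declaration would be @IndexedEmbedded(targetType = MyEntityImpl.class) on a property declared as List<MyEntityInterface>.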
10.6.7. Reindexing when embedded elements change
当“嵌入”实体发生更改时,Hibernate Search 将处理“嵌入”实体的重新索引。
When the "embedded" entity changes, Hibernate Search will handle reindexing of the "embedding" entity.
只要关联 @IndexedEmbedded 应用于双向(使用 Hibernate ORM 的 mappedBy),这种方式在大多数情况下都将透明地进行。
This will work transparently most of the time, as long as the association @IndexedEmbedded is applied to is bidirectional (uses Hibernate ORM’s mappedBy).
当 Hibernate Search 无法处理关联时,它将在引导时引发异常。如果发生这种情况,请参阅 Basics了解更多信息。
When Hibernate Search is unable to handle an association, it will throw an exception on bootstrap. If this happens, refer to Basics to know more.
10.6.8. Embedding the entity identifier
Mapping a property as an identifier in the indexed-embedded type will not automatically result in it being embedded when using @IndexedEmbedded on that type, because document identifiers are not fields.
若要内嵌此类属性的数据,可使用 @IndexedEmbedded(includeEmbeddedObjectId = true),而它将使 Hibernate 搜索在生成的内嵌对象中自动插入一个字段,用于索引内嵌类型 的 identifier property。
To embed the data of such a property, you can use @IndexedEmbedded(includeEmbeddedObjectId = true), which will have Hibernate Search automatically insert a field in the resulting embedded object for the indexed-embedded type’s identifier property.
The index field will be defined as if the following field annotation was put on the identifier property of the embedded type: @GenericField(searchable = Searchable.YES, projectable = Projectable.YES). The name of the index field will be the name of the identifier property. By default, its bridge will be the identifier bridge referenced by the embedded type’s @DocumentId annotation, if any, or otherwise the default value bridge for the type of the identifier property.
If you need more advanced mapping (custom name, custom bridge, sortable, …), do not use includeEmbeddedObjectId.
Instead, define the field explicitly in the indexed-embedded type by annotating the identifier property with @GenericField or a similar field annotation, and make sure the field is included by @IndexedEmbedded by configuring filters as necessary.
下面是使用 includeEmbeddedObjectId 的示例:
Below is an example of using includeEmbeddedObjectId:
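Example 39. Including indexed-embedded IDs with includeEmbeddedObjectId
// jakarta.persistence is assumed here; older versions of Hibernate ORM use javax.persistence.
import java.util.ArrayList;
import java.util.List;
import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToOne;
import jakarta.persistence.OneToMany;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IndexedEmbedded;

@Entity
public class Department {
    @Id
    private Integer id; // The identifier property: not an index field by itself.

    @FullTextField
    private String name;

    @OneToMany(mappedBy = "department")
    private List<Employee> employees = new ArrayList<>();

    // Getters and setters
    // ...
}

@Entity
@Indexed
public class Employee {
    @Id
    private Integer id;

    @FullTextField
    private String name;

    @ManyToOne
    // Also embed the Department identifier, as a field named after the identifier property (id).
    @IndexedEmbedded(includeEmbeddedObjectId = true)
    private Department department;

    // Getters and setters
    // ...
}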
10.6.9. Filtering embedded fields and breaking @IndexedEmbedded cycles
默认情况下,@IndexedEmbedded 将“嵌入”一切:索引嵌入元素中遇到的每个字段,以及递归地遇到索引嵌入元素中的每个 @IndexedEmbedded。
By default, @IndexedEmbedded will "embed" everything: every field encountered in the indexed-embedded element, and every @IndexedEmbedded encountered in the indexed-embedded element, recursively.
对于简单的用例,这将非常适用,但对于更复杂的模型可能会导致问题:
This will work just fine for simpler use cases, but may lead to problems for more complex models:
- If the indexed-embedded element declares many index fields (Hibernate Search fields), only some of which are actually useful to the "index-embedding" type, the extra fields will decrease indexing performance needlessly.
- If there is a cycle of @IndexedEmbedded (e.g. A index-embeds b of type B, which index-embeds a of type A) the index-embedding type will end up with an infinite number of fields (a.b.someField, a.b.a.b.someField, a.b.a.b.a.b.someField, …), which Hibernate Search will detect and reject with an exception.
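The cycle problem can be pictured as a graph search. The sketch below is a plain-Java illustration and not Hibernate Search's actual detection code: the class name EmbeddedCycleSketch and the map of type names are hypothetical, and the model assumes that no filter is applied to any @IndexedEmbedded.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Illustrative model only: types are plain strings, and embeds.get("A")
// lists the types that A index-embeds without any filter.
final class EmbeddedCycleSketch {

    static boolean hasCycle(Map<String, List<String>> embeds, String rootType) {
        return visit(embeds, rootType, new ArrayDeque<>());
    }

    private static boolean visit(Map<String, List<String>> embeds, String type, Deque<String> path) {
        if (path.contains(type)) {
            // The same type appears twice on one embedding path:
            // the set of field paths would grow without bound.
            return true;
        }
        path.push(type);
        for (String embedded : embeds.getOrDefault(type, List.of())) {
            if (visit(embeds, embedded, path)) {
                return true;
            }
        }
        path.pop();
        return false;
    }
}
```

With A embedding B and B embedding A, hasCycle reports a cycle; a "diamond" (A embeds B and C, which both embed D) is not a cycle, because D never appears twice on the same path.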
要解决这些问题,可以过滤待嵌入的字段,仅包括实际有用的字段。@IndexedEmbedded 中可用的过滤属性为:
To address these problems, it is possible to filter the fields to embed, to only include those that are actually useful. Available filtering attributes on @IndexedEmbedded are:
includePaths
应嵌入的索引嵌入式元素的索引字段的路径。
The paths of index fields from the indexed-embedded element that should be embedded.
Provided paths must be relative to the indexed-embedded element, i.e. they must not include its name or prefix.
这优先于 includeDepth(见下文)。
This takes precedence over includeDepth (see below).
不能与同一 @IndexedEmbedded 中的 excludePaths 结合使用。
Cannot be used in combination with excludePaths in the same @IndexedEmbedded.
excludePaths
不得嵌入的索引嵌入式元素的索引字段的路径。
The paths of index fields from the indexed-embedded element that must not be embedded.
Provided paths must be relative to the indexed-embedded element, i.e. they must not include its name or prefix.
这优先于 includeDepth(见下文)。
This takes precedence over includeDepth (see below).
不能与同一 @IndexedEmbedded 中的 includePaths 结合使用。
Cannot be used in combination with includePaths in the same @IndexedEmbedded.
includeDepth
默认包含其全部字段的索引嵌入层级数。
The number of levels of indexed-embedded that will have all their fields included by default.
includeDepth 是将遍历的 @IndexedEmbedded 的数量,即使未通过 includePaths 显式包含这些字段(除非通过 excludePaths 显式排除这些字段),索引嵌入元素的所有字段也将被包含:
includeDepth is the number of @IndexedEmbedded that will be traversed and for which all fields of the indexed-embedded element will be included, even if these fields are not included explicitly through includePaths, unless these fields are excluded explicitly through excludePaths:
includeDepth=0 表示不会包含此索引嵌入式元素的字段,也不会包含任何嵌套索引嵌入式元素的字段,除非通过 includePaths 明确包含这些字段。
includeDepth=0 means that fields of this indexed-embedded element are not included, nor is any field of nested indexed-embedded elements, unless these fields are included explicitly through includePaths.
includeDepth=1 表示会包含此索引嵌入式元素的字段,除非通过 excludePaths 明确排除这些字段,但嵌套索引嵌入式元素的字段除外( @IndexedEmbedded 中的 @IndexedEmbedded ),除非通过 includePaths 明确包含这些字段。
includeDepth=1 means that fields of this indexed-embedded element are included, unless these fields are excluded explicitly through excludePaths, but not fields of nested indexed-embedded elements (@IndexedEmbedded within this @IndexedEmbedded), unless these fields are included explicitly through includePaths.
includeDepth=2 表示会包含此索引嵌入式元素的字段,以及立即嵌套的 @IndexedEmbedded 中的 @IndexedEmbedded 的索引嵌入式元素的字段,除非通过 excludePaths 明确排除这些字段,但不会包含嵌套级别更低 @IndexedEmbedded 中的 @IndexedEmbedded 的索引嵌入式元素的字段,除非通过 includePaths 明确包含这些字段。
includeDepth=2 means that fields of this indexed-embedded element and fields of the immediately nested indexed-embedded (@IndexedEmbedded within this @IndexedEmbedded) elements are included, unless these fields are explicitly excluded through excludePaths, but not fields of nested indexed-embedded elements beyond that (@IndexedEmbedded within an @IndexedEmbedded within this @IndexedEmbedded), unless these fields are included explicitly through includePaths.
依此类推。
And so on.
默认值取决于 includePaths 属性的值:
The default value depends on the value of the includePaths attribute:
如果 includePaths 为空,则默认为 Integer.MAX_VALUE(包含每个级别的所有字段)
if includePaths is empty, the default is Integer.MAX_VALUE (include all fields at every level)
如果 includePaths 不为空,则默认值为 0 (仅包括明确包含的字段)。
if includePaths is not empty, the default is 0 (only include fields included explicitly).
动态字段和过滤:动态字段(Dynamic fields)不会直接受过滤规则影响;动态字段当且仅当它的父级被包含时才会被包含。
Dynamic fields and filtering: dynamic fields are not directly affected by filtering rules; a dynamic field will be included if and only if its parent is included.
这意味着实际上 includeDepth 和 includePaths 约束只需要匹配动态字段的最近静态父级,该字段才会被包括。
This means in particular that includeDepth and includePaths constraints only need to match the nearest static parent of a dynamic field in order for that field to be included.
在不同的嵌套层级混合使用 includePaths 和 excludePaths:通常可以在嵌套的 @IndexedEmbedded 的不同层级中使用 includePaths 和 excludePaths 。在这样做时,请记住每个层级的过滤器只能引用可达路径,即过滤器无法引用被嵌套 @IndexedEmbedded (隐式或显式)排除的路径。
Mixing includePaths and excludePaths at different nesting levels: in general, it is possible to use includePaths and excludePaths at different levels of nested @IndexedEmbedded. When doing so, keep in mind that the filter at each level can only reference reachable paths, i.e. a filter cannot reference a path that was excluded by a nested @IndexedEmbedded (implicitly or explicitly).
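The rules above can be condensed into a small decision function. The following is a simplified plain-Java model, not Hibernate Search's implementation: the class EmbeddedFilterSketch is hypothetical, it only decides on leaf field paths, it assumes every dot in a relative path crosses one nested @IndexedEmbedded, and it ignores dynamic fields.

```java
import java.util.List;

// Simplified model of @IndexedEmbedded filtering (illustration only).
final class EmbeddedFilterSketch {

    private final List<String> includePaths;
    private final List<String> excludePaths;
    private final int includeDepth;

    // A null includeDepth means "attribute not set": the documented default applies.
    EmbeddedFilterSketch(List<String> includePaths, List<String> excludePaths, Integer includeDepth) {
        if (!includePaths.isEmpty() && !excludePaths.isEmpty()) {
            throw new IllegalArgumentException("includePaths and excludePaths cannot be combined");
        }
        this.includePaths = includePaths;
        this.excludePaths = excludePaths;
        this.includeDepth = includeDepth != null
                ? includeDepth
                : (includePaths.isEmpty() ? Integer.MAX_VALUE : 0);
    }

    // relativePath is relative to the indexed-embedded element, e.g. "parents.name".
    boolean isIncluded(String relativePath) {
        if (includePaths.contains(relativePath)) {
            return true; // explicit inclusion takes precedence over includeDepth
        }
        for (String excluded : excludePaths) {
            if (relativePath.equals(excluded) || relativePath.startsWith(excluded + ".")) {
                return false; // explicit exclusion takes precedence over includeDepth
            }
        }
        // Level 1 = fields of this element, level 2 = fields of directly nested elements, etc.
        int level = relativePath.split("\\.").length;
        return level <= includeDepth;
    }
}
```

For instance, with includeDepth = 2 and includePaths = { "parents.parents.name" }, the model accepts "parents.nickname" (level 2) and "parents.parents.name" (explicitly included), but rejects "parents.parents.nickname" (level 3, not listed).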
下面有三个示例:一个仅利用 includePaths,一个利用 excludePaths,一个利用 includePaths 和 includeDepth。
Below are three examples: one leveraging includePaths only, one leveraging excludePaths, and one leveraging includePaths and includeDepth.
Example 40. Filtering indexed-embedded fields with includePaths
@Entity
@Indexed
public class Human {
@Id
private Integer id;
@FullTextField(analyzer = "name")
private String name;
@FullTextField(analyzer = "name")
private String nickname;
@ManyToMany
@IndexedEmbedded(includePaths = { "name", "nickname", "parents.name" })
private List<Human> parents = new ArrayList<>();
@ManyToMany(mappedBy = "parents")
private List<Human> children = new ArrayList<>();
public Human() {
}
// Getters and setters
// ...
}
Example 41. Filtering indexed-embedded fields with excludePaths
@Entity
@Indexed
public class Human {
@Id
private Integer id;
@FullTextField(analyzer = "name")
private String name;
@FullTextField(analyzer = "name")
private String nickname;
@ManyToMany
@IndexedEmbedded(excludePaths = { "parents.nickname", "parents.parents" })
private List<Human> parents = new ArrayList<>();
@ManyToMany(mappedBy = "parents")
private List<Human> children = new ArrayList<>();
public Human() {
}
// Getters and setters
// ...
}
Example 42. Filtering indexed-embedded fields with includePaths and includeDepth
@Entity
@Indexed
public class Human {
@Id
private Integer id;
@FullTextField(analyzer = "name")
private String name;
@FullTextField(analyzer = "name")
private String nickname;
@ManyToMany
@IndexedEmbedded(includeDepth = 2, includePaths = { "parents.parents.name" })
private List<Human> parents = new ArrayList<>();
@ManyToMany(mappedBy = "parents")
private List<Human> children = new ArrayList<>();
public Human() {
}
// Getters and setters
// ...
}
10.6.10. Structuring embedded elements as nested documents using structure
索引嵌入字段可以用 @IndexedEmbedded 注释的 structure 属性配置的两种方式之一进行组织。为了说明结构选项,让我们假设类 Book 带 @Indexed 注释,且其 authors 属性带 @IndexedEmbedded 注释:
Indexed-embedded fields can be structured in one of two ways, configured through the structure attribute of the @IndexedEmbedded annotation. To illustrate structure options, let’s assume the class Book is annotated with @Indexed and its authors property is annotated with @IndexedEmbedded:
- Book instance
  - title = Leviathan Wakes
  - authors =
    - Author instance
      - firstName = Daniel
      - lastName = Abraham
    - Author instance
      - firstName = Ty
      - lastName = Frank
DEFAULT or FLATTENED structure
默认情况下,或在使用 @IndexedEmbedded(structure = FLATTENED)(如下所示)时,索引嵌入字段“扁平化”,这意味着树结构不会被保留。
By default, or when using @IndexedEmbedded(structure = FLATTENED) as shown below, indexed-embedded fields are "flattened", meaning that the tree structure is not preserved.
Example 43. @IndexedEmbedded with a flattened structure
@Entity
@Indexed
public class Book {
@Id
private Integer id;
@FullTextField(analyzer = "english")
private String title;
@ManyToMany
@IndexedEmbedded(structure = ObjectStructure.FLATTENED) (1)
private List<Author> authors = new ArrayList<>();
public Book() {
}
// Getters and setters
// ...
}
@Entity
public class Author {
@Id
private Integer id;
@FullTextField(analyzer = "name")
private String firstName;
@FullTextField(analyzer = "name")
private String lastName;
@ManyToMany(mappedBy = "authors")
private List<Book> books = new ArrayList<>();
public Author() {
}
// Getters and setters
// ...
}
前面提到的图书实例的索引将包含与下述结构大致类似的结构:
The book instance mentioned earlier would be indexed with a structure roughly similar to this:
- Book document
  - title = Leviathan Wakes
  - authors.firstName = [Daniel, Ty]
  - authors.lastName = [Abraham, Frank]
authors.firstName 和 authors.lastName 字段已“扁平化”,现在每个字段都有两个值;哪个姓氏对应于哪个名字的知识已丢失。
The authors.firstName and authors.lastName fields were "flattened" and now each has two values; the knowledge of which last name corresponds to which first name has been lost.
对于索引编制和查询,这是更高效的,但在根据作者姓氏和作者名字同时查询索引时会产生意外的行为。
This is more efficient for indexing and querying, but can cause unexpected behavior when querying the index on both the author’s first name and the author’s last name.
例如,即使“Ty Abraham”并不是这本书的作者之一,上文描述的书籍实例仍会显示为与 authors.firstName:Ty AND authors.lastName:Abraham 等查询的匹配项:
For example, the book instance described above would show up as a match to a query such as authors.firstName:Ty AND authors.lastName:Abraham, even though "Ty Abraham" is not one of this book’s authors:
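Example 44. Searching with a flattened structure
List<Book> hits = searchSession.search( Book.class )
        .where( f -> f.and(
                // Require that at least one author has the first name "Ty"...
                f.match().field( "authors.firstName" ).matching( "Ty" ),
                // ...and that at least one author, not necessarily the same one, has the last name "Abraham".
                f.match().field( "authors.lastName" ).matching( "Abraham" )
        ) )
        .fetchHits( 20 );
// The book matches, even though "Ty Abraham" is not one of its authors.
assertThat( hits ).isNotEmpty();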
NESTED structure
当索引嵌入元素“嵌套”时,即在如下所示情况下使用 @IndexedEmbedded(structure = NESTED) 时,树形结构会通过透明地为每个索引嵌入元素创建一个独立的“嵌套”文档而得以保留。
When indexed-embedded elements are "nested", i.e. when using @IndexedEmbedded(structure = NESTED) as shown below, the tree structure is preserved by transparently creating one separate "nested" document for each indexed-embedded element.
Example 45. @IndexedEmbedded with a nested structure
@Entity
@Indexed
public class Book {
@Id
private Integer id;
@FullTextField(analyzer = "english")
private String title;
@ManyToMany
@IndexedEmbedded(structure = ObjectStructure.NESTED) (1)
private List<Author> authors = new ArrayList<>();
public Book() {
}
// Getters and setters
// ...
}
@Entity
public class Author {
@Id
private Integer id;
@FullTextField(analyzer = "name")
private String firstName;
@FullTextField(analyzer = "name")
private String lastName;
@ManyToMany(mappedBy = "authors")
private List<Book> books = new ArrayList<>();
public Author() {
}
// Getters and setters
// ...
}
前面提到的图书实例的索引将包含与下述结构大致类似的结构:
The book instance mentioned earlier would be indexed with a structure roughly similar to this:
- Book document
  - title = Leviathan Wakes
  - Nested documents
    - Nested document #1 for "authors"
      - authors.firstName = Daniel
      - authors.lastName = Abraham
    - Nested document #2 for "authors"
      - authors.firstName = Ty
      - authors.lastName = Frank
从本质上说,该图书编入三个文档索引:图书的根文档以及两个内部“嵌套”文档以供作者使用,保留了哪个姓氏对应于哪个名字的知识,代价是索引编制和查询的性能下降。
The book is effectively indexed as three documents: the root document for the book, and two internal, "nested" documents for the authors, preserving the knowledge of which last name corresponds to which first name at the cost of degraded performance when indexing and querying.
如果在嵌套文档中的字段上构建谓词时特别小心(使用 nested 谓词),则同时包含作者名字和作者姓氏谓词的查询会按照直觉预期的那样表现。
If special care is taken when building predicates on fields within nested documents, using a nested predicate, queries containing predicates on both the author’s first name and the author’s last name will behave as one would (intuitively) expect.
例如,得益于谓词 nested (仅能在使用 NESTED 结构进行索引时使用),上面描述的图书实例不会显示为匹配查询 authors.firstName:Ty AND authors.lastName:Abraham :
For example, the book instance described above would not show up as a match to a query such as authors.firstName:Ty AND authors.lastName:Abraham, thanks to the nested predicate (which can only be used when indexing with the NESTED structure):
示例 46. 使用嵌套结构搜索
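Example 46. Searching with a nested structure
List<Book> hits = searchSession.search( Book.class )
        // Create a nested predicate on the "authors" object field.
        .where( f -> f.nested( "authors" )
                // Both conditions must be satisfied by the same author.
                .add( f.match().field( "authors.firstName" ).matching( "Ty" ) )
                .add( f.match().field( "authors.lastName" ).matching( "Abraham" ) ) )
        .fetchHits( 20 );
// The book does not match, because none of its authors is named "Ty Abraham".
assertThat( hits ).isEmpty();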
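The difference between the two structures can be reproduced with plain Java collections. The sketch below is only an analogy, under stated assumptions: StructureSketch and its methods are hypothetical names, and exact string equality stands in for analyzed full-text matching.

```java
import java.util.ArrayList;
import java.util.List;

// Analogy for FLATTENED vs. NESTED matching (illustration only).
final class StructureSketch {

    static final class Author {
        final String firstName;
        final String lastName;

        Author(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // FLATTENED: all authors share two multi-valued fields, so the pairing
    // between a first name and a last name is lost.
    static boolean matchesFlattened(List<Author> authors, String firstName, String lastName) {
        List<String> firstNames = new ArrayList<>();
        List<String> lastNames = new ArrayList<>();
        for (Author author : authors) {
            firstNames.add(author.firstName);
            lastNames.add(author.lastName);
        }
        return firstNames.contains(firstName) && lastNames.contains(lastName);
    }

    // NESTED: each author is kept as a separate unit, so both conditions
    // must hold for the same author.
    static boolean matchesNested(List<Author> authors, String firstName, String lastName) {
        for (Author author : authors) {
            if (author.firstName.equals(firstName) && author.lastName.equals(lastName)) {
                return true;
            }
        }
        return false;
    }
}
```

For the book written by Daniel Abraham and Ty Frank, matchesFlattened accepts ("Ty", "Abraham") while matchesNested rejects it; both accept ("Ty", "Frank").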
10.6.11. Filtering association elements
有时,只有关联中的某些元素应当包含在 @IndexedEmbedded 中。
Sometimes, only some elements of an association should be included in an @IndexedEmbedded.
例如,一个 Book 实体可能会索引嵌入 BookEdition 实例,但一些版本可能会被停用,需要在索引编制之前过滤掉。
For example a Book entity might index-embed BookEdition instances, but some editions might be retired and thus need to be filtered out before indexing.
通过将 @IndexedEmbedded 应用到表示已过滤关联的瞬态 getter,并通过 @AssociationInverseSide 和 @IndexingDependency.derivedFrom 配置重新索引,可以实现这种过滤。
Such filtering can be achieved by applying @IndexedEmbedded to a transient getter representing the filtered association, and configuring reindexing with @AssociationInverseSide and @IndexingDependency.derivedFrom.
示例 47. 通过瞬态 getter、 @AssociationInverseSide 和 @IndexingDependency.derivedFrom 过滤 @IndexedEmbedded 关联
Example 47. Filtering an @IndexedEmbedded association with a transient getter, @AssociationInverseSide and @IndexingDependency.derivedFrom
@Entity
@Indexed
public class Book {
@Id
private Integer id;
@FullTextField(analyzer = "english")
private String title;
@OneToMany(mappedBy = "book")
@OrderBy("id asc")
private List<BookEdition> editions = new ArrayList<>(); (1)
public Book() {
}
// Getters and setters
// ...
@Transient (2)
@IndexedEmbedded (3)
@AssociationInverseSide(inversePath = @ObjectPath({ (4)
@PropertyValue(propertyName = "book")
}))
@IndexingDependency(derivedFrom = @ObjectPath({ (5)
@PropertyValue(propertyName = "editions"),
@PropertyValue(propertyName = "status")
}))
public List<BookEdition> getEditionsNotRetired() {
return editions.stream()
.filter( e -> e.getStatus() != BookEdition.Status.RETIRED )
.collect( Collectors.toList() );
}
}
@Entity
public class BookEdition {
public enum Status {
PUBLISHING,
RETIRED
}
@Id
private Integer id;
@ManyToOne
private Book book;
@FullTextField(analyzer = "english")
private String label;
private Status status; (6)
public BookEdition() {
}
// Getters and setters
// ...
}
10.6.12. Programmatic mapping
同样,也可以通过 programmatic mapping 将关联对象字段内嵌到主对象中。该行为和选项与基于注释的映射相同。
You can embed the fields of an associated object into the main object through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
示例 48. 使用 .indexedEmbedded() 索引关联元素
Example 48. Using .indexedEmbedded() to index associated elements
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
bookMapping.property( "title" )
.fullTextField().analyzer( "english" );
bookMapping.property( "authors" )
.indexedEmbedded();
TypeMappingStep authorMapping = mapping.type( Author.class );
authorMapping.property( "name" )
.fullTextField().analyzer( "name" );
10.7. Mapping container types with container extractors
10.7.1. Basics
应用于属性的大多数内置注解在应用于容器类型时都能透明地工作:
Most built-in annotations applied to properties will work transparently when applied to container types:
- @GenericField applied to a property of type String will index the property value directly.
- @GenericField applied to a property of type OptionalInt will index the optional’s value (an integer).
- @GenericField applied to a property of type List<String> will index the list elements (strings).
- @GenericField applied to a property of type Map<Integer, String> will index the map values (strings).
- @GenericField applied to a property of type Map<Integer, List<String>> will index the list elements in the map values (strings).
- Etc.
同样适用于其他字段注释,如 @FullTextField,特别是 @IndexedEmbedded。而 @VectorField 是此行为的例外,要求 explicit instructions 从容器中提取值。
The same goes for other field annotations such as @FullTextField, and for @IndexedEmbedded in particular. @VectorField is an exception to this behavior: it requires explicit instructions to extract values from a container.
幕后发生的情况是,Hibernate Search 会检查属性类型并尝试应用“容器提取器”,选择第一个能正常使用的提取器。
What happens behind the scenes is that Hibernate Search will inspect the property type and attempt to apply "container extractors", picking the first that works.
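The default behavior listed above can be approximated by a recursive "unwrap" function. This is a rough plain-Java model, with one important simplification: the hypothetical ContainerExtractionSketch inspects runtime values, whereas Hibernate Search is described above as inspecting the property type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.OptionalInt;

// Rough model of default container extraction (illustration only).
final class ContainerExtractionSketch {

    // Recursively unwraps containers until only plain values remain.
    static List<Object> extract(Object value) {
        List<Object> result = new ArrayList<>();
        collect(value, result);
        return result;
    }

    private static void collect(Object value, List<Object> out) {
        if (value instanceof Map) {
            // By default, map values are extracted, not map keys.
            for (Object mapValue : ((Map<?, ?>) value).values()) {
                collect(mapValue, out);
            }
        } else if (value instanceof Iterable) {
            for (Object element : (Iterable<?>) value) {
                collect(element, out);
            }
        } else if (value instanceof Optional) {
            ((Optional<?>) value).ifPresent(v -> collect(v, out));
        } else if (value instanceof OptionalInt) {
            OptionalInt optional = (OptionalInt) value;
            if (optional.isPresent()) {
                out.add(optional.getAsInt());
            }
        } else if (value != null) {
            out.add(value); // not a container: use the value directly
        }
    }
}
```

A Map<Integer, List<String>> is unwrapped in two steps (map values, then list elements), which mirrors the last case in the list above.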
10.7.2. Explicit container extraction
在某些情况下,您会希望明确选择要使用的容器提取器。当必须为地图的键而不是值编制索引时就会出现这种情况。相关注解提供一个 extraction 属性来对此进行配置,如下面的示例中所示。
In some cases, you will want to pick the container extractors to use explicitly. This is the case when a map’s keys must be indexed, instead of the values. Relevant annotations offer an extraction attribute to configure this, as shown in the example below.
示例 49. 使用显式容器提取器定义将 Map 键映射到索引字段
Example 49. Mapping Map keys to an index field using explicit container extractor definition
@ElementCollection // This annotation and the four below it are Hibernate ORM (JPA) mapping only.
@JoinTable(name = "book_pricebyformat")
@MapKeyColumn(name = "format")
@Column(name = "price")
@OrderBy("format asc")
@GenericField( // Declare an index field based on the priceByFormat property.
        name = "availableFormats",
        // Index the map keys (the formats) instead of the map values (the prices).
        extraction = @ContainerExtraction(BuiltinContainerExtractors.MAP_KEY)
)
private Map<BookFormat, BigDecimal> priceByFormat = new LinkedHashMap<>();
可以实现和使用自定义容器提取器,但在当下,Hibernate Search 不会检测到此类容器中数据更改必须触发其包含元素的重新索引。因此,相应属性必须 disable reindexing on change 。
It is possible to implement and use custom container extractors, but at the moment Hibernate Search will not detect that changes to the data inside such a container must trigger the reindexing of the containing element. Hence, the corresponding property must disable reindexing on change.
有关更多信息,请参阅 HSEARCH-3688 。
See HSEARCH-3688 for more information.
10.7.3. Disabling container extraction
在某些罕见的情况下,不需要容器提取,而 @GenericField/@IndexedEmbedded 的目的是直接应用于 List/Optional/等。如需忽略默认的容器提取器,大多数注解都提供了一个 extraction 属性。如下所示进行设置以完全禁用提取:
In some rare cases, container extraction is not wanted, and the @GenericField/@IndexedEmbedded is meant to be applied to the List/Optional/etc. directly. To ignore the default container extractors, most annotations offer an extraction attribute. Set it as below to disable extraction altogether:
示例 50. 禁用容器提取
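Example 50. Disabling container extraction
@ManyToMany
@GenericField( // Declare an index field based on the authors property.
        name = "authorCount",
        // This bridge is applied to the whole list, because extraction is disabled below.
        valueBridge = @ValueBridgeRef(type = MyCollectionSizeBridge.class),
        // Disable container extraction, so that the field is bound to the List itself.
        extraction = @ContainerExtraction(extract = ContainerExtract.NO)
)
private List<Author> authors = new ArrayList<>();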
10.7.4. Programmatic mapping
也可以通过 programmatic mapping 在定义 fields或 indexed-embeddeds 时明确选择要使用的容器抽取器。该行为和选项与基于注释的映射相同。
You can pick the container extractors to use explicitly when defining fields or indexed-embeddeds through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
示例 51. 使用 .extractor(…) / .extractors(…) 通过显式容器提取器定义将 Map 键映射到索引字段
Example 51. Mapping Map keys to an index field using .extractor(…)/.extractors(…) for explicit container extractor definition
bookMapping.property( "priceByFormat" )
.genericField( "availableFormats" )
.extractor( BuiltinContainerExtractors.MAP_KEY );
同样,您可以禁用容器提取。
Similarly, you can disable container extraction.
示例 52. 通过 .noExtractors() 禁用容器提取
Example 52. Disabling container extraction with .noExtractors()
bookMapping.property( "authors" )
.genericField( "authorCount" )
.valueBridge( new MyCollectionSizeBridge() )
.noExtractors();
10.8. Mapping geo-point types
10.8.1. Basics
Hibernate Search 提供多种空间要素,例如 a distance predicate 和 a distance sort 。这些要素需要索引空间坐标。更准确地说,它需要地理坐标系中的地理点,即纬度和经度进行索引。
Hibernate Search provides a variety of spatial features such as a distance predicate and a distance sort. These features require that spatial coordinates are indexed. More precisely, they require that a geo-point, i.e. a latitude and longitude in the geographic coordinate system, is indexed.
地理点有些例外,因为标准 Java 库中没有要表示它们的类型。因此,Hibernate 搜索定义了自己的界面 org.hibernate.search.engine.spatial.GeoPoint。由于您的模型可能会使用不同的类型来表示地理点,因此映射地理点需要一些额外的步骤。
Geo-points are a bit of an exception, because there isn’t any type in the standard Java library to represent them. For that reason, Hibernate Search defines its own interface, org.hibernate.search.engine.spatial.GeoPoint. Since your model probably uses a different type to represent geo-points, mapping geo-points requires some extra steps.
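提供了两个选项,分别在以下两节中说明:使用 @GenericField 和 GeoPoint 接口,或使用 @GeoPointBinding、@Latitude 和 @Longitude。
Two options are available, each described in one of the next two sections: using @GenericField and the GeoPoint interface, or using @GeoPointBinding, @Latitude and @Longitude.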
10.8.2. Using @GenericField and the GeoPoint interface
当实体模型中的地理点通过专用且不可变的类型表示时,您可以简单地让该类型实现 GeoPoint 接口,并使用简单 property/field mapping 及 @GenericField :
When geo-points are represented in your entity model by a dedicated, immutable type, you can simply make that type implement the GeoPoint interface, and use simple property/field mapping with @GenericField:
Example 53. Mapping spatial coordinates by implementing GeoPoint and using @GenericField
@Embeddable
public class MyCoordinates implements GeoPoint { (1)
@Basic
private Double latitude;
@Basic
private Double longitude;
protected MyCoordinates() {
// For Hibernate ORM
}
public MyCoordinates(double latitude, double longitude) {
this.latitude = latitude;
this.longitude = longitude;
}
@Override
public double latitude() { (2)
return latitude;
}
@Override
public double longitude() {
return longitude;
}
}
@Entity
@Indexed
public class Author {
@Id
@GeneratedValue
private Integer id;
private String name;
@Embedded
@GenericField (3)
private MyCoordinates placeOfBirth;
public Author() {
}
// Getters and setters
// ...
}
地理点类型必须是不可变的,即该实例的纬度和经度永远不会改变。
The geo-point type must be immutable, i.e. the latitude and longitude of a given instance may never change.
这是 @GenericField 的核心假设,一般来说,所有 @*Field 注释也都是如此:将忽略坐标变化,并且不会像预期的那样触发重新索引。
This is a core assumption of @GenericField and generally all @*Field annotations: changes to the coordinates will be ignored and will not trigger reindexing as one would expect.
如果容纳您坐标的类型是可变的,请不要使用 @GenericField ,而改用 Using @GeoPointBinding, @Latitude and @Longitude 。
If the type holding your coordinates is mutable, do not use @GenericField and refer to Using @GeoPointBinding, @Latitude and @Longitude instead.
如果您的地理点类型不可变,但实现 GeoPoint 接口不是一个选项,您还可以使用一个自定义 value bridge,将自定义地理点类型和 GeoPoint 相互转换。GeoPoint 提供了快速构建 GeoPoint 实例的静态方法。
If your geo-point type is immutable, but implementing the GeoPoint interface is not an option, you can also use a custom value bridge converting between the custom geo-point type and GeoPoint. GeoPoint offers static methods to quickly build a GeoPoint instance.
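The immutability requirement can be illustrated without any Hibernate Search dependency. In the sketch below, GeoPointLike is a hypothetical stand-in for org.hibernate.search.engine.spatial.GeoPoint (only its two accessors are mirrored), and ImmutableCoordinates is a hypothetical type; it is a plain-Java illustration of the contract, not an ORM-ready embeddable.

```java
// Hypothetical stand-in for GeoPoint, so that the sketch compiles on its own.
interface GeoPointLike {
    double latitude();

    double longitude();
}

// Immutable coordinates: final fields, no setters.
final class ImmutableCoordinates implements GeoPointLike {

    private final double latitude;
    private final double longitude;

    ImmutableCoordinates(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    @Override
    public double latitude() {
        return latitude;
    }

    @Override
    public double longitude() {
        return longitude;
    }

    // "Changing" a coordinate creates a new instance; the original is untouched.
    ImmutableCoordinates withLatitude(double newLatitude) {
        return new ImmutableCoordinates(newLatitude, longitude);
    }
}
```

With such a type, the only way to change an entity's coordinates is to assign a new instance to the property, which is a change of the property value itself rather than a mutation hidden inside it.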
10.8.3. Using @GeoPointBinding, @Latitude and @Longitude
在坐标存储在可变对象中的情况下,解决方案是 @GeoPointBinding 注解。结合 @Latitude 和 @Longitude 注解,它可以映射声明纬度和经度为 double 类型的任何类型的坐标:
For cases where coordinates are stored in a mutable object, the solution is the @GeoPointBinding annotation. Combined with the @Latitude and @Longitude annotations, it can map the coordinates of any type that declares a latitude and longitude of type double:
Example 54. Mapping spatial coordinates using @GeoPointBinding
@Entity
@Indexed
@GeoPointBinding(fieldName = "placeOfBirth") (1)
public class Author {
@Id
@GeneratedValue
private Integer id;
private String name;
@Latitude (2)
private Double placeOfBirthLatitude;
@Longitude (3)
private Double placeOfBirthLongitude;
public Author() {
}
// Getters and setters
// ...
}
@GeoPointBinding 注解也可能应用于属性,在这种情况下,@Latitude 和 @Longitude 必须应用于属性类型的属性:
The @GeoPointBinding annotation may also be applied to a property, in which case the @Latitude and @Longitude must be applied to properties of the property’s type:
Example 55. Mapping spatial coordinates using @GeoPointBinding on a property
@Embeddable
public class MyCoordinates { (1)
@Basic
@Latitude (2)
private Double latitude;
@Basic
@Longitude (3)
private Double longitude;
protected MyCoordinates() {
// For Hibernate ORM
}
public MyCoordinates(double latitude, double longitude) {
this.latitude = latitude;
this.longitude = longitude;
}
public double getLatitude() {
return latitude;
}
public void setLatitude(Double latitude) { (4)
this.latitude = latitude;
}
public double getLongitude() {
return longitude;
}
public void setLongitude(Double longitude) {
this.longitude = longitude;
}
}
@Entity
@Indexed
public class Author {
@Id
@GeneratedValue
private Integer id;
@FullTextField(analyzer = "name")
private String name;
@Embedded
@GeoPointBinding (5)
private MyCoordinates placeOfBirth;
public Author() {
}
// Getters and setters
// ...
}
可以通过多次应用注解并将 markerSet 属性设置为唯一值来处理多组坐标:
It is possible to handle multiple sets of coordinates by applying the annotations multiple times and setting the markerSet attribute to a unique value:
Example 56. Mapping multiple sets of spatial coordinates using @GeoPointBinding
@Entity
@Indexed
@GeoPointBinding(fieldName = "placeOfBirth", markerSet = "birth") (1)
@GeoPointBinding(fieldName = "placeOfDeath", markerSet = "death") (2)
public class Author {
@Id
@GeneratedValue
private Integer id;
@FullTextField(analyzer = "name")
private String name;
@Latitude(markerSet = "birth") (3)
private Double placeOfBirthLatitude;
@Longitude(markerSet = "birth") (4)
private Double placeOfBirthLongitude;
@Latitude(markerSet = "death") (5)
private Double placeOfDeathLatitude;
@Longitude(markerSet = "death") (6)
private Double placeOfDeathLongitude;
public Author() {
}
// Getters and setters
// ...
}
10.8.4. Programmatic mapping
同样,也可以通过 programmatic mapping 映射地理点字段。该行为和选项与基于注释的映射相同。
You can map geo-point fields through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
Example 57. Mapping spatial coordinates by implementing GeoPoint and using .genericField()
TypeMappingStep authorMapping = mapping.type( Author.class );
authorMapping.indexed();
authorMapping.property( "placeOfBirth" )
.genericField();
Example 58. Mapping spatial coordinates using GeoPointBinder
TypeMappingStep authorMapping = mapping.type( Author.class );
authorMapping.indexed();
authorMapping.binder( GeoPointBinder.create().fieldName( "placeOfBirth" ) );
authorMapping.property( "placeOfBirthLatitude" )
.marker( GeoPointBinder.latitude() );
authorMapping.property( "placeOfBirthLongitude" )
.marker( GeoPointBinder.longitude() );
10.9. Mapping multiple alternatives
10.9.1. Basics
在某些情况下,根据另一个属性的值,必须以不同的方式对特定属性进行索引。
In some situations, it is necessary for a particular property to be indexed differently depending on the value of another property.
例如,可能有一个实体,其文本属性的内容根据另一个属性的值(例如 language)以不同的语言显示。在这种情况下,您可能希望根据语言对文本进行不同的分析。
For example there may be an entity that has text properties whose content is in a different language depending on the value of another property, say language. In that case, you probably want to analyze the text differently depending on the language.
虽然肯定可以用自定义 type bridge 解决此问题,但便捷的解决方案是使用 AlternativeBinder。此绑定器以这种方式解决此问题:
While this could definitely be solved with a custom type bridge, a convenient solution to that problem is to use the AlternativeBinder. This binder solves the problem this way:
- At bootstrap, declare one index field per language, assigning a different analyzer to each field.
- At runtime, put the content of the text property in a different field based on the language.
为了使用此绑定器,您需要:
In order to use this binder, you will need to:
- Annotate a property with @AlternativeDiscriminator (e.g. the language property).
- Implement an AlternativeBinderDelegate that will declare the index fields (e.g. one field per language) and create an AlternativeValueBridge. This bridge is responsible for passing the property value to the relevant field at runtime.
- Apply the AlternativeBinder to the type hosting the properties (e.g. the type declaring the language property and the multi-language text properties). Generally you will want to create your own annotation for that.
下面是使用该绑定器的示例。
Below is an example of how to use the binder.
Example 59. Mapping a property to a different index field based on a language property using AlternativeBinder
public enum Language { (1)
ENGLISH( "en" ),
FRENCH( "fr" ),
GERMAN( "de" );
public final String code;
Language(String code) {
this.code = code;
}
}
@Entity
@Indexed
public class BlogEntry {
@Id
private Integer id;
@AlternativeDiscriminator (1)
@Enumerated(EnumType.STRING)
private Language language;
@MultiLanguageField (2)
private String text;
// Getters and setters
// ...
}
@Retention(RetentionPolicy.RUNTIME) (1)
@Target({ ElementType.METHOD, ElementType.FIELD }) (2)
@PropertyMapping(processor = @PropertyMappingAnnotationProcessorRef( (3)
type = MultiLanguageField.Processor.class
))
@Documented (4)
public @interface MultiLanguageField {
String name() default ""; (5)
class Processor (6)
implements PropertyMappingAnnotationProcessor<MultiLanguageField> { (7)
@Override
public void process(PropertyMappingStep mapping, MultiLanguageField annotation,
PropertyMappingAnnotationProcessorContext context) {
LanguageAlternativeBinderDelegate delegate = new LanguageAlternativeBinderDelegate( (8)
annotation.name().isEmpty() ? null : annotation.name()
);
mapping.hostingType() (9)
.binder( AlternativeBinder.create( (10)
Language.class, (11)
context.annotatedElement().name(), (12)
String.class, (13)
BeanReference.ofInstance( delegate ) (14)
) );
}
}
}
public class LanguageAlternativeBinderDelegate implements AlternativeBinderDelegate<Language, String> { (1)
private final String name;
public LanguageAlternativeBinderDelegate(String name) { (2)
this.name = name;
}
@Override
public AlternativeValueBridge<Language, String> bind(IndexSchemaElement indexSchemaElement, (3)
PojoModelProperty fieldValueSource) {
EnumMap<Language, IndexFieldReference<String>> fields = new EnumMap<>( Language.class );
String fieldNamePrefix = ( name != null ? name : fieldValueSource.name() ) + "_";
for ( Language language : Language.values() ) { (4)
String languageCode = language.code;
IndexFieldReference<String> field = indexSchemaElement.field(
fieldNamePrefix + languageCode, (5)
f -> f.asString().analyzer( "text_" + languageCode ) (6)
)
.toReference();
fields.put( language, field );
}
return new Bridge( fields ); (7)
}
private static class Bridge (8)
implements AlternativeValueBridge<Language, String> { (9)
private final EnumMap<Language, IndexFieldReference<String>> fields;
private Bridge(EnumMap<Language, IndexFieldReference<String>> fields) {
this.fields = fields;
}
@Override
public void write(DocumentElement target, Language discriminator, String bridgedElement) {
target.addValue( fields.get( discriminator ), bridgedElement ); (10)
}
}
}
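The two phases of the example above (declare one field per language at bootstrap, then route each value at runtime) can be mimicked with plain Java maps. This is a sketch under stated assumptions: AlternativeRoutingSketch is a hypothetical class, field names stand in for index field references, and a Map stands in for the indexed document; no Hibernate Search API is involved.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy of the per-language routing (illustration only).
final class AlternativeRoutingSketch {

    enum Language {
        ENGLISH("en"), FRENCH("fr"), GERMAN("de");

        final String code;

        Language(String code) {
            this.code = code;
        }
    }

    // "Bootstrap": one field name per language, e.g. text_en, text_fr, text_de.
    private final EnumMap<Language, String> fieldNames = new EnumMap<>(Language.class);

    AlternativeRoutingSketch(String fieldNamePrefix) {
        for (Language language : Language.values()) {
            fieldNames.put(language, fieldNamePrefix + "_" + language.code);
        }
    }

    // "Runtime": write the value to the field selected by the discriminator.
    void write(Map<String, List<String>> document, Language discriminator, String value) {
        document.computeIfAbsent(fieldNames.get(discriminator), key -> new ArrayList<>())
                .add(value);
    }
}
```

Writing "Bonjour" with the FRENCH discriminator only populates text_fr; the other per-language fields stay absent from that document.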
10.9.2. Programmatic mapping
同样,也可以通过 programmatic mapping 应用 AlternativeBinder。该行为和选项与基于注释的映射相同。
You can apply AlternativeBinder through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
Example 60. Applying an AlternativeBinder with .binder(…)
TypeMappingStep blogEntryMapping = mapping.type( BlogEntry.class );
blogEntryMapping.indexed();
blogEntryMapping.property( "language" )
        .marker( AlternativeBinder.alternativeDiscriminator() );
LanguageAlternativeBinderDelegate delegate = new LanguageAlternativeBinderDelegate( null );
blogEntryMapping.binder( AlternativeBinder.create( Language.class,
        "text", String.class, BeanReference.ofInstance( delegate ) ) );
10.10. Tuning when to trigger reindexing
10.10.1. Basics
当实体属性映射到索引(通过 @GenericField、@IndexedEmbedded 或 custom bridge)时,此映射会引入依赖关系:在属性更改时需要更新文档。
When an entity property is mapped to the index, be it through @GenericField, @IndexedEmbedded, or a custom bridge, this mapping introduces a dependency: the document will need to be updated when the property changes.
对于更简单的单实体映射,这只意味着 Hibernate Search 需要检测实体何时更改并重新索引实体。这将以透明的方式处理。
For simpler, single-entity mappings, this only means that Hibernate Search will need to detect when an entity changes and reindex the entity. This will be handled transparently.
如果映射包含一个“派生”属性,即一个未直接持久化的属性,而是动态地在使用其他属性作为输入的 getter 中计算的,那么 Hibernate Search 将无法猜测这些属性所基于的持久状态的哪一部分。在这种情况下,将需要一些显式配置;有关详细信息,请参阅 Reindexing when a derived value changes with @IndexingDependency 。
If the mapping includes a "derived" property, i.e. a property that is not persisted directly, but instead is dynamically computed in a getter that uses other properties as input, Hibernate Search will be unable to guess which part of the persistent state these properties are based on. In this case, some explicit configuration will be required; see Reindexing when a derived value changes with @IndexingDependency for more information.
当映射跨越实体边界时,事情会变得更加复杂。让我们考虑一个 Book 实体映射到一个文档的映射,该文档必须包含 Author 实体的 name 属性(例如,使用 @IndexedEmbedded )。每当作者姓名更改时,Hibernate Search 都需要检索该作者的所有书籍,以便对它们重新编制索引。
When the mapping crosses the entity boundaries, things get more complicated. Let’s consider a mapping where a Book entity is mapped to a document, and that document must include the name property of the Author entity (for example using @IndexedEmbedded). Whenever an author’s name changes, Hibernate Search will need to retrieve all the books of that author, to reindex them.
实际上,这意味着每当实体映射依赖于与另一个实体的关联时,此关联必须是双向的:如果 Book.authors 是 @IndexedEmbedded,则 Hibernate Search 必须知道反向关联 Author.books。如果无法解析反向关联,则将在启动时抛出异常。
In practice, this means that whenever an entity mapping relies on an association to another entity, this association must be bidirectional: if Book.authors is @IndexedEmbedded, Hibernate Search must be aware of an inverse association Author.books. An exception will be thrown on startup if the inverse association cannot be resolved.
多数时候,当使用 Hibernate ORM integration 时,Hibernate Search 能够利用 Hibernate ORM 元数据(@OneToOne 和 @OneToMany 的 mappedBy 属性)解析关联的逆向端,因此这一切都可以透明处理。
Most of the time, when the Hibernate ORM integration is used, Hibernate Search is able to take advantage of Hibernate ORM metadata (the mappedBy attribute of @OneToOne and @OneToMany) to resolve the inverse side of an association, so this is all handled transparently.
在某些罕见的情况下,对于更复杂的映射,即使 Hibernate ORM 也可能不知道关联是双向的,这是因为 mappedBy 无法使用,或因为正在使用 Standalone POJO Mapper。有一些解决方案:
In some rare cases, with the more complex mappings, it is possible that even Hibernate ORM is not aware that an association is bidirectional, because mappedBy cannot be used, or because the Standalone POJO Mapper is being used. A few solutions exist:
- The association can simply be ignored. This means the index will be out of date whenever associated entities change, but this can be an acceptable solution if the index is rebuilt periodically. See Limiting reindexing of containing entities with @IndexingDependency for more information.
- If the association is actually bidirectional, its inverse side can be specified to Hibernate Search explicitly using @AssociationInverseSide. See Enriching the entity model with @AssociationInverseSide for more information.
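To see why the inverse side matters, here is a minimal plain-Java sketch. It does not use any Hibernate Search API: the classes Author, Book and ReindexingSketch are hypothetical names made up for illustration. The point is only that finding the documents affected by an author rename requires navigating from the author back to its books, which is possible only if the inverse association exists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are illustrative and not part of Hibernate Search.
class Author {
    String name;
    // Inverse side of Book.authors; without it there is no path from an author to its books.
    final List<Book> books = new ArrayList<>();

    Author(String name) {
        this.name = name;
    }
}

class Book {
    final String title;
    final List<Author> authors = new ArrayList<>();

    Book(String title) {
        this.title = title;
    }

    // Keeps both sides of the bidirectional association in sync.
    void addAuthor(Author author) {
        authors.add( author );
        author.books.add( this );
    }
}

class ReindexingSketch {
    // When an author's name changes, every book embedding that name must be reindexed.
    static List<Book> booksToReindex(Author changedAuthor) {
        return new ArrayList<>( changedAuthor.books );
    }
}
```

If Author had no books collection, booksToReindex could not be written without scanning every book, which is roughly the situation Hibernate Search refuses to accept silently at startup.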
10.10.2. Enriching the entity model with @AssociationInverseSide
针对来自实体类型 A 到实体类型 B 的关联,@AssociationInverseSide 定义关联的逆向,即从 B 到 A 的路径。
Given an association from an entity type A to entity type B, @AssociationInverseSide defines the inverse side of an association, i.e. the path from B to A.
当使用 Standalone POJO Mapper 或 Hibernate ORM integration 且 Hibernate ORM 中没有将双向关联映射为这样的关联(没有 mappedBy)时,此功能尤其实用。
This is mostly useful when using the Standalone POJO Mapper or when using the Hibernate ORM integration and a bidirectional association is not mapped as such in Hibernate ORM (no mappedBy).
. Example 61. Mapping the inverse side of an association with @AssociationInverseSide
@Entity
@Indexed
public class Book {
@Id
@GeneratedValue
private Integer id;
private String title;
@ElementCollection (1)
@JoinTable(
name = "book_editionbyprice",
joinColumns = @JoinColumn(name = "book_id")
)
@MapKeyJoinColumn(name = "edition_id")
@Column(name = "price")
@OrderBy("edition_id asc")
@IndexedEmbedded( (2)
name = "editionsForSale",
extraction = @ContainerExtraction(BuiltinContainerExtractors.MAP_KEY)
)
@AssociationInverseSide( (3)
extraction = @ContainerExtraction(BuiltinContainerExtractors.MAP_KEY),
inversePath = @ObjectPath(@PropertyValue(propertyName = "book"))
)
private Map<BookEdition, BigDecimal> priceByEdition = new LinkedHashMap<>();
public Book() {
}
// Getters and setters
// ...
}
@Entity
public class BookEdition {
@Id
@GeneratedValue
private Integer id;
@ManyToOne (4)
private Book book;
@FullTextField(analyzer = "english")
private String label;
public BookEdition() {
}
// Getters and setters
// ...
}
10.10.3. Reindexing when a derived value changes with @IndexingDependency
当属性不是直接持久化,而是在使用其他属性作为输入的 getter 中动态计算时,Hibernate Search 将无法猜测这些属性基于持久状态的哪一部分,从而无法在相关持久状态更改时触发重新索引。默认情况下,Hibernate Search 将在启动时检测此类情况并抛出异常。
When a property is not persisted directly, but instead is dynamically computed in a getter that uses other properties as input, Hibernate Search will be unable to guess which part of the persistent state these properties are based on, and thus will be unable to trigger reindexing when the relevant persistent state changes. By default, Hibernate Search will detect such cases on bootstrap and throw an exception.
使用 @IndexingDependency(derivedFrom = …) 为属性添加注释将为 Hibernate Search 提供所需的信息,使其能够触发重新索引。
Annotating the property with @IndexingDependency(derivedFrom = …) will give Hibernate Search the information it needs and allow triggering reindexing.
. Example 62. Mapping a derived value with @IndexingDependency.derivedFrom
@Entity
@Indexed
public class Book {
@Id
@GeneratedValue
private Integer id;
private String title;
@ElementCollection
private List<String> authors = new ArrayList<>(); (1)
public Book() {
}
// Getters and setters
// ...
@Transient (2)
@FullTextField(analyzer = "name") (3)
@IndexingDependency(derivedFrom = @ObjectPath({ (4)
@PropertyValue(propertyName = "authors")
}))
public String getMainAuthor() {
return authors.isEmpty() ? null : authors.get( 0 );
}
}
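The bookkeeping that derivedFrom enables can be pictured with the following plain-Java sketch. It is a toy model under stated assumptions, not the Hibernate Search implementation: DerivedDependencySketch and its registry are hypothetical, and the registry simply records that the indexed property mainAuthor is computed from the persisted property authors.

```java
import java.util.Map;
import java.util.Set;

// Toy model only: illustrates the kind of lookup that derivedFrom makes possible.
class DerivedDependencySketch {
    // Indexed, derived property -> persisted properties it is computed from (hypothetical registry).
    static final Map<String, Set<String>> DERIVED_FROM = Map.of(
            "mainAuthor", Set.of( "authors" )
    );

    // Returns true if a change to the given persisted property affects an indexed, derived property.
    static boolean requiresReindexing(String changedProperty) {
        return DERIVED_FROM.values().stream()
                .anyMatch( sources -> sources.contains( changedProperty ) );
    }
}
```

In this sketch a change to authors is reported as requiring reindexing, while a change to title (which is not mapped to the index in the example above) is not.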
10.10.4. Limiting reindexing of containing entities with @IndexingDependency
在某些情况下,每次给定属性更改时都触发实体的重新索引是不切实际的:
In some cases, triggering reindexing of entities every time a given property changes is not realistically achievable:
- When an association is massive, for example when a single entity instance is indexed-embedded in thousands of other entities.
- When a property mapped to the index is updated very frequently, leading to very frequent reindexing and unacceptable load on disks or the database.
- Etc.
当这种情况发生时,可以告诉 Hibernate Search 忽略对特定属性(且在 @IndexedEmbedded 情况下,忽略该属性以外的任何内容)的更新。
When that happens, it is possible to tell Hibernate Search to ignore updates to a particular property (and, in the case of @IndexedEmbedded, anything beyond that property).
有若干选项可用于确切控制对给定属性的更新如何影响重新索引。请参阅以下章节,了解每个选项的说明。
Several options are available to control exactly how updates to a given property affect reindexing. See the sections below for an explanation of each option.
ReindexOnUpdate.SHALLOW: limiting reindexing to same-entity updates only
ReindexOnUpdate.SHALLOW 在关联高度不对称,因此单向时最有用。考虑诸如类别、类型、城市、国家/地区等“参考”数据的关联…
ReindexOnUpdate.SHALLOW is most useful when an association is highly asymmetric and therefore unidirectional. Think associations to "reference" data such as categories, types, cities, countries, …
它实质上告知 Hibernate Search,更改关联会触发更改所发生对象的重新索引(添加或删除关联元素,即“浅层”更新),但更改关联实体的属性(“深层”更新)不会触发重新索引。
It essentially tells Hibernate Search that changing an association — adding or removing associated elements, i.e. "shallow" updates — should trigger reindexing of the object on which the change happened, but changing properties of associated entities — "deep" updates — should not.
例如,考虑以下(不正确的)映射:
For example, let’s consider the (incorrect) mapping below:
. Example 63. A highly-asymmetric, unidirectional association
@Entity
@Indexed
public class Book {
@Id
private Integer id;
private String title;
@ManyToOne (1)
@IndexedEmbedded (2)
private BookCategory category;
public Book() {
}
// Getters and setters
// ...
}
@Entity
public class BookCategory {
@Id
private Integer id;
@FullTextField(analyzer = "english")
private String name;
(3)
// Getters and setters
// ...
}
有了此映射,当类别名称发生更改时,Hibernate Search 将无法重新索引所有书籍:该类别将列出所有书籍的 getter 根本不存在。由于 Hibernate Search 尝试默认保持安全,因此它将拒绝此映射,并会在引导时抛出异常,指出其需要 Book → BookCategory 关联的逆向。
With this mapping, Hibernate Search will not be able to reindex all books when the category name changes: the getter that would list all books for that category simply doesn’t exist. Since Hibernate Search tries to be safe by default, it will reject this mapping and throw an exception at bootstrap, saying it needs an inverse side to the Book → BookCategory association.
然而,在此情况下,我们不期望 BookCategory 的名称发生更改。这确实是“参考”数据,更改频率很低,因此我们完全可以提前计划此类更改,并在发生时重新索引所有书籍。因此,我们确实不介意 Hibernate Search 忽略对 BookCategory 的更改……
However, in this case, we don’t expect the name of a BookCategory to change. That’s really "reference" data, which changes so rarely that we can conceivably plan ahead such change and reindex all books whenever that happens. So we would really not mind if Hibernate Search just ignored changes to BookCategory…
这就是 @IndexingDependency(reindexOnUpdate = ReindexOnUpdate.SHALLOW) 的用处:它告诉 Hibernate Search 忽略对关联实体进行更新的影响。请参阅以下修改后的映射:
That’s what @IndexingDependency(reindexOnUpdate = ReindexOnUpdate.SHALLOW) is for: it tells Hibernate Search to ignore the impact of updates to an associated entity. See the modified mapping below:
. Example 64. Limiting reindexing to same-entity updates with ReindexOnUpdate.SHALLOW
@Entity
@Indexed
public class Book {
@Id
private Integer id;
private String title;
@ManyToOne
@IndexedEmbedded
@IndexingDependency(reindexOnUpdate = ReindexOnUpdate.SHALLOW) (1)
private BookCategory category;
public Book() {
}
// Getters and setters
// ...
}
Hibernate Search 将接受上述映射并成功启动,因为从 Book 到 BookCategory 的关联的逆向不再被认为是必要的。
Hibernate Search will accept the mapping above and boot successfully, since the inverse side of the association from Book to BookCategory is no longer deemed necessary.
只有对书籍类别的“浅层”更改才会触发对该书籍的重新索引:
Only shallow changes to a book’s category will trigger reindexing of that book:
- When a book is assigned a new category (book.setCategory( newCategory )), Hibernate Search will consider it a "shallow" change, since it only affects the Book entity. Thus, Hibernate Search will reindex the book.
- When a category itself changes (category.setName( newName )), Hibernate Search will consider it a "deep" change, since it occurs beyond the boundaries of the Book entity. Thus, Hibernate Search will not reindex books of that category by itself. The index will become slightly out-of-sync, but this can be solved by reindexing Book entities, for example every night.
ReindexOnUpdate.NO: disabling reindexing caused by updates of a particular property
ReindexOnUpdate.NO 最适用于非常频繁更改且不需要在索引中保持最新的属性。
ReindexOnUpdate.NO is most useful for properties that change very frequently and don’t need to be up-to-date in the index.
这实质上告诉 Hibernate Search,对该属性的更改不应触发重新索引。
It essentially tells Hibernate Search that changes to that property should not trigger reindexing.
例如,考虑以下映射:
For example, let’s consider the mapping below:
. Example 65. A frequently-changing property
@Entity
@Indexed
public class Sensor {
@Id
private Integer id;
@FullTextField
private String name; (1)
@KeywordField
private SensorStatus status; (1)
@Column(name = "\"value\"")
private double value; (2)
@GenericField
private double rollingAverage; (3)
public Sensor() {
}
// Getters and setters
// ...
}
名称和状态的更新(更新频率很低)完全能够触发重新索引。但是考虑到有成千上万个传感器,传感器值的更新不可能对重新索引产生合理的影响:每隔几毫秒重新索引数千个传感器可能不会有很好的效果。
Updates to the name and status, which are rarely updated, can perfectly well trigger reindexing. But considering there are thousands of sensors, updates to the sensor value cannot reasonably trigger reindexing: reindexing thousands of sensors every few milliseconds probably won’t perform well.
然而,在此场景中,对传感器值进行搜索并不被认为是关键的,索引无需那么新。当涉及传感器值时,我们能够接受索引落后几分钟。我们可以考虑设置一个每隔几秒钟运行的批处理进程,通过 mass indexer、Jakarta Batch mass indexing job 或显式方式重新索引所有传感器。因此,我们确实不介意 Hibernate Search 忽略对传感器值的更改……
In this scenario, however, search on sensor value is not considered critical and indexes don’t need to be as fresh. We can accept indexes to lag behind a few minutes when it comes to a sensor value. We can consider setting up a batch process that runs every few seconds to reindex all sensors, either through a mass indexer, using the Jakarta Batch mass indexing job, or explicitly. So we would really not mind if Hibernate Search just ignored changes to sensor values…
@IndexingDependency(reindexOnUpdate = ReindexOnUpdate.NO) 正是为此而设计的:它让 Hibernate Search 忽略更新对 rollingAverage 属性的影响。下面是修改过的映射:
That’s what @IndexingDependency(reindexOnUpdate = ReindexOnUpdate.NO) is for: it tells Hibernate Search to ignore the impact of updates to the rollingAverage property. See the modified mapping below:
. Example 66. Disabling listener-triggered reindexing for a particular property with ReindexOnUpdate.NO
@Entity
@Indexed
public class Sensor {
@Id
private Integer id;
@FullTextField
private String name;
@KeywordField
private SensorStatus status;
@Column(name = "\"value\"")
private double value;
@GenericField
@IndexingDependency(reindexOnUpdate = ReindexOnUpdate.NO) (1)
private double rollingAverage;
public Sensor() {
}
// Getters and setters
// ...
}
使用这种映射:
With this mapping:
- When a sensor is assigned a new name (sensor.setName( newName )) or status (sensor.setStatus( newStatus )), Hibernate Search will trigger reindexing of the sensor.
- When a sensor is assigned a new rolling average (sensor.setRollingAverage( newRollingAverage )), Hibernate Search will not trigger reindexing of the sensor.
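The semantics of the options described in this section can be summarized in a small decision function. The sketch below is a toy model in plain Java, not the actual Hibernate Search logic: ReindexPolicySketch, Policy and ChangeKind are hypothetical names, with Policy merely mirroring the DEFAULT, SHALLOW and NO values of ReindexOnUpdate.

```java
// Toy model only: mirrors the documented semantics, not the actual Hibernate Search implementation.
class ReindexPolicySketch {
    enum Policy { DEFAULT, SHALLOW, NO }

    enum ChangeKind {
        SHALLOW_CHANGE, // the property itself changed on the indexed entity, e.g. book.setCategory( ... )
        DEEP_CHANGE     // something beyond the property changed, e.g. category.setName( ... )
    }

    // Decides whether a change reached through a property with the given policy triggers reindexing.
    static boolean triggersReindexing(Policy policy, ChangeKind change) {
        switch ( policy ) {
            case NO:
                return false; // never reindex because of this property
            case SHALLOW:
                return change == ChangeKind.SHALLOW_CHANGE; // ignore changes in associated entities
            default:
                return true; // default behavior: both kinds of changes trigger reindexing
        }
    }
}
```

Read this way, SHALLOW sits between the default behavior and NO: it keeps reindexing for changes on the entity itself and drops it for changes further down the association.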
10.10.5. Programmatic mapping
同样,也可以通过 programmatic mapping 控制重新索引。该行为和选项与基于注释的映射相同。
You can control reindexing through the programmatic mapping too. Behavior and options are identical to annotation-based mapping.
. Example 67. Mapping the inverse side of an association with .associationInverseSide(…)
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
bookMapping.property( "priceByEdition" )
        .indexedEmbedded( "editionsForSale" )
                .extractor( BuiltinContainerExtractors.MAP_KEY )
        .associationInverseSide( PojoModelPath.parse( "book" ) )
                .extractor( BuiltinContainerExtractors.MAP_KEY );
TypeMappingStep bookEditionMapping = mapping.type( BookEdition.class );
bookEditionMapping.property( "label" )
        .fullTextField().analyzer( "english" );
. Example 68. Mapping a derived value with .indexingDependency().derivedFrom(…)
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
bookMapping.property( "mainAuthor" )
        .fullTextField().analyzer( "name" )
        .indexingDependency().derivedFrom( PojoModelPath.parse( "authors" ) );
. Example 69. Limiting triggering reindexing with .indexingDependency().reindexOnUpdate(…)
TypeMappingStep bookMapping = mapping.type( Book.class );
bookMapping.indexed();
bookMapping.property( "category" )
        .indexedEmbedded()
        .indexingDependency().reindexOnUpdate( ReindexOnUpdate.SHALLOW );
TypeMappingStep bookCategoryMapping = mapping.type( BookCategory.class );
bookCategoryMapping.property( "name" )
        .fullTextField().analyzer( "english" );
10.11. Changing the mapping of an existing application
在应用程序的生命周期中,会发生这种情况:特定索引实体类型的映射必须更改。当这种情况发生时,映射更改很可能需要更改索引的结构,即其 schema 。Hibernate Search 不会自动处理此结构更改,因此需要手动干预。
Over the lifetime of an application, it will happen that the mapping of a particular indexed entity type has to change. When this happens, the mapping changes are likely to require changes to the structure of the index, i.e. its schema. Hibernate Search does not handle this structure change automatically, so manual intervention is required.
当需要更改索引结构时,最简单的解决方案是:
The simplest solution when the index structure needs to change is to:
- Drop and re-create the index and its schema, either manually (by deleting the filesystem directory for Lucene, or by using the REST API to delete the index for Elasticsearch) or by using Hibernate Search’s schema management features.
- Re-populate the index, for example using the mass indexer.
从技术上讲,如果映射更改仅包括以下内容,则并非必须删除索引并重新索引:
Technically, dropping the index and reindexing is not strictly required if the mapping changes include only:
添加不会有任何持久化实例的新索引实体,例如在数据库中没有任何行的实体上添加 @Indexed 注释。
adding new indexed entities that will not have any persisted instance, e.g. adding an @Indexed annotation on an entity which has no rows in database.
添加当前所有持久化实体都为空的新字段,例如,在实体类型上添加新属性,并将其映射到字段,但保证此属性最初对于此实体的每个实例都将为 null;
adding new fields that will be empty for all currently persisted entities, e.g. adding a new property on an entity type and mapping it to a field, but with the guarantee that this property will initially be null for every instance of this entity;
和/或从现有索引/字段中删除数据,例如删除索引字段,或删除对字段存储的需求。
and/or removing data from existing indexes/fields, e.g. removing an index field, or removing the need for a field to be stored.
但是,你仍然需要:
However, you will still need to:
创建缺失的索引:这通常可以通过使用 create 、 create-or-validate 或 create-or-update 架构管理策略启动应用程序来自动完成。
create missing indexes: this can generally be done automatically by starting up the application with the create, create-or-validate, or create-or-update schema management strategy.
(仅限 Elasticsearch:)更新现有索引的架构以声明新字段。这将更加复杂:要么使用 Elasticsearch 的 REST API 手动执行,要么使用 create-or-update strategy 启动应用程序,但请注意它 may fail 。
(Elasticsearch only:) update the schema of existing indexes to declare the new fields. This will be more complex: either do it manually using Elasticsearch’s REST API, or start up the application with the create-or-update strategy, but be warned that it may fail.
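The rule above can be restated as a small check. The following plain-Java sketch is only an illustration under stated assumptions: MappingChangeSketch and its Change categories are hypothetical, the caller is expected to classify each mapping change by hand, and the follow-up steps (creating missing indexes, updating the Elasticsearch schema) are not modeled.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only: classifies mapping changes according to the rule stated in the text.
class MappingChangeSketch {
    enum Change {
        NEW_INDEXED_ENTITY_WITHOUT_INSTANCES,  // e.g. @Indexed added on an entity with no rows
        NEW_FIELD_EMPTY_FOR_EXISTING_ENTITIES, // e.g. new property guaranteed to be null initially
        REMOVED_FIELD_OR_STORAGE,              // e.g. index field removed, or no longer stored
        OTHER                                  // any other change to the mapping
    }

    // Changes for which dropping the index and reindexing is not strictly required.
    private static final Set<Change> SAFE = EnumSet.of(
            Change.NEW_INDEXED_ENTITY_WITHOUT_INSTANCES,
            Change.NEW_FIELD_EMPTY_FOR_EXISTING_ENTITIES,
            Change.REMOVED_FIELD_OR_STORAGE
    );

    // Returns true as soon as one change falls outside the safe categories.
    static boolean dropAndReindexStrictlyRequired(Set<Change> changes) {
        return !SAFE.containsAll( changes );
    }
}
```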
10.12. Custom mapping annotations
10.12.1. Basics
默认情况下,Hibernate Search 仅识别内置映射注释,例如 @Indexed、@GenericField 或 @IndexedEmbedded。
By default, Hibernate Search only recognizes built-in mapping annotations such as @Indexed, @GenericField or @IndexedEmbedded.
要在 Hibernate Search 映射中使用自定义注释,则需要两个步骤:
To use custom annotations in a Hibernate Search mapping, two steps are required:
- Implementing a processor for that annotation: TypeMappingAnnotationProcessor for type annotations, PropertyMappingAnnotationProcessor for method/field annotations, ConstructorMappingAnnotationProcessor for constructor annotations, or MethodParameterMappingAnnotationProcessor for constructor parameter annotations.
- Annotating the custom annotation with either @TypeMapping, @PropertyMapping, @ConstructorMapping, or @MethodParameterMapping, passing as an argument the reference to the annotation processor.
完成后,Hibernate Search 将能够检测索引类中的自定义注释(尽管不一定出现在自定义投影类型中,请参见 Custom root mapping annotations )。每当遇到自定义注释时,Hibernate Search 会实例化注释处理器并调用其 process 方法,同时将以下内容作为参数传递:
Once this is done, Hibernate Search will be able to detect custom annotations in indexed classes (though not necessarily in custom projection types, see Custom root mapping annotations). Whenever a custom annotation is encountered, Hibernate Search will instantiate the annotation processor and call its process method, passing the following as arguments:
- A mapping parameter that allows defining the mapping for the type, property, constructor, or constructor parameter using the programmatic mapping API.
- An annotation parameter representing the annotation instance.
- A context object with various helpers.
自定义注释最常被用于应用自定义的参数化绑定器或转换器。你可以在以下部分中找到示例:
Custom annotations are most frequently used to apply custom, parameterized binders or bridges. You can find examples in these sections in particular:
- Passing parameters to a value binder/bridge through a custom annotation
- Passing parameters to a property binder/bridge through a custom annotation
- Passing parameters to a type binder/bridge through a custom annotation
- Passing parameters to an identifier binder/bridge through a custom annotation
- Passing parameters to a projection binder through a custom annotation
完全可以将自定义注释用于无参数的绑定器或桥,甚至用于更复杂的功能,例如索引嵌入: programmatic mapping API 中的每个可用功能都可以由自定义注释触发。
It is completely possible to use custom annotations for parameter-less binders or bridges, or even for more complex features such as indexed-embedded: every feature available in the programmatic mapping API can be triggered by a custom annotation.
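The mechanism described in this section (a meta-annotation pointing to a processor, which is instantiated and handed the mapping, the annotation instance and a context) can be imitated with plain Java reflection. The sketch below is a toy model: every name in it (ToyProcessor, ToyPropertyMapping, ToyText, ToyAnnotationScanner, …) is hypothetical, the "mapping" is just a list of strings, and the "context" is the reflected field. It does not use the Hibernate Search API.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name below is hypothetical and merely imitates the mechanism.
interface ToyProcessor<A extends Annotation> {
    void process(List<String> mapping, A annotation, Field context);
}

// Meta-annotation referencing the processor; plays the role of @PropertyMapping.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.ANNOTATION_TYPE)
@interface ToyPropertyMapping {
    Class<? extends ToyProcessor<?>> processor();
}

// The custom annotation itself, carrying one parameter.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@ToyPropertyMapping(processor = ToyTextProcessor.class)
@interface ToyText {
    String analyzer();
}

// The processor translates the annotation into a mapping definition.
class ToyTextProcessor implements ToyProcessor<ToyText> {
    @Override
    public void process(List<String> mapping, ToyText annotation, Field context) {
        mapping.add( context.getName() + ": fullTextField, analyzer=" + annotation.analyzer() );
    }
}

class ToyBook {
    @ToyText(analyzer = "english")
    String title;
    String isbn; // not annotated, hence not mapped
}

class ToyAnnotationScanner {
    // Finds custom annotations on fields, instantiates their processor and calls process(...).
    @SuppressWarnings({ "unchecked", "rawtypes" })
    static List<String> scan(Class<?> type) {
        List<String> mapping = new ArrayList<>();
        try {
            for ( Field field : type.getDeclaredFields() ) {
                for ( Annotation annotation : field.getAnnotations() ) {
                    ToyPropertyMapping meta =
                            annotation.annotationType().getAnnotation( ToyPropertyMapping.class );
                    if ( meta == null ) {
                        continue; // not a mapping annotation
                    }
                    ToyProcessor processor = meta.processor().getDeclaredConstructor().newInstance();
                    processor.process( mapping, annotation, field );
                }
            }
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException( "Could not instantiate the processor", e );
        }
        return mapping;
    }
}
```

Calling ToyAnnotationScanner.scan( ToyBook.class ) yields a single mapping entry for the title field; isbn is skipped because it carries no annotation meta-annotated with ToyPropertyMapping.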
10.12.2. Custom root mapping annotations
若要让 Hibernate Search 将自定义注释视为 root mapping annotation,请将 @RootMapping 元注释添加到自定义注释中。
To have Hibernate Search consider a custom annotation as a root mapping annotation, add the @RootMapping meta-annotation to the custom annotation.
这将确保 Hibernate Search 处理带有该自定义注释的类型上的注释,即使这些类型没有在索引映射中被引用;这主要对与 projection mapping 相关的自定义注释有用。
This will ensure Hibernate Search processes annotations on types annotated with the custom annotation even if those types are not referenced in the index mapping, which is mainly useful for custom annotations related to projection mapping.
10.13. Inspecting the mapping
在 Hibernate Search 成功启动后,可以使用 SearchMapping 获取已索引实体列表并更直接地访问相应的索引,如下面的示例所示。
After Hibernate Search has successfully booted, the SearchMapping can be used to get a list of indexed entities and get more direct access to the corresponding indexes, as shown in the example below.
. Example 70. Accessing indexed entities
SearchMapping mapping = /* ... */ (1)
SearchIndexedEntity<Book> bookEntity = mapping.indexedEntity( Book.class ); (2)
String jpaName = bookEntity.jpaName(); (3)
IndexManager indexManager = bookEntity.indexManager(); (4)
Backend backend = indexManager.backend(); (5)
SearchIndexedEntity<?> bookEntity2 = mapping.indexedEntity( "Book" ); (6)
Class<?> javaClass = bookEntity2.javaClass();
for ( SearchIndexedEntity<?> entity : mapping.allIndexedEntities() ) { (7)
// ...
}
然后,您可以从 IndexManager 访问索引元模型,以检查可用字段及其主要特性,如下所示。
From an IndexManager, you can then access the index metamodel, to inspect available fields and their main characteristics, as shown below.
. Example 71. Accessing the index metamodel
SearchIndexedEntity<Book> bookEntity = mapping.indexedEntity( Book.class ); (1)
IndexManager indexManager = bookEntity.indexManager(); (2)
IndexDescriptor indexDescriptor = indexManager.descriptor(); (3)
indexDescriptor.field( "releaseDate" ).ifPresent( field -> { (4)
String path = field.absolutePath(); (5)
String relativeName = field.relativeName();
// Etc.
if ( field.isValueField() ) { (6)
IndexValueFieldDescriptor valueField = field.toValueField(); (7)
IndexValueFieldTypeDescriptor type = valueField.type(); (8)
boolean projectable = type.projectable();
Class<?> dslArgumentClass = type.dslArgumentClass();
Class<?> projectedValueClass = type.projectedValueClass();
Optional<String> analyzerName = type.analyzerName();
Optional<String> searchAnalyzerName = type.searchAnalyzerName();
Optional<String> normalizerName = type.normalizerName();
// Etc.
Set<String> traits = type.traits(); (9)
if ( traits.contains( IndexFieldTraits.Aggregations.RANGE ) ) {
// ...
}
}
else if ( field.isObjectField() ) { (10)
IndexObjectFieldDescriptor objectField = field.toObjectField();
IndexObjectFieldTypeDescriptor type = objectField.type();
boolean nested = type.nested();
// Etc.
}
} );
Collection<? extends AnalyzerDescriptor> analyzerDescriptors = indexDescriptor.analyzers(); (11)
for ( AnalyzerDescriptor analyzerDescriptor : analyzerDescriptors ) {
String analyzerName = analyzerDescriptor.name();
// ...
}
Optional<? extends AnalyzerDescriptor> analyzerDescriptor = indexDescriptor.analyzer( "some-analyzer-name" ); (12)
// ...
Collection<? extends NormalizerDescriptor> normalizerDescriptors = indexDescriptor.normalizers(); (13)
for ( NormalizerDescriptor normalizerDescriptor : normalizerDescriptors ) {
String normalizerName = normalizerDescriptor.name();
// ...
}
Optional<? extends NormalizerDescriptor> normalizerDescriptor = indexDescriptor.normalizer( "some-normalizer-name" ); (14)
// ...
Backend 和 IndexManager 也可以用于 retrieve the Elasticsearch REST client 或 retrieve Lucene analyzers。
The Backend and IndexManager can also be used to retrieve the Elasticsearch REST client or retrieve Lucene analyzers.
SearchMapping 还公开了一些方法,用于按照名称检索 IndexManager,甚至按照名称检索整个 Backend。
The SearchMapping also exposes methods to retrieve an IndexManager by name, or even a whole Backend by name.