Neo4j
This section walks you through setting up Neo4jVectorStore to store document embeddings and perform similarity searches.
Neo4j is an open-source NoSQL graph database. It is a fully transactional database (ACID) that stores data structured as graphs consisting of nodes, connected by relationships. Inspired by the structure of the real world, it allows for high query performance on complex data while remaining intuitive and simple for the developer.
Neo4j’s Vector Search allows users to query vector embeddings from large datasets. An embedding is a numerical representation of a data object, such as text, an image, audio, or a document. Embeddings can be stored on node properties and queried with the db.index.vector.queryNodes() procedure. These indexes are powered by Lucene and use a Hierarchical Navigable Small World (HNSW) graph to perform k approximate nearest neighbor (k-ANN) queries over the vector fields.
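To make the idea of a nearest-neighbor query over embeddings concrete, here is a minimal sketch in plain Java. It is not how Neo4j or Lucene implement the search: it performs an exact, brute-force scan using cosine similarity, whereas an HNSW index returns approximate results without scoring every vector. The class name NearestNeighbors and the two-dimensional sample vectors are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NearestNeighbors {

    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means the vectors point the same way.
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Exact k-NN: score every stored vector, sort by descending similarity,
    // and return the positions of the k best matches.
    public static List<Integer> topK(double[][] stored, double[] query, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < stored.length; i++) {
            ids.add(i);
        }
        ids.sort(Comparator.comparingDouble((Integer i) -> -cosine(stored[i], query)));
        return ids.subList(0, Math.min(k, ids.size()));
    }

    public static void main(String[] args) {
        // Hypothetical toy embeddings; real embeddings have many more dimensions.
        double[][] stored = { {1, 0}, {0, 1}, {0.9, 0.1} };
        System.out.println(topK(stored, new double[] {1, 0}, 2)); // nearest first
    }
}
```

A brute-force scan like this grows linearly with the number of stored vectors, which is the cost an approximate index is designed to avoid.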
Prerequisites
- A running Neo4j (5.15+) instance. The following options are available:
  - Docker image
  - Neo4j Server instance
- If required, an API key for the EmbeddingClient to generate the embeddings stored by the Neo4jVectorStore.
Dependencies
Add the Neo4j Vector Store dependency to your project:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-neo4j-store</artifactId>
</dependency>
or to your Gradle build.gradle build file.
dependencies {
    implementation 'org.springframework.ai:spring-ai-neo4j-store'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Configuration
To connect to Neo4j and use the Neo4jVectorStore, you need to provide access details for your instance. A simple configuration can be provided via Spring Boot’s application.properties:
spring.neo4j.uri=<uri_for_your_neo4j_instance>
spring.neo4j.authentication.username=<your_username>
spring.neo4j.authentication.password=<your_password>
# API key if needed, e.g. OpenAI
spring.ai.openai.api-key=<api-key>
Alternatively, use environment variables:
export SPRING_NEO4J_URI=<uri_for_your_neo4j_instance>
export SPRING_NEO4J_AUTHENTICATION_USERNAME=<your_username>
export SPRING_NEO4J_AUTHENTICATION_PASSWORD=<your_password>
# API key if needed, e.g. OpenAI
export SPRING_AI_OPENAI_API_KEY=<api-key>
You can also mix the two approaches. For example, you might store your API key as an environment variable and keep the rest in the plain application.properties file.
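The Neo4j property names and the environment variable names shown in this section correspond through Spring Boot's relaxed binding. As a rough sketch of the documented conversion rule (replace dots with underscores, remove dashes, convert to uppercase), the following plain Java helper derives an environment variable name from a property name. The class EnvNames and the method toEnvName are hypothetical names used only for this illustration; Spring Boot performs the binding itself and does not require such a helper.

```java
import java.util.Locale;

public class EnvNames {

    // Spring Boot's documented rule for canonical property names:
    // dots become underscores, dashes are removed, and the result is uppercased.
    public static String toEnvName(String property) {
        return property.replace('.', '_').replace("-", "").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toEnvName("spring.neo4j.uri"));
        System.out.println(toEnvName("spring.neo4j.authentication.username"));
        System.out.println(toEnvName("spring.neo4j.authentication.password"));
    }
}
```

Running the sketch prints the three Neo4j variable names used in the export statements of this section.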
If you choose to create a shell script for ease in future work, be sure to run it prior to starting your application by "sourcing" the file.
Besides application.properties and environment variables, Spring Boot offers additional configuration options.
Spring Boot’s auto-configuration for the Neo4j Driver creates a bean instance that is used by the Neo4jVectorStore.
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the Neo4j Vector Store. To enable it, add the following dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-neo4j-store-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle build file.
dependencies {
    implementation 'org.springframework.ai:spring-ai-neo4j-store-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
See the Neo4jVectorStore properties section for the default values and configuration options of the vector store.
Refer to the Repositories section to add Milestone and/or Snapshot Repositories to your build file.
Additionally, you will need a configured EmbeddingClient bean. Refer to the EmbeddingClient section for more information.
Here is an example of the needed bean:
@Bean
public EmbeddingClient embeddingClient() {
    // Can be any other EmbeddingClient implementation.
    return new OpenAiEmbeddingClient(new OpenAiApi(System.getenv("SPRING_AI_OPENAI_API_KEY")));
}
If the Spring Boot auto-configured Neo4j Driver bean is not what you want or need, you can still define your own bean. Read the Neo4j Java Driver reference for more in-depth information about configuring a custom driver.
@Bean
public Driver driver() {
    return GraphDatabase.driver("neo4j://<host>:<bolt-port>",
            AuthTokens.basic("<username>", "<password>"));
}
Now you can autowire the Neo4jVectorStore as a vector store in your application.
Metadata filtering
You can also use the generic, portable metadata filters with the Neo4j store.
For example, you can use either the text expression language:
vectorStore.similaritySearch(
        SearchRequest.defaults()
                .withQuery("The World")
                .withTopK(TOP_K)
                .withSimilarityThreshold(SIMILARITY_THRESHOLD)
                .withFilterExpression("author in ['john', 'jill'] && 'article_type' == 'blog'"));
or programmatically using the Filter.Expression DSL:
FilterExpressionBuilder b = new FilterExpressionBuilder();
vectorStore.similaritySearch(SearchRequest.defaults()
        .withQuery("The World")
        .withTopK(TOP_K)
        .withSimilarityThreshold(SIMILARITY_THRESHOLD)
        .withFilterExpression(b.and(
                // The first argument of in(...) is the metadata key to match.
                b.in("author", "john", "jill"),
                b.eq("article_type", "blog")).build()));
These (portable) filter expressions are automatically converted into the proprietary Neo4j filter format.
For example, this portable filter expression:
author in ['john', 'jill'] && 'article_type' == 'blog'
is converted into the proprietary Neo4j filter format:
node.`metadata.author` IN ["john","jill"] AND node.`metadata.'article_type'` = "blog"
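To show the shape of this conversion, the following plain Java sketch renders an IN check and an equality check as predicates on backtick-quoted node properties and joins them with AND. It is only an illustration of the output format shown above, not Spring AI's actual converter. The class CypherFilterSketch and its methods are hypothetical, the sketch passes metadata keys without the surrounding single quotes used in the text expression, and it does not escape quotes or backticks inside keys or values.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CypherFilterSketch {

    // Hypothetical helper: a metadata key becomes a backtick-quoted node property.
    static String property(String key) {
        return "node.`metadata." + key + "`";
    }

    // Wraps a string value in double quotes (no escaping in this sketch).
    static String quote(String value) {
        return "\"" + value + "\"";
    }

    // Renders: node.`metadata.<key>` IN ["v1","v2"]
    public static String in(String key, List<String> values) {
        String list = values.stream()
                .map(CypherFilterSketch::quote)
                .collect(Collectors.joining(","));
        return property(key) + " IN [" + list + "]";
    }

    // Renders: node.`metadata.<key>` = "value"
    public static String eq(String key, String value) {
        return property(key) + " = " + quote(value);
    }

    // Joins two predicates with AND.
    public static String and(String left, String right) {
        return left + " AND " + right;
    }

    public static void main(String[] args) {
        System.out.println(and(in("author", List.of("john", "jill")), eq("article_type", "blog")));
    }
}
```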
Neo4jVectorStore properties
You can use the following properties in your Spring Boot configuration to customize the Neo4j vector store.
Property | Default value |
---|---|
 | neo4j |
 | 1536 |
 | cosine |
 | Document |
 | embedding |
 | spring-ai-document-index |