Vector Databases

A vector database is a specialized type of database that plays an essential role in AI applications.

In vector databases, queries differ from those in traditional relational databases. Instead of exact matches, they perform similarity searches. When given a vector as a query, a vector database returns vectors that are “similar” to the query vector. Further high-level details on how this similarity is calculated are provided in the Vector Similarity section.
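
As a rough illustration of what “similar” can mean, the sketch below computes cosine similarity, one commonly used measure, between two vectors. The class and method names are hypothetical, and the metric that a given vector database actually uses may differ.

```java
// Hypothetical helper, not part of Spring AI.
class CosineSimilarity {

    // Returns the cosine of the angle between a and b, in the range [-1, 1].
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption: a zero vector is treated as having no similarity to anything.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors that point in the same direction score close to 1, while orthogonal vectors score 0.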

Vector databases are used to integrate your data with AI models. The first step in their usage is to load your data into a vector database. Then, when a user query is to be sent to the AI model, a set of similar documents is first retrieved. These documents then serve as the context for the user’s question and are sent to the AI model, along with the user’s query. This technique is known as Retrieval Augmented Generation (RAG).

The following sections describe the Spring AI interface for using multiple vector database implementations and some high-level sample usage.

The last section is intended to demystify the underlying approach of similarity searching in vector databases.

API Overview

This section serves as a guide to the VectorStore interface and its associated classes within the Spring AI framework.

Spring AI offers an abstracted API for interacting with vector databases through the VectorStore interface.

Here is the VectorStore interface definition:

public interface VectorStore {

    void add(List<Document> documents);

    Optional<Boolean> delete(List<String> idList);

    List<Document> similaritySearch(String query);

    List<Document> similaritySearch(SearchRequest request);
}

And the related SearchRequest builder:

public class SearchRequest {

	public final String query;
	private int topK = 4;
	private double similarityThreshold = SIMILARITY_THRESHOLD_ALL;
	private Filter.Expression filterExpression;

	public static SearchRequest query(String query) { return new SearchRequest(query); }

	private SearchRequest(String query) { this.query = query; }

	public SearchRequest withTopK(int topK) {...}
	public SearchRequest withSimilarityThreshold(double threshold) {...}
	public SearchRequest withSimilarityThresholdAll() {...}
	public SearchRequest withFilterExpression(Filter.Expression expression) {...}
	public SearchRequest withFilterExpression(String textExpression) {...}

	public String getQuery() {...}
	public int getTopK() {...}
	public double getSimilarityThreshold() {...}
	public Filter.Expression getFilterExpression() {...}
}

To insert data into the vector database, encapsulate it within a Document object. The Document class encapsulates content from a data source, such as a PDF or Word document, and includes text represented as a string. It also contains metadata in the form of key-value pairs, including details such as the filename.

Upon insertion into the vector database, the text content is transformed by an embedding model into a numerical array, or List<Double>, known as a vector embedding. Embedding models such as Word2Vec, GloVe, BERT, or OpenAI’s text-embedding-ada-002 convert words, sentences, or paragraphs into these vector embeddings.

The vector database’s role is to store these embeddings and support similarity searches over them. It does not generate the embeddings itself. To create vector embeddings, use the EmbeddingClient.
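
The division of labor described above can be sketched in plain Java. Everything here is hypothetical: ToyEmbedder is a stand-in for a real embedding model (it merely counts letters, whereas real embeddings are learned), and ToyVectorStore is not Spring AI's implementation. The point is only that the store receives an embedder and delegates to it rather than producing vectors itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for an embedding model: counts the letters a-z.
class ToyEmbedder implements Function<String, double[]> {
    @Override
    public double[] apply(String text) {
        double[] vector = new double[26];
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                vector[c - 'a']++;
            }
        }
        return vector;
    }
}

// Hypothetical store: keeps texts and their vectors, but delegates embedding.
class ToyVectorStore {
    private final Function<String, double[]> embedder;
    private final List<String> texts = new ArrayList<>();
    private final List<double[]> vectors = new ArrayList<>();

    ToyVectorStore(Function<String, double[]> embedder) {
        this.embedder = embedder;
    }

    // Embeds the text with the injected embedder, then stores both.
    void add(String text) {
        texts.add(text);
        vectors.add(embedder.apply(text));
    }

    double[] vectorOf(int index) {
        return vectors.get(index);
    }

    int size() {
        return texts.size();
    }
}
```

Swapping in a different embedder changes the vectors without touching the storage code, which mirrors how the embedding model and the vector database are kept separate.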

The similaritySearch methods in the interface allow for retrieving documents similar to a given query string. These methods can be fine-tuned by using the following parameters:

k: An integer that specifies the maximum number of similar documents to return. This is often referred to as a 'top K' search, or 'K nearest neighbors' (KNN). It corresponds to topK in SearchRequest, which is initialized to 4 in the listing above.
threshold: A double value ranging from 0 to 1, where values closer to 1 indicate higher similarity. For instance, if you set a threshold of 0.75, only documents with a similarity above this value are returned. In the listing above, the default is SIMILARITY_THRESHOLD_ALL.
Filter.Expression: A class used for passing a fluent DSL (Domain-Specific Language) expression that functions similarly to a 'where' clause in SQL, but it applies exclusively to the metadata key-value pairs of a Document.
filterExpression: An external DSL based on ANTLR4 that accepts filter expressions as strings. For example, with metadata keys like country, year, and isActive, you could use an expression such as: country == 'UK' && year >= 2020 && isActive == true.
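
To make the interplay of k and threshold concrete, here is a minimal, store-independent sketch. ScoredDocument and TopKSelector are hypothetical names, the scores are assumed to be precomputed similarities in the range 0 to 1, and treating the threshold as inclusive is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical pairing of a document id with its similarity to the query.
class ScoredDocument {
    final String id;
    final double score;

    ScoredDocument(String id, double score) {
        this.id = id;
        this.score = score;
    }
}

// Hypothetical helper showing how topK and a threshold narrow a result set.
class TopKSelector {

    static List<ScoredDocument> select(List<ScoredDocument> candidates, int topK, double threshold) {
        List<ScoredDocument> kept = new ArrayList<>();
        for (ScoredDocument candidate : candidates) {
            // Assumption: a score equal to the threshold is kept.
            if (candidate.score >= threshold) {
                kept.add(candidate);
            }
        }
        // Highest similarity first.
        kept.sort(Comparator.comparingDouble((ScoredDocument d) -> d.score).reversed());
        // Return at most topK results.
        return new ArrayList<>(kept.subList(0, Math.min(topK, kept.size())));
    }
}
```

With a threshold of 0.75 and k set to 2, a candidate list scored 0.90, 0.50, 0.80, and 0.76 is first reduced to the three documents at or above the threshold and then cut to the two best.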

Find more information on the Filter.Expression in the Metadata Filters section.

Available Implementations

These are the available implementations of the VectorStore interface:

Azure Vector Search - The Azure vector store.
ChromaVectorStore - The Chroma vector store.
MilvusVectorStore - The Milvus vector store.
Neo4jVectorStore - The Neo4j vector store.
PgVectorStore - The PostgreSQL/PGVector vector store.
PineconeVectorStore - The Pinecone vector store.
QdrantVectorStore - The Qdrant vector store.
RedisVectorStore - The Redis vector store.
WeaviateVectorStore - The Weaviate vector store.
SimpleVectorStore - A simple implementation of persistent vector storage, good for educational purposes.

More implementations may be supported in future releases.

If you have a vector database that needs to be supported by Spring AI, open an issue on GitHub or, even better, submit a pull request with an implementation.

Information on each of the VectorStore implementations can be found in the subsections of this chapter.

Example Usage

To compute the embeddings for a vector database, you need to pick an embedding model that matches the higher-level AI model being used.

For example, with OpenAI’s ChatGPT, we use the OpenAiEmbeddingClient and a model name of text-embedding-ada-002.

The Spring Boot starter’s auto-configuration for OpenAI makes an implementation of EmbeddingClient available in the Spring application context for dependency injection.

Loading data into a vector store is typically done in a batch-like job: first load the data into Spring AI’s Document class, then call the add method of the VectorStore.

Given a String reference to a source file that represents a JSON file with data we want to load into the vector database, we use Spring AI’s JsonReader to load specific fields from the JSON. The reader splits them into small pieces, which are then passed to the vector store implementation. The VectorStore implementation computes the embeddings (by delegating to the EmbeddingClient) and stores the JSON and the embeddings in the vector database:

@Autowired
VectorStore vectorStore;

void load(String sourceFile) {
    // Read only the listed fields from the JSON source.
    JsonReader jsonReader = new JsonReader(new FileSystemResource(sourceFile),
            "price", "name", "shortDescription", "description", "tags");
    List<Document> documents = jsonReader.get();
    // Hand the documents to the vector store for embedding and storage.
    this.vectorStore.add(documents);
}

Later, when a user question is passed to the AI model, a similarity search retrieves similar documents, which are then "stuffed" into the prompt as context for the user’s question.

String question = "<question from user>"; // placeholder for the user's question
List<Document> similarDocuments = this.vectorStore.similaritySearch(question);

Additional options can be passed to the similaritySearch method to define how many documents to retrieve and the threshold for the similarity search.
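
The "stuffing" step mentioned above can be sketched in plain Java. This is only an illustration under stated assumptions: PromptStuffer and the prompt wording are hypothetical, and the retrieved documents are represented by their text content as plain strings rather than by Spring AI's Document class.

```java
import java.util.List;

// Hypothetical helper that places retrieved text ahead of the user's question.
class PromptStuffer {

    static String buildPrompt(String question, List<String> documentTexts) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Use the following context to answer the question.\n\n");
        prompt.append("Context:\n");
        // Each retrieved document becomes one bullet of context.
        for (String text : documentTexts) {
            prompt.append("- ").append(text).append('\n');
        }
        prompt.append("\nQuestion: ").append(question).append('\n');
        return prompt.toString();
    }
}
```

The resulting string is what would be sent to the AI model, so the model sees the retrieved context together with the user's query.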

Metadata Filters

This section describes various filters that you can use against the results of a query.

Filter String

You can pass a SQL-like filter expression as a String to one of the similaritySearch overloads.

Consider the following examples:

"country == 'BG'"
"genre == 'drama' && year >= 2020"
"genre in ['comedy', 'documentary', 'drama']"

Filter.Expression

You can create an instance of Filter.Expression with a FilterExpressionBuilder that exposes a fluent API. A simple example is as follows:

FilterExpressionBuilder b = new FilterExpressionBuilder();
Expression expression = b.eq("country", "BG").build();

You can build up sophisticated expressions by using the following operators:

EQUALS: '=='
MINUS : '-'
PLUS: '+'
GT: '>'
GE: '>='
LT: '<'
LE: '<='
NE: '!='

You can combine expressions by using the following operators:

AND: 'AND' | 'and' | '&&';
OR: 'OR' | 'or' | '||';

Consider the following example:

Expression exp = b.and(b.eq("genre", "drama"), b.gte("year", 2020)).build();

You can also use the following operators:

IN: 'IN' | 'in';
NIN: 'NIN' | 'nin';
NOT: 'NOT' | 'not';

Consider the following example:

Expression exp = b.and(b.in("genre", "drama", "documentary"), b.not(b.lt("year", 2020))).build();
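
To show what such filters mean for a document's metadata, the sketch below expresses a few of the operators as plain Java predicates over a key-value map. MetadataFilters is a hypothetical helper written for this illustration. It is not how Spring AI or any particular vector store evaluates Filter.Expression, since stores typically translate the expression into their own native query form.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper: metadata filters as predicates over key-value pairs.
class MetadataFilters {

    // EQUALS: the metadata value for key must equal value.
    static Predicate<Map<String, Object>> eq(String key, Object value) {
        return metadata -> value.equals(metadata.get(key));
    }

    // GE: the metadata value for key must be a number >= value.
    static Predicate<Map<String, Object>> gte(String key, double value) {
        return metadata -> {
            Object actual = metadata.get(key);
            return actual instanceof Number && ((Number) actual).doubleValue() >= value;
        };
    }

    // IN: the metadata value for key must be one of the allowed values.
    static Predicate<Map<String, Object>> in(String key, Object... values) {
        List<Object> allowed = List.of(values);
        return metadata -> metadata.get(key) != null && allowed.contains(metadata.get(key));
    }
}
```

AND, OR, and NOT then map onto the and, or, and negate methods of Predicate. For example, eq("genre", "drama").and(gte("year", 2020)) accepts only metadata where genre is drama and year is at least 2020.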

Understanding Vectors
