Vector Databases
A vector database is a specialized type of database that plays an essential role in AI applications.
In vector databases, queries differ from those in traditional relational databases. Instead of exact matches, they perform similarity searches. When given a vector as a query, a vector database returns vectors that are “similar” to the query vector. Further details on how this similarity is calculated at a high level are provided in the Vector Similarity section.
Vector databases are used to integrate your data with AI models. The first step in their usage is to load your data into a vector database. Then, when a user query is to be sent to the AI model, a set of similar documents is first retrieved. These documents then serve as the context for the user’s question and are sent to the AI model, along with the user’s query. This technique is known as Retrieval Augmented Generation (RAG).
The following sections describe the Spring AI interface for working with multiple vector database implementations, along with some high-level sample usage.
The last section is intended to demystify the underlying approach of similarity searching in vector databases.
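As a preview of that discussion: most vector stores rank results with a similarity measure such as cosine similarity over the embedding vectors. A minimal, self-contained sketch of the computation (illustrative only; production stores rely on approximate nearest-neighbor indexes rather than brute-force comparison):

```java
public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1,
    // where values closer to 1 indicate more similar vectors.
    public static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v1 = {1.0f, 0.0f};
        float[] v2 = {1.0f, 0.0f};
        float[] v3 = {0.0f, 1.0f};
        System.out.println(cosine(v1, v2)); // identical vectors -> 1.0
        System.out.println(cosine(v1, v3)); // orthogonal vectors -> 0.0
    }
}
```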
API Overview
This section serves as a guide to the VectorStore interface and its associated classes within the Spring AI framework.
Spring AI offers an abstracted API for interacting with vector databases through the VectorStore interface.
Here is the VectorStore interface definition:
```java
public interface VectorStore extends DocumentWriter {

    default String getName() {
        return this.getClass().getSimpleName();
    }

    void add(List<Document> documents);

    void delete(List<String> idList);

    void delete(Filter.Expression filterExpression);

    default void delete(String filterExpression) { ... };

    List<Document> similaritySearch(String query);

    List<Document> similaritySearch(SearchRequest request);

    default <T> Optional<T> getNativeClient() {
        return Optional.empty();
    }
}
```
and the related SearchRequest builder:
```java
public class SearchRequest {

    public static final double SIMILARITY_THRESHOLD_ACCEPT_ALL = 0.0;

    public static final int DEFAULT_TOP_K = 4;

    private String query = "";

    private int topK = DEFAULT_TOP_K;

    private double similarityThreshold = SIMILARITY_THRESHOLD_ACCEPT_ALL;

    @Nullable
    private Filter.Expression filterExpression;

    public static Builder from(SearchRequest originalSearchRequest) {
        return builder().query(originalSearchRequest.getQuery())
            .topK(originalSearchRequest.getTopK())
            .similarityThreshold(originalSearchRequest.getSimilarityThreshold())
            .filterExpression(originalSearchRequest.getFilterExpression());
    }

    public static class Builder {

        private final SearchRequest searchRequest = new SearchRequest();

        public Builder query(String query) {
            Assert.notNull(query, "Query can not be null.");
            this.searchRequest.query = query;
            return this;
        }

        public Builder topK(int topK) {
            Assert.isTrue(topK >= 0, "TopK should be positive.");
            this.searchRequest.topK = topK;
            return this;
        }

        public Builder similarityThreshold(double threshold) {
            Assert.isTrue(threshold >= 0 && threshold <= 1, "Similarity threshold must be in [0,1] range.");
            this.searchRequest.similarityThreshold = threshold;
            return this;
        }

        public Builder similarityThresholdAll() {
            this.searchRequest.similarityThreshold = 0.0;
            return this;
        }

        public Builder filterExpression(@Nullable Filter.Expression expression) {
            this.searchRequest.filterExpression = expression;
            return this;
        }

        public Builder filterExpression(@Nullable String textExpression) {
            this.searchRequest.filterExpression = (textExpression != null)
                ? new FilterExpressionTextParser().parse(textExpression) : null;
            return this;
        }

        public SearchRequest build() {
            return this.searchRequest;
        }
    }

    public String getQuery() {...}

    public int getTopK() {...}

    public double getSimilarityThreshold() {...}

    public Filter.Expression getFilterExpression() {...}
}
```
To insert data into the vector database, encapsulate it within a Document object. The Document class encapsulates content from a data source, such as a PDF or Word document, and includes text represented as a string. It also contains metadata in the form of key-value pairs, including details such as the filename.
Upon insertion into the vector database, the text content is transformed into a numerical array, or float[], known as a vector embedding, using an embedding model. Embedding models, such as Word2Vec, GLoVE, and BERT, or OpenAI’s text-embedding-ada-002, are used to convert words, sentences, or paragraphs into these vector embeddings.
The vector database’s role is to store and facilitate similarity searches for these embeddings. It does not generate the embeddings itself. For creating vector embeddings, the EmbeddingModel should be used.
The similaritySearch methods in the interface allow for retrieving documents similar to a given query string. These methods can be fine-tuned by using the following parameters:
- k: An integer that specifies the maximum number of similar documents to return. This is often referred to as a 'top K' search, or 'K nearest neighbors' (KNN).
- threshold: A double value ranging from 0 to 1, where values closer to 1 indicate higher similarity. For example, if you set a threshold of 0.75, only documents with a similarity above this value are returned.
- Filter.Expression: A class used for passing a fluent DSL (Domain-Specific Language) expression that functions similarly to a 'where' clause in SQL, but it applies exclusively to the metadata key-value pairs of a Document.
- filterExpression: An external DSL based on ANTLR4 that accepts filter expressions as strings. For example, with metadata keys like country, year, and isActive, you could use an expression such as: country == 'UK' && year >= 2020 && isActive == true.
Find more information on the Filter.Expression in the Metadata Filters section.
Schema Initialization
Some vector stores require their backend schema to be initialized before usage. It will not be initialized for you by default. You must opt in by passing a boolean for the appropriate constructor argument or, if using Spring Boot, by setting the appropriate initialize-schema property to true in application.properties or application.yml. Check the documentation for the vector store you are using for the specific property name.
Batching Strategy
When working with vector stores, it’s often necessary to embed large numbers of documents. While it might seem straightforward to make a single call to embed all documents at once, this approach can lead to issues. Embedding models process text as tokens and have a maximum token limit, often referred to as the context window size. This limit restricts the amount of text that can be processed in a single embedding request. Attempting to embed too many tokens in one call can result in errors or truncated embeddings.
To address this token limit, Spring AI implements a batching strategy. This approach breaks down large sets of documents into smaller batches that fit within the embedding model’s maximum context window. Batching not only solves the token limit issue but can also lead to improved performance and more efficient use of API rate limits.
Spring AI provides this functionality through the BatchingStrategy interface, which allows for processing documents in sub-batches based on their token counts.
The core BatchingStrategy interface is defined as follows:
```java
public interface BatchingStrategy {
    List<List<Document>> batch(List<Document> documents);
}
```
This interface defines a single method, batch, which takes a list of documents and returns a list of document batches.
Default Implementation
Spring AI provides a default implementation called TokenCountBatchingStrategy. This strategy batches documents based on their token counts, ensuring that each batch does not exceed a calculated maximum input token count.
Key features of TokenCountBatchingStrategy:
- Uses OpenAI’s max input token count (8191) as the default upper limit.
- Incorporates a reserve percentage (default 10%) to provide a buffer for potential overhead.
- Calculates the actual max input token count as: actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)
The strategy estimates the token count for each document, groups them into batches without exceeding the max input token count, and throws an exception if a single document exceeds this limit.
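With these defaults, the usable budget per request works out as follows (plain-Java illustration of the formula above; the exact rounding Spring AI applies internally may differ):

```java
public class TokenBudget {
    public static void main(String[] args) {
        int originalMaxInputTokenCount = 8191; // OpenAI's default upper limit
        double reservePercentage = 0.10;       // 10% buffer for overhead

        // actualMaxInputTokenCount = originalMaxInputTokenCount * (1 - RESERVE_PERCENTAGE)
        int actualMaxInputTokenCount =
                (int) Math.floor(originalMaxInputTokenCount * (1 - reservePercentage));

        System.out.println(actualMaxInputTokenCount); // 7371
    }
}
```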
You can also customize the TokenCountBatchingStrategy to better suit your specific requirements. This can be done by creating a new instance with custom parameters in a Spring Boot @Configuration class.
Here’s an example of how to create a custom TokenCountBatchingStrategy bean:
```java
@Configuration
public class EmbeddingConfig {
    @Bean
    public BatchingStrategy customTokenCountBatchingStrategy() {
        return new TokenCountBatchingStrategy(
            EncodingType.CL100K_BASE,  // Specify the encoding type
            8000,                      // Set the maximum input token count
            0.1                        // Set the reserve percentage
        );
    }
}
```
In this configuration:
- EncodingType.CL100K_BASE: Specifies the encoding type used for tokenization. This encoding type is used by the JTokkitTokenCountEstimator to accurately estimate token counts.
- 8000: Sets the maximum input token count. This value should be less than or equal to the maximum context window size of your embedding model.
- 0.1: Sets the reserve percentage. The percentage of tokens to reserve from the max input token count. This creates a buffer for potential token count increases during processing.
By default, this constructor uses Document.DEFAULT_CONTENT_FORMATTER for content formatting and MetadataMode.NONE for metadata handling. If you need to customize these parameters, you can use the full constructor with additional parameters.
Once defined, this custom TokenCountBatchingStrategy bean will be automatically used by the EmbeddingModel implementations in your application, replacing the default strategy.
The TokenCountBatchingStrategy internally uses a TokenCountEstimator (specifically, JTokkitTokenCountEstimator) to calculate token counts for efficient batching. This ensures accurate token estimation based on the specified encoding type.
Additionally, TokenCountBatchingStrategy provides flexibility by allowing you to pass in your own implementation of the TokenCountEstimator interface. This feature enables you to use custom token counting strategies tailored to your specific needs. For example:
```java
TokenCountEstimator customEstimator = new YourCustomTokenCountEstimator();
TokenCountBatchingStrategy strategy = new TokenCountBatchingStrategy(
    this.customEstimator,
    8000,  // maxInputTokenCount
    0.1,   // reservePercentage
    Document.DEFAULT_CONTENT_FORMATTER,
    MetadataMode.NONE
);
```
Working with Auto-Truncation
Some embedding models, such as Vertex AI text embedding, support an auto_truncate feature. When enabled, the model silently truncates text inputs that exceed the maximum size and continues processing; when disabled, it throws an explicit error for inputs that are too large.
When using auto-truncation with the batching strategy, you must configure your batching strategy with a much higher input token count than the model’s actual maximum. This prevents the batching strategy from raising exceptions for large documents, allowing the embedding model to handle truncation internally.
Configuration for Auto-Truncation
When enabling auto-truncation, set your batching strategy’s maximum input token count much higher than the model’s actual limit. This prevents the batching strategy from raising exceptions for large documents, allowing the embedding model to handle truncation internally.
Here’s an example configuration for using Vertex AI with auto-truncation and a custom BatchingStrategy, and then using them in the PgVectorStore:
```java
@Configuration
public class AutoTruncationEmbeddingConfig {

    @Bean
    public VertexAiTextEmbeddingModel vertexAiEmbeddingModel(
            VertexAiEmbeddingConnectionDetails connectionDetails) {

        VertexAiTextEmbeddingOptions options = VertexAiTextEmbeddingOptions.builder()
            .model(VertexAiTextEmbeddingOptions.DEFAULT_MODEL_NAME)
            .autoTruncate(true) // Enable auto-truncation
            .build();

        return new VertexAiTextEmbeddingModel(connectionDetails, options);
    }

    @Bean
    public BatchingStrategy batchingStrategy() {
        // Only use a high token limit if auto-truncation is enabled in your embedding model.
        // Set a much higher token count than the model actually supports
        // (e.g., 132,900 when Vertex AI supports only up to 20,000)
        return new TokenCountBatchingStrategy(
            EncodingType.CL100K_BASE,
            132900, // Artificially high limit
            0.1     // 10% reserve
        );
    }

    @Bean
    public VectorStore vectorStore(JdbcTemplate jdbcTemplate, EmbeddingModel embeddingModel, BatchingStrategy batchingStrategy) {
        return PgVectorStore.builder(jdbcTemplate, embeddingModel)
            // other properties omitted here
            .build();
    }
}
```
In this configuration:
- The embedding model has auto-truncation enabled, allowing it to handle oversized inputs gracefully.
- The batching strategy uses an artificially high token limit (132,900) that’s much larger than the actual model limit (20,000).
- The vector store uses the configured embedding model and the custom BatchingStrategy bean.
Why This Works
This approach works because:
- The TokenCountBatchingStrategy checks whether any single document exceeds the configured maximum and throws an IllegalArgumentException if it does.
- By setting a very high limit in the batching strategy, we ensure that this check never fails.
- Documents or batches exceeding the model’s limit are silently truncated and processed by the embedding model’s auto-truncation feature.
Best Practices
When using auto-truncation:
- Set the batching strategy’s max input token count to be at least 5-10x larger than the model’s actual limit to avoid premature exceptions from the batching strategy.
- Monitor your logs for truncation warnings from the embedding model (note: not all models log truncation events).
- Consider the implications of silent truncation on your embedding quality.
- Test with sample documents to ensure truncated embeddings still meet your requirements.
- Document this configuration for future maintainers, as it is non-standard.
While auto-truncation prevents errors, it can result in incomplete embeddings. Important information at the end of long documents may be lost. If your application requires all content to be embedded, split documents into smaller chunks before embedding.
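As an illustration of pre-chunking, here is a naive, self-contained character-based splitter with overlap. It is a sketch only; Spring AI users would more typically split on token or sentence boundaries with a document splitter such as TokenTextSplitter.

```java
import java.util.ArrayList;
import java.util.List;

// Naive illustration of pre-chunking: split text into fixed-size character
// windows with a small overlap, so content cut at a boundary is not lost entirely.
public class SimpleChunker {

    public static List<String> chunk(String text, int chunkSize, int overlap) {
        if (overlap >= chunkSize) {
            throw new IllegalArgumentException("overlap must be smaller than chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            chunks.add(text.substring(start, Math.min(start + chunkSize, text.length())));
            if (start + chunkSize >= text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```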
Spring Boot Auto-Configuration
If you’re using Spring Boot auto-configuration, you must provide a custom BatchingStrategy bean to override the default one that comes with Spring AI:
```java
@Bean
public BatchingStrategy customBatchingStrategy() {
    // This bean will override the default BatchingStrategy
    return new TokenCountBatchingStrategy(
        EncodingType.CL100K_BASE,
        132900, // Much higher than model's actual limit
        0.1
    );
}
```
The presence of this bean in your application context will automatically replace the default batching strategy used by all vector stores.
Custom Implementation
While TokenCountBatchingStrategy provides a robust default implementation, you can customize the batching strategy to fit your specific needs. This can be done through Spring Boot’s auto-configuration.
To customize the batching strategy, define a BatchingStrategy bean in your Spring Boot application:
```java
@Configuration
public class EmbeddingConfig {
    @Bean
    public BatchingStrategy customBatchingStrategy() {
        return new CustomBatchingStrategy();
    }
}
```
This custom BatchingStrategy will then be automatically used by the EmbeddingModel implementations in your application.
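The CustomBatchingStrategy referenced above is a placeholder. As a hypothetical sketch of what such a class could look like, here is a strategy that batches a fixed number of documents at a time. Minimal stand-ins for the Spring AI Document and BatchingStrategy types are declared so the sketch is self-contained; a real implementation would import them from Spring AI instead.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-ins, declared only so this sketch compiles on its own.
// A real application would use org.springframework.ai.document.Document
// and the Spring AI BatchingStrategy interface instead.
record Document(String text) {}

interface BatchingStrategy {
    List<List<Document>> batch(List<Document> documents);
}

// Hypothetical strategy: a fixed number of documents per batch, ignoring token counts.
class CustomBatchingStrategy implements BatchingStrategy {

    private final int batchSize;

    CustomBatchingStrategy(int batchSize) {
        this.batchSize = batchSize;
    }

    @Override
    public List<List<Document>> batch(List<Document> documents) {
        List<List<Document>> batches = new ArrayList<>();
        for (int i = 0; i < documents.size(); i += this.batchSize) {
            batches.add(documents.subList(i, Math.min(i + this.batchSize, documents.size())));
        }
        return batches;
    }
}
```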
Vector stores supported by Spring AI are configured to use the default TokenCountBatchingStrategy.
VectorStore Implementations
These are the available implementations of the VectorStore interface:
- Azure Vector Search - The Azure vector store.
- Apache Cassandra - The Apache Cassandra vector store.
- Chroma Vector Store - The Chroma vector store.
- Elasticsearch Vector Store - The Elasticsearch vector store.
- GemFire Vector Store - The GemFire vector store.
- MariaDB Vector Store - The MariaDB vector store.
- Milvus Vector Store - The Milvus vector store.
- MongoDB Atlas Vector Store - The MongoDB Atlas vector store.
- Neo4j Vector Store - The Neo4j vector store.
- OpenSearch Vector Store - The OpenSearch vector store.
- Oracle Vector Store - The Oracle Database vector store.
- PgVector Store - The PostgreSQL/PGVector vector store.
- Pinecone Vector Store - The Pinecone vector store.
- Qdrant Vector Store - The Qdrant vector store.
- Redis Vector Store - The Redis vector store.
- SAP Hana Vector Store - The SAP HANA vector store.
- Typesense Vector Store - The Typesense vector store.
- Weaviate Vector Store - The Weaviate vector store.
- SimpleVectorStore - A simple implementation of persistent vector storage, good for educational purposes.
More implementations may be supported in future releases.
If you have a vector database that needs to be supported by Spring AI, open an issue on GitHub or, even better, submit a pull request with an implementation.
Information on each of the VectorStore implementations can be found in the subsections of this chapter.
Example Usage
To compute the embeddings for a vector database, you need to pick an embedding model that matches the higher-level AI model being used.
For example, with OpenAI’s ChatGPT, we use the OpenAiEmbeddingModel and a model named text-embedding-ada-002.
The Spring Boot starter’s auto-configuration for OpenAI makes an implementation of EmbeddingModel available in the Spring application context for dependency injection.
The general usage of loading data into a vector store is something you would do in a batch-like job, by first loading data into Spring AI’s Document class and then calling the add method.
Given a String reference to a JSON file containing the data we want to load into the vector database, we use Spring AI’s JsonReader to load specific fields in the JSON, split them into small pieces, and then pass those small pieces to the vector store implementation. The VectorStore implementation computes the embeddings and stores the JSON and the embedding in the vector database:
```java
@Autowired
VectorStore vectorStore;

void load(String sourceFile) {
    JsonReader jsonReader = new JsonReader(new FileSystemResource(sourceFile),
            "price", "name", "shortDescription", "description", "tags");
    List<Document> documents = jsonReader.get();
    this.vectorStore.add(documents);
}
```
Later, when a user question is passed into the AI model, a similarity search is done to retrieve similar documents, which are then "stuffed" into the prompt as context for the user’s question.
```java
String question = "..."; // the user's question
List<Document> similarDocuments = vectorStore.similaritySearch(question);
```
Additional options can be passed into the similaritySearch method to define how many documents to retrieve and the threshold of the similarity search.
Metadata Filters
This section describes various filters that you can use against the results of a query.
Filter String
You can pass in a SQL-like filter expression as a String to one of the similaritySearch overloads.
Consider the following examples:
- "country == 'BG'"
- "genre == 'drama' && year >= 2020"
- "genre in ['comedy', 'documentary', 'drama']"
Filter.Expression
You can create an instance of Filter.Expression with a FilterExpressionBuilder that exposes a fluent API. A simple example is as follows:
```java
FilterExpressionBuilder b = new FilterExpressionBuilder();
Expression expression = b.eq("country", "BG").build();
```
You can build up sophisticated expressions by using the following operators:
```
EQUALS: '=='
MINUS : '-'
PLUS: '+'
GT: '>'
GE: '>='
LT: '<'
LE: '<='
NE: '!='
```
You can combine expressions by using the following operators:
```
AND: 'AND' | 'and' | '&&';
OR: 'OR' | 'or' | '||';
```
Consider the following example:
```java
Expression exp = b.and(b.eq("genre", "drama"), b.gte("year", 2020)).build();
```
You can also use the following operators:
```
IN: 'IN' | 'in';
NIN: 'NIN' | 'nin';
NOT: 'NOT' | 'not';
```
Consider the following example:
```java
Expression exp = b.and(b.in("genre", "drama", "documentary"), b.not(b.lt("year", 2020))).build();
```
Deleting Documents from Vector Store
The Vector Store interface provides multiple methods for deleting documents, allowing you to remove data either by specific document IDs or using filter expressions.
Delete by Document IDs
The simplest way to delete documents is by providing a list of document IDs:
```java
void delete(List<String> idList);
```
This method removes all documents whose IDs match those in the provided list. If any ID in the list doesn’t exist in the store, it will be ignored.
```java
// Create and add document
Document document = new Document("The World is Big",
        Map.of("country", "Netherlands"));
vectorStore.add(List.of(document));

// Delete document by ID
vectorStore.delete(List.of(document.getId()));
```
Delete by Filter Expression
For more complex deletion criteria, you can use filter expressions:
```java
void delete(Filter.Expression filterExpression);
```
This method accepts a Filter.Expression object that defines the criteria for which documents should be deleted. It’s particularly useful when you need to delete documents based on their metadata properties.
```java
// Create test documents with different metadata
Document bgDocument = new Document("The World is Big",
        Map.of("country", "Bulgaria"));
Document nlDocument = new Document("The World is Big",
        Map.of("country", "Netherlands"));

// Add documents to the store
vectorStore.add(List.of(bgDocument, nlDocument));

// Delete documents from Bulgaria using filter expression
Filter.Expression filterExpression = new Filter.Expression(
        Filter.ExpressionType.EQ,
        new Filter.Key("country"),
        new Filter.Value("Bulgaria")
);
vectorStore.delete(filterExpression);

// Verify deletion with search
SearchRequest request = SearchRequest.builder()
        .query("World")
        .filterExpression("country == 'Bulgaria'")
        .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will be empty as the Bulgarian document was deleted
```
Delete by String Filter Expression
For convenience, you can also delete documents using a string-based filter expression:
```java
void delete(String filterExpression);
```
This method converts the provided string filter into a Filter.Expression object internally. It’s useful when you have filter criteria in string format.
```java
// Create and add documents
Document bgDocument = new Document("The World is Big",
        Map.of("country", "Bulgaria"));
Document nlDocument = new Document("The World is Big",
        Map.of("country", "Netherlands"));
vectorStore.add(List.of(bgDocument, nlDocument));

// Delete Bulgarian documents using string filter
vectorStore.delete("country == 'Bulgaria'");

// Verify remaining documents
SearchRequest request = SearchRequest.builder()
        .query("World")
        .topK(5)
        .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will only contain the Netherlands document
```
Error Handling When Calling the Delete API
All deletion methods may throw exceptions in case of errors. The best practice is to wrap delete operations in try-catch blocks:
```java
try {
    vectorStore.delete("country == 'Bulgaria'");
}
catch (Exception e) {
    logger.error("Invalid filter expression", e);
}
```
Document Versioning Use Case
A common scenario is managing document versions where you need to upload a new version of a document while removing the old version. Here’s how to handle this using filter expressions:
```java
// Create initial document (v1) with version metadata
Document documentV1 = new Document(
        "AI and Machine Learning Best Practices",
        Map.of(
            "docId", "AIML-001",
            "version", "1.0",
            "lastUpdated", "2024-01-01"
        )
);

// Add v1 to the vector store
vectorStore.add(List.of(documentV1));

// Create updated version (v2) of the same document
Document documentV2 = new Document(
        "AI and Machine Learning Best Practices - Updated",
        Map.of(
            "docId", "AIML-001",
            "version", "2.0",
            "lastUpdated", "2024-02-01"
        )
);

// First, delete the old version using filter expression
Filter.Expression deleteOldVersion = new Filter.Expression(
        Filter.ExpressionType.AND,
        Arrays.asList(
            new Filter.Expression(
                Filter.ExpressionType.EQ,
                new Filter.Key("docId"),
                new Filter.Value("AIML-001")
            ),
            new Filter.Expression(
                Filter.ExpressionType.EQ,
                new Filter.Key("version"),
                new Filter.Value("1.0")
            )
        )
);
vectorStore.delete(deleteOldVersion);

// Add the new version
vectorStore.add(List.of(documentV2));

// Verify only v2 exists
SearchRequest request = SearchRequest.builder()
        .query("AI and Machine Learning")
        .filterExpression("docId == 'AIML-001'")
        .build();
List<Document> results = vectorStore.similaritySearch(request);
// results will contain only v2 of the document
```
You can also accomplish the same using the string filter expression:
```java
// Delete old version using string filter
vectorStore.delete("docId == 'AIML-001' AND version == '1.0'");

// Add new version
vectorStore.add(List.of(documentV2));
```
Performance Considerations While Deleting Documents
- Deleting by ID list is generally faster when you know exactly which documents to remove.
- Filter-based deletion may require scanning the index to find matching documents; however, this is vector store implementation-specific.
- Large deletion operations should be batched to avoid overwhelming the system.
- Consider using filter expressions when deleting based on document properties rather than collecting IDs first.
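For example, the batching advice for large deletes can be applied to ID-based deletion by partitioning the ID list and issuing one delete call per partition. A self-contained sketch of the partitioning (the delete call in the usage note refers to the VectorStore method shown earlier; the batch size is an arbitrary illustrative choice):

```java
import java.util.ArrayList;
import java.util.List;

// Split a large list of document IDs into fixed-size batches, so that each
// batch can be passed to VectorStore#delete(List<String>) separately.
public class BatchedDelete {

    public static List<List<String>> partition(List<String> ids, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            batches.add(ids.subList(i, Math.min(i + batchSize, ids.size())));
        }
        return batches;
    }
}
```

Usage against a VectorStore bean would then look like: for (List<String> batch : BatchedDelete.partition(allIds, 1000)) { vectorStore.delete(batch); }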