Pinecone

This section walks you through setting up the Pinecone VectorStore to store document embeddings and perform similarity searches.

Pinecone is a popular cloud-based vector database, which allows you to store and search vectors efficiently.

Prerequisites

  1. Pinecone Account: Before you start, sign up for a Pinecone account.

  2. Pinecone Project: Once registered, generate an API key and create an index. You’ll need these details for configuration.

  3. EmbeddingModel instance to compute the document embeddings. Several options are available:

    • If required, an API key for the EmbeddingModel to generate the embeddings stored by the PineconeVectorStore.

To set up PineconeVectorStore, gather the following details from your Pinecone account:

  • Pinecone API Key

  • Pinecone Index Name

  • Pinecone Namespace

This information is available in the Pinecone UI portal. Namespace support is not available in the Pinecone free tier.
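
Before wiring these values into the vector store, it can help to collect and validate them in one place. The following is a minimal sketch, not part of Spring AI: the `PineconeSettings` record and the environment variable names are hypothetical, and the namespace is treated as optional because the free tier does not support it.

```java
import java.util.Map;
import java.util.Optional;

/** Holds the three connection details gathered from the Pinecone UI portal. */
record PineconeSettings(String apiKey, String indexName, Optional<String> namespace) {

    /**
     * Builds settings from a map of environment variables.
     * The variable names are hypothetical; use whatever your deployment defines.
     */
    static PineconeSettings fromEnv(Map<String, String> env) {
        String apiKey = require(env, "PINECONE_API_KEY");
        String indexName = require(env, "PINECONE_INDEX_NAME");
        // The namespace is optional: a missing or blank value becomes Optional.empty().
        Optional<String> namespace = Optional.ofNullable(env.get("PINECONE_NAMESPACE"))
                .map(String::trim)
                .filter(s -> !s.isEmpty());
        return new PineconeSettings(apiKey, indexName, namespace);
    }

    /** Returns the trimmed value, or fails fast when it is missing or blank. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required setting: " + name);
        }
        return value.trim();
    }
}
```

A typical call would be `PineconeSettings.fromEnv(System.getenv())`, which fails at startup instead of at the first request when a required value is absent.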

Auto-configuration

The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. Please refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the Pinecone Vector Store. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-vector-store-pinecone</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-pinecone'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Refer to the Artifact Repositories section to add Maven Central and/or Snapshot Repositories to your build file.

Additionally, you will need a configured EmbeddingModel bean. Refer to the EmbeddingModel section for more information.

Here is an example of the needed bean:

@Bean
public EmbeddingModel embeddingModel() {
    // Can be any other EmbeddingModel implementation.
    return new OpenAiEmbeddingModel(new OpenAiApi(System.getenv("OPENAI_API_KEY")));
}

To connect to Pinecone, you need to provide access details for your instance. A simple configuration can be provided via Spring Boot’s application.properties:

spring.ai.vectorstore.pinecone.api-key=<your api key>
spring.ai.vectorstore.pinecone.index-name=<your index name>

# API key if needed, e.g. OpenAI
spring.ai.openai.api-key=<api-key>

Please have a look at the list of configuration properties for the vector store to learn about the default values and configuration options.

Now you can auto-wire the Pinecone Vector Store in your application and use it:

@Autowired VectorStore vectorStore;

// ...

List<Document> documents = List.of(
    new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
    new Document("The World is Big and Salvation Lurks Around the Corner"),
    new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

// Add the documents
vectorStore.add(documents);

// Retrieve documents similar to a query
List<Document> results = this.vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());

Configuration properties

You can use the following properties in your Spring Boot configuration to customize the Pinecone vector store.

| Property | Description | Default value |
|---|---|---|
| spring.ai.vectorstore.pinecone.api-key | Pinecone API Key | - |
| spring.ai.vectorstore.pinecone.index-name | Pinecone index name | - |
| spring.ai.vectorstore.pinecone.namespace | Pinecone namespace | - |
| spring.ai.vectorstore.pinecone.content-field-name | Pinecone metadata field name used to store the original text content. | document_content |
| spring.ai.vectorstore.pinecone.distance-metadata-field-name | Pinecone metadata field name used to store the computed distance. | distance |
| spring.ai.vectorstore.pinecone.server-side-timeout | | 20 sec. |

Metadata filtering

You can leverage the generic, portable metadata filters with the Pinecone store.

For example, you can use either the text expression language:

vectorStore.similaritySearch(
    SearchRequest.builder()
    .query("The World")
    .topK(TOP_K)
    .similarityThreshold(SIMILARITY_THRESHOLD)
    .filterExpression("author in ['john', 'jill'] && article_type == 'blog'").build());

or programmatically using the Filter.Expression DSL:

FilterExpressionBuilder b = new FilterExpressionBuilder();

vectorStore.similaritySearch(SearchRequest.builder()
    .query("The World")
    .topK(TOP_K)
    .similarityThreshold(SIMILARITY_THRESHOLD)
    .filterExpression(b.and(
        b.in("author","john", "jill"),
        b.eq("article_type", "blog")).build()).build());

These filter expressions are converted into the equivalent Pinecone filters.
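
To give a feel for what the converted filter looks like, the sketch below builds the roughly equivalent Pinecone filter for the expression above as nested maps, using Pinecone's `$and`, `$in`, and `$eq` operators. The `PineconeFilterSketch` class and its helpers are hypothetical and written only for illustration; this is not Spring AI's actual converter, and its exact output may differ in detail.

```java
import java.util.List;
import java.util.Map;

/** Illustrates the shape of a Pinecone metadata filter; not Spring AI's converter. */
final class PineconeFilterSketch {

    /** field IN (values...) expressed with Pinecone's $in operator. */
    static Map<String, Object> in(String field, Object... values) {
        return Map.of(field, Map.of("$in", List.of(values)));
    }

    /** field == value expressed with Pinecone's $eq operator. */
    static Map<String, Object> eq(String field, Object value) {
        return Map.of(field, Map.of("$eq", value));
    }

    /** Logical conjunction of several filters using $and. */
    @SafeVarargs
    static Map<String, Object> and(Map<String, Object>... operands) {
        return Map.of("$and", List.of(operands));
    }

    /** author in ['john', 'jill'] && article_type == 'blog' */
    static Map<String, Object> example() {
        return and(in("author", "john", "jill"), eq("article_type", "blog"));
    }
}
```

Serialized as JSON, the result of `example()` corresponds to `{"$and": [{"author": {"$in": ["john", "jill"]}}, {"article_type": {"$eq": "blog"}}]}`.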

Manual Configuration

If you prefer to configure PineconeVectorStore manually, you can do so by using the PineconeVectorStore#Builder.

Add these dependencies to your project:

  • OpenAI: Required for calculating embeddings.

<dependency>
	<groupId>org.springframework.ai</groupId>
	<artifactId>spring-ai-starter-model-openai</artifactId>
</dependency>
  • Pinecone

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pinecone-store</artifactId>
</dependency>

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Sample Code

To configure Pinecone in your application, you can use the following setup:

@Bean
public VectorStore pineconeVectorStore(EmbeddingModel embeddingModel) {
    return PineconeVectorStore.builder(embeddingModel)
            .apiKey(PINECONE_API_KEY)
            .indexName(PINECONE_INDEX_NAME)
            .namespace(PINECONE_NAMESPACE) // the free tier doesn't support namespaces.
            .contentFieldName(CUSTOM_CONTENT_FIELD_NAME) // optional field to store the original content. Defaults to `document_content`
            .build();
}

In your main code, create some documents:

List<Document> documents = List.of(
	new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
	new Document("The World is Big and Salvation Lurks Around the Corner"),
	new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));

Add the documents to Pinecone:

vectorStore.add(documents);

And finally, retrieve documents similar to a query:

List<Document> results = vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());

If all goes well, you should retrieve the document containing the text "Spring AI rocks!!".
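
Conceptually, a top-K similarity search ranks stored embeddings by their closeness to the query embedding and returns the best K. The sketch below is a brute-force illustration in plain Java under assumed conditions: it uses cosine similarity, whereas the metric Pinecone actually applies depends on how the index was created, and Pinecone performs the search server-side rather than by scanning every vector. The `TopKSketch` class is hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Brute-force top-K ranking by cosine similarity, for illustration only. */
final class TopKSketch {

    /** Cosine similarity between two non-zero vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the ids of the k stored vectors most similar to the query, best first. */
    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                // Negate the score so that the highest similarity sorts first.
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .toList();
    }
}
```

In the real setup the embeddings come from the configured EmbeddingModel, and the `topK(5)` value in the search request plays the role of `k` here.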

Accessing the Native Client

The Pinecone Vector Store implementation provides access to the underlying native Pinecone client (PineconeConnection) through the getNativeClient() method:

PineconeVectorStore vectorStore = context.getBean(PineconeVectorStore.class);
Optional<PineconeConnection> nativeClient = vectorStore.getNativeClient();

if (nativeClient.isPresent()) {
    PineconeConnection client = nativeClient.get();
    // Use the native client for Pinecone-specific operations
}

The native client gives you access to Pinecone-specific features and operations that might not be exposed through the VectorStore interface.