Google VertexAI Text Embeddings

Vertex AI supports two types of embedding models: text and multimodal. This document describes how to create text embeddings using the Vertex AI Text embeddings API.

The Vertex AI text embeddings API uses dense vector representations. Unlike sparse vectors, which tend to map words directly to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can better search for passages that align to the meaning of the query, even if the passages don’t use the same language.
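To make the sparse versus dense distinction concrete, here is a minimal sketch in plain Java. It compares two sentences once by shared words (a stand-in for sparse matching) and once by cosine similarity over dense vectors. The class name, the sentences, and the 4-dimensional vectors are all hypothetical values chosen for illustration; they are not output from a Vertex AI model, which returns vectors with far more dimensions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DenseVsSparse {

    // Sparse-style comparison: share of distinct words that the two texts have in common.
    static double wordOverlap(String a, String b) {
        Set<String> wordsA = new HashSet<>(Arrays.asList(a.toLowerCase().split("\\s+")));
        Set<String> wordsB = new HashSet<>(Arrays.asList(b.toLowerCase().split("\\s+")));
        Set<String> shared = new HashSet<>(wordsA);
        shared.retainAll(wordsB);
        Set<String> union = new HashSet<>(wordsA);
        union.addAll(wordsB);
        return union.isEmpty() ? 0.0 : (double) shared.size() / union.size();
    }

    // Dense comparison: cosine similarity between two embedding vectors.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        String query = "how do I reset my password";
        String passage = "steps to recover account credentials";

        // Hypothetical embeddings, invented for this example.
        float[] queryVector = {0.8f, 0.1f, 0.5f, 0.2f};
        float[] passageVector = {0.7f, 0.2f, 0.6f, 0.1f};

        System.out.println("Word overlap:      " + wordOverlap(query, passage));
        System.out.println("Cosine similarity: " + cosineSimilarity(queryVector, passageVector));
    }
}
```

The two sentences share no words, so the word-overlap score is zero, while the made-up dense vectors point in nearly the same direction. That is the situation in which embedding-based search can find a passage that keyword matching misses.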

Prerequisites

  • Install the gcloud CLI appropriate for your OS.

  • Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.

gcloud config set project <PROJECT_ID> &&
gcloud auth application-default login <ACCOUNT>

Add Repositories and BOM

Spring AI artifacts are published in Maven Central and Spring Snapshot repositories. Refer to the Artifact Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.

Auto-configuration

The artifact names of the Spring AI auto-configuration and starter modules have changed significantly. Please refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the VertexAI Embedding Model. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
</dependency>

Or add the following to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-vertex-ai-embedding'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Embedding Properties

The prefix spring.ai.vertex.ai.embedding is the property prefix that lets you connect to the VertexAI Embedding API.

Property | Description | Default
--- | --- | ---
spring.ai.vertex.ai.embedding.project-id | Google Cloud Platform project ID | -
spring.ai.vertex.ai.embedding.location | Region | -
spring.ai.vertex.ai.embedding.apiEndpoint | Vertex AI Embedding API endpoint. | -

Enabling and disabling the embedding auto-configurations is now done through top-level properties with the prefix spring.ai.model.embedding.

To enable, set spring.ai.model.embedding.text=vertexai (enabled by default).

To disable, set spring.ai.model.embedding.text=none (or any value that doesn’t match vertexai).

This change allows multiple models to be configured.

The prefix spring.ai.vertex.ai.embedding.text is the property prefix that lets you configure the embedding model implementation for VertexAI Text Embedding.

Property | Description | Default
--- | --- | ---
spring.ai.vertex.ai.embedding.text.enabled (Removed and no longer valid) | Enable Vertex AI Embedding API model. | true
spring.ai.model.embedding.text | Enable Vertex AI Embedding API model. | vertexai
spring.ai.vertex.ai.embedding.text.options.model | This is the Vertex Text Embedding model to use | text-embedding-004
spring.ai.vertex.ai.embedding.text.options.task-type | The intended downstream application to help the model produce better quality embeddings. Available task-types | RETRIEVAL_DOCUMENT
spring.ai.vertex.ai.embedding.text.options.title | Optional title, only valid with task_type=RETRIEVAL_DOCUMENT. | -
spring.ai.vertex.ai.embedding.text.options.dimensions | The number of dimensions the resulting output embeddings should have. Supported for model version 004 and later. You can use this parameter to reduce the embedding size, for example, for storage optimization. | -
spring.ai.vertex.ai.embedding.text.options.auto-truncate | When set to true, input text will be truncated. When set to false, an error is returned if the input text is longer than the maximum length supported by the model. | true

Sample Controller

Create a new Spring Boot project and add the spring-ai-starter-model-vertex-ai-embedding to your pom (or gradle) dependencies.

Add an application.properties file under the src/main/resources directory to enable and configure the VertexAi embedding model:

spring.ai.vertex.ai.embedding.project-id=<YOUR_PROJECT_ID>
spring.ai.vertex.ai.embedding.location=<YOUR_PROJECT_LOCATION>
spring.ai.vertex.ai.embedding.text.options.model=text-embedding-004

This will create a VertexAiTextEmbeddingModel implementation that you can inject into your class. Here is an example of a simple @RestController class that uses the embedding model to generate embeddings.

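import java.util.List;
import java.util.Map;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class EmbeddingController {

    private final EmbeddingModel embeddingModel;

    // The auto-configured VertexAiTextEmbeddingModel is injected here.
    @Autowired
    public EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Embeds the "message" request parameter and returns the full response.
    @GetMapping("/ai/embedding")
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}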
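
To try the endpoint without a browser, a small client along the following lines could be used. It is a minimal sketch using only the JDK HTTP client. The class name EmbeddingClient is hypothetical, and the base URL assumes the application runs locally on Spring Boot's default port 8080; adjust both to your setup.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class EmbeddingClient {

    // Builds a GET request for the /ai/embedding endpoint, URL-encoding the message.
    static HttpRequest buildRequest(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/ai/embedding?message=" + encoded))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: the sample application is running on localhost:8080.
        HttpRequest request = buildRequest("http://localhost:8080", "Tell me a joke");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Reached when the application is not running or not reachable.
            System.out.println("Could not reach the application: " + e);
        }
    }
}
```

The response body is the JSON serialization of the map returned by the controller, with the embedding data under the "embedding" key.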

Manual Configuration

The VertexAiTextEmbeddingModel implements the EmbeddingModel interface.

Add the spring-ai-vertex-ai-embedding dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-embedding</artifactId>
</dependency>

Or add the following to your Gradle build.gradle build file:

dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-embedding'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a VertexAiTextEmbeddingModel and use it to generate text embeddings:

// The project ID and location are read from environment variables.
VertexAiEmbeddingConnectionDetails connectionDetails =
    VertexAiEmbeddingConnectionDetails.builder()
        .projectId(System.getenv("VERTEX_AI_GEMINI_PROJECT_ID"))
        .location(System.getenv("VERTEX_AI_GEMINI_LOCATION"))
        .build();

VertexAiTextEmbeddingOptions options = VertexAiTextEmbeddingOptions.builder()
    .model(VertexAiTextEmbeddingOptions.DEFAULT_MODEL_NAME)
    .build();

var embeddingModel = new VertexAiTextEmbeddingModel(connectionDetails, options);

// One embedding is returned per input text.
EmbeddingResponse embeddingResponse = embeddingModel
    .embedForResponse(List.of("Hello World", "World is big and salvation is near"));
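
Once you have one vector per input text, a common next step is to rank texts by similarity to a query vector. The following is a minimal sketch of that step in plain Java, assuming each embedding has already been extracted as a float[]. The class name EmbeddingRanker and the 3-dimensional vectors are hypothetical; real vectors would come from the embedding response.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EmbeddingRanker {

    // Cosine similarity between two vectors of equal length.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns candidate indices ordered from most to least similar to the query.
    static List<Integer> rank(float[] query, List<float[]> candidates) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < candidates.size(); i++) {
            order.add(i);
        }
        order.sort(Comparator
                .comparingDouble((Integer i) -> cosineSimilarity(query, candidates.get(i)))
                .reversed());
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical vectors, invented for this example.
        float[] query = {1.0f, 0.0f, 0.0f};
        List<float[]> candidates = List.of(
                new float[] {0.0f, 1.0f, 0.0f},  // orthogonal to the query
                new float[] {0.9f, 0.1f, 0.0f},  // nearly parallel to the query
                new float[] {0.5f, 0.5f, 0.0f}); // in between

        System.out.println(rank(query, candidates)); // prints [1, 2, 0]
    }
}
```

For larger collections, a vector store is usually a better fit than an in-memory sort; this sketch only illustrates the comparison itself.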

Load credentials from a Google Service Account

To programmatically load the GoogleCredentials from a Service Account json file, you can use the following:

// Replace <INPUT_STREAM_TO_CREDENTIALS_JSON> with an InputStream over the service account JSON file.
GoogleCredentials credentials = GoogleCredentials.fromStream(<INPUT_STREAM_TO_CREDENTIALS_JSON>)
        .createScoped("https://www.googleapis.com/auth/cloud-platform");
credentials.refreshIfExpired();

// `endpoint` holds the Vertex AI API endpoint and must be defined before this point.
VertexAiEmbeddingConnectionDetails connectionDetails =
    VertexAiEmbeddingConnectionDetails.builder()
        .projectId(System.getenv("VERTEX_AI_GEMINI_PROJECT_ID"))
        .location(System.getenv("VERTEX_AI_GEMINI_LOCATION"))
        .apiEndpoint(endpoint)
        .predictionServiceSettings(
            PredictionServiceSettings.newBuilder()
                .setEndpoint(endpoint)
                .setCredentialsProvider(FixedCredentialsProvider.create(credentials))
                .build())
        .build();