Google VertexAI Multimodal Embeddings

EXPERIMENTAL. Used for experimental purposes only. Not yet compatible with VectorStores.

Vertex AI supports two types of embeddings models, text and multimodal. This document describes how to create a multimodal embedding using the Vertex AI Multimodal embeddings API.

The multimodal embeddings model generates 1408-dimension vectors based on the input you provide, which can include a combination of image, text, and video data. The embedding vectors can then be used for subsequent tasks like image classification or video content moderation.

The image embedding vector and text embedding vector are in the same semantic space with the same dimensionality. Consequently, these vectors can be used interchangeably for use cases like searching image by text, or searching video by image.
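Because text, image, and video embeddings share the same semantic space and dimensionality, ranking one modality against another reduces to a plain vector-similarity computation. A minimal sketch of the usual cosine similarity (this helper is illustrative, not part of Spring AI; the 3-element vectors below stand in for real 1408-element embeddings):

```java
public class CosineSimilarity {

    // Cosine similarity between two embedding vectors of equal length:
    // dot(a, b) / (|a| * |b|), in [-1, 1] for non-zero vectors.
    public static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy vectors standing in for a text embedding and an image embedding.
        float[] textVec = {1f, 0f, 0f};
        float[] imageVec = {0.8f, 0.6f, 0f};
        System.out.printf("similarity = %.2f%n", cosine(textVec, imageVec));
    }
}
```

For a text-to-image search, you would compute this similarity between the query text's embedding and each candidate image's embedding, then sort descending.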

The VertexAI Multimodal API imposes the following limits.

For text-only embedding use cases, we recommend using the Vertex AI text-embeddings model instead.

Prerequisites

  • Install the gcloud CLI, appropriate for your OS.

  • Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.

gcloud config set project <PROJECT_ID> &&
gcloud auth application-default login <ACCOUNT>

Add Repositories and BOM

Spring AI artifacts are published in Maven Central and Spring Snapshot repositories. Refer to the Artifact Repositories section to add these repositories to your build system.

To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.
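As a sketch, importing the BOM in Maven usually looks like the following `dependencyManagement` entry (the `spring-ai.version` property is a placeholder for the release you actually use):

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
```

With the BOM imported, the starter dependencies shown below can omit their version element.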

Auto-configuration

There has been a significant change in the artifact names of the Spring AI auto-configuration and starter modules. Please refer to the upgrade notes for more information.

Spring AI provides Spring Boot auto-configuration for the VertexAI Embedding Model. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-vertex-ai-embedding'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Embedding Properties

The prefix spring.ai.vertex.ai.embedding is used as the property prefix that lets you connect to the VertexAI Embedding API.

  • spring.ai.vertex.ai.embedding.project-id: Google Cloud Platform project ID. Default: -

  • spring.ai.vertex.ai.embedding.location: Region. Default: -

  • spring.ai.vertex.ai.embedding.apiEndpoint: Vertex AI Embedding API endpoint. Default: -
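As a sketch, the connection properties might look like this in application.properties (the project ID and region below are placeholder values, not defaults):

```properties
spring.ai.vertex.ai.embedding.project-id=my-gcp-project
spring.ai.vertex.ai.embedding.location=us-central1
```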

Enabling and disabling of the embedding auto-configurations are now configured via top level properties with the prefix spring.ai.model.embedding.

To enable it, set spring.ai.model.embedding.multimodal=vertexai (it is enabled by default).

To disable it, set spring.ai.model.embedding.multimodal=none (or any value that does not match vertexai).

This change was made to allow configuration of multiple models.
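For instance, in application.properties (using only the property values described above):

```properties
# Enable the VertexAI multimodal embedding auto-configuration (the default).
spring.ai.model.embedding.multimodal=vertexai

# Or disable it:
# spring.ai.model.embedding.multimodal=none
```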

The prefix spring.ai.vertex.ai.embedding.multimodal is the property prefix that lets you configure the embedding model implementation for VertexAI Multimodal Embedding.

  • spring.ai.vertex.ai.embedding.multimodal.enabled (removed and no longer valid): Enable the Vertex AI Embedding API model. Default: true

  • spring.ai.model.embedding.multimodal=vertexai: Enable the Vertex AI Embedding API model. Default: vertexai

  • spring.ai.vertex.ai.embedding.multimodal.options.model: The model used to generate multimodal embeddings. Default: multimodalembedding@001

  • spring.ai.vertex.ai.embedding.multimodal.options.dimensions: Specify lower-dimension embeddings. By default, an embedding request returns a 1408-float vector for a data type. You can also specify lower-dimension embeddings (128-, 256-, or 512-float vectors) for text and image data. Default: 1408

  • spring.ai.vertex.ai.embedding.multimodal.options.video-start-offset-sec: The start offset of the video segment in seconds. If not specified, it is calculated as max(0, endOffsetSec - 120). Default: -

  • spring.ai.vertex.ai.embedding.multimodal.options.video-end-offset-sec: The end offset of the video segment in seconds. If not specified, it is calculated as min(video length, startOffsetSec + 120). If both startOffsetSec and endOffsetSec are specified, endOffsetSec is adjusted to min(startOffsetSec + 120, endOffsetSec). Default: -

  • spring.ai.vertex.ai.embedding.multimodal.options.video-interval-sec: The interval of the video over which embeddings are generated. The minimum value is 4; if the interval is less than 4, an InvalidArgumentError is returned. There is no upper limit on the interval, but an interval larger than min(video length, 120s) degrades the quality of the generated embeddings. Default value: 16. Default: -
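As an illustration, overriding these options in application.properties might look like the following (the dimension and interval values are example choices from the documented ranges, not recommendations):

```properties
spring.ai.vertex.ai.embedding.multimodal.options.model=multimodalembedding@001
spring.ai.vertex.ai.embedding.multimodal.options.dimensions=512
spring.ai.vertex.ai.embedding.multimodal.options.video-interval-sec=16
```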

Manual Configuration

The VertexAiMultimodalEmbeddingModel implements the DocumentEmbeddingModel.

Add the spring-ai-vertex-ai-embedding dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-embedding</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-embedding'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a VertexAiMultimodalEmbeddingModel and use it for embedding generation:

VertexAiEmbeddingConnectionDetails connectionDetails =
    VertexAiEmbeddingConnectionDetails.builder()
        .projectId(System.getenv("VERTEX_AI_GEMINI_PROJECT_ID"))
        .location(System.getenv("VERTEX_AI_GEMINI_LOCATION"))
        .build();

VertexAiMultimodalEmbeddingOptions options = VertexAiMultimodalEmbeddingOptions.builder()
    .model(VertexAiMultimodalEmbeddingOptions.DEFAULT_MODEL_NAME)
    .build();

var embeddingModel = new VertexAiMultimodalEmbeddingModel(connectionDetails, options);

Media imageMedia = new Media(MimeTypeUtils.IMAGE_PNG, new ClassPathResource("/test.image.png"));
Media videoMedia = new Media(new MimeType("video", "mp4"), new ClassPathResource("/test.video.mp4"));

var document = new Document("Explain what do you see on this video?", List.of(imageMedia, videoMedia), Map.of());

DocumentEmbeddingRequest embeddingRequest = new DocumentEmbeddingRequest(List.of(document),
        EmbeddingOptions.EMPTY);

EmbeddingResponse embeddingResponse = embeddingModel.call(embeddingRequest);

// One embedding is returned per modality: text, image, and video.
assertThat(embeddingResponse.getResults()).hasSize(3);