Google VertexAI Text Embeddings
Vertex AI supports two types of embedding models: text and multimodal. This document describes how to create a text embedding using the Vertex AI Text embeddings API.
The Vertex AI text embeddings API uses dense vector representations. Unlike sparse vectors, which tend to map words directly to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can search for passages that align with the meaning of the query, even if the passages don’t use the same language.
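To make "align with the meaning of the query" concrete, here is a minimal sketch of how two dense vectors are commonly compared with cosine similarity. It is not part of the Spring AI or Vertex AI API; the class name, method name, and the tiny hand-written vectors are hypothetical and chosen only for illustration. Real embeddings have hundreds of dimensions.

```java
/** Cosine similarity between dense embedding vectors (illustrative sketch only). */
final class CosineSimilarityExample {

    /** Returns the cosine of the angle between a and b; values near 1.0 mean similar direction. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("A zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical 3-dimensional vectors standing in for real embeddings.
        float[] query = {0.9f, 0.1f, 0.0f};
        float[] closePassage = {0.8f, 0.2f, 0.1f};
        float[] unrelatedPassage = {0.0f, 0.1f, 0.9f};
        System.out.println("close passage:     " + cosine(query, closePassage));
        System.out.println("unrelated passage: " + cosine(query, unrelatedPassage));
    }
}
```

A passage whose vector points in nearly the same direction as the query vector scores higher, regardless of whether the two texts share any words.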
Prerequisites
- Install the gcloud CLI appropriate for your OS.
- Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.
gcloud config set project <PROJECT_ID> &&
gcloud auth application-default login <ACCOUNT>
Add Repositories and BOM
Spring AI artifacts are published in Maven Central and Spring Snapshot repositories. Refer to the Artifact Repositories section to add these repositories to your build system.
To help with dependency management, Spring AI provides a BOM (bill of materials) to ensure that a consistent version of Spring AI is used throughout the entire project. Refer to the Dependency Management section to add the Spring AI BOM to your build system.
Auto-configuration
There has been a significant change in the artifact names of the Spring AI auto-configuration and starter modules. Please refer to the upgrade notes for more information.
Spring AI provides Spring Boot auto-configuration for the VertexAI Embedding Model. To enable it, add the following dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-starter-model-vertex-ai-embedding'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Embedding Properties
The prefix spring.ai.vertex.ai.embedding is used as the property prefix that lets you connect to the VertexAI Embedding API.
Property | Description | Default |
---|---|---|
spring.ai.vertex.ai.embedding.project-id | Google Cloud Platform project ID | - |
spring.ai.vertex.ai.embedding.location | Region | - |
spring.ai.vertex.ai.embedding.apiEndpoint | Vertex AI Embedding API endpoint. | - |
Enabling and disabling of the embedding auto-configurations is now configured via the top-level property spring.ai.model.embedding.text. To enable, set spring.ai.model.embedding.text=vertexai (it is enabled by default). To disable, set spring.ai.model.embedding.text=none (or any value that doesn’t match vertexai). This change allows multiple models to be configured.
The prefix spring.ai.vertex.ai.embedding.text is the property prefix that lets you configure the embedding model implementation for VertexAI Text Embedding.
Property | Description | Default |
---|---|---|
spring.ai.vertex.ai.embedding.text.enabled (Removed and no longer valid) | Enable Vertex AI Embedding API model. | true |
spring.ai.model.embedding.text | Enable Vertex AI Embedding API model. | vertexai |
spring.ai.vertex.ai.embedding.text.options.model | The Vertex Text Embedding model to use. | text-embedding-004 |
spring.ai.vertex.ai.embedding.text.options.task-type | The intended downstream application, which helps the model produce better quality embeddings. Available task-types | |
spring.ai.vertex.ai.embedding.text.options.title | Optional title, only valid with task_type=RETRIEVAL_DOCUMENT. | - |
spring.ai.vertex.ai.embedding.text.options.dimensions | The number of dimensions the resulting output embeddings should have. Supported for model version 004 and later. You can use this parameter to reduce the embedding size, for example, for storage optimization. | - |
spring.ai.vertex.ai.embedding.text.options.auto-truncate | When set to true, input text will be truncated. When set to false, an error is returned if the input text is longer than the maximum length supported by the model. | true |
Sample Controller
Create a new Spring Boot project and add the spring-ai-starter-model-vertex-ai-embedding starter to your pom (or gradle) dependencies.
Add an application.properties file, under the src/main/resources directory, to enable and configure the VertexAI embedding model:
spring.ai.vertex.ai.embedding.project-id=<YOUR_PROJECT_ID>
spring.ai.vertex.ai.embedding.location=<YOUR_PROJECT_LOCATION>
spring.ai.vertex.ai.embedding.text.options.model=text-embedding-004
This will create a VertexAiTextEmbeddingModel implementation that you can inject into your class. Here is an example of a simple @RestController class that uses the embedding model to generate embeddings.
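import java.util.List;
import java.util.Map;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class EmbeddingController {
    private final EmbeddingModel embeddingModel;

    @Autowired
    public EmbeddingController(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Embeds the "message" request parameter and returns the full embedding response.
    @GetMapping("/ai/embedding")
    public Map<String, EmbeddingResponse> embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        EmbeddingResponse embeddingResponse = this.embeddingModel.embedForResponse(List.of(message));
        return Map.of("embedding", embeddingResponse);
    }
}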
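One way to try the controller from a separate program is sketched below, using only the JDK's built-in HTTP client. The class name EmbeddingEndpointClient, the helper embeddingUri, and the base URL http://localhost:8080 are assumptions made for illustration; only the /ai/embedding path and the message parameter come from the controller above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for the /ai/embedding endpoint (illustrative sketch only). */
final class EmbeddingEndpointClient {

    /** Builds the GET URI for /ai/embedding with the message URL-encoded. */
    static URI embeddingUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/embedding?message=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the Spring Boot application is running locally on port 8080.
        URI uri = embeddingUri("http://localhost:8080", "Hello World");
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Print the HTTP status and the JSON body returned by the controller.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Encoding the message keeps spaces and characters such as & from corrupting the query string.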
Manual Configuration
The VertexAiTextEmbeddingModel implements the EmbeddingModel interface.
Add the spring-ai-vertex-ai-embedding dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-embedding</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-embedding'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Next, create a VertexAiTextEmbeddingModel and use it to generate text embeddings:
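import java.util.List;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingConnectionDetails;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingModel;
import org.springframework.ai.vertexai.embedding.text.VertexAiTextEmbeddingOptions;

// Replace each <...> placeholder with the name of the environment variable holding that value.
VertexAiEmbeddingConnectionDetails connectionDetails =
    VertexAiEmbeddingConnectionDetails.builder()
        .projectId(System.getenv(<VERTEX_AI_GEMINI_PROJECT_ID>))
        .location(System.getenv(<VERTEX_AI_GEMINI_LOCATION>))
        .build();

VertexAiTextEmbeddingOptions options = VertexAiTextEmbeddingOptions.builder()
    .model(VertexAiTextEmbeddingOptions.DEFAULT_MODEL_NAME)
    .build();

var embeddingModel = new VertexAiTextEmbeddingModel(connectionDetails, options);

// Embed two texts in a single call.
EmbeddingResponse embeddingResponse = embeddingModel
    .embedForResponse(List.of("Hello World", "World is big and salvation is near"));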
Load credentials from a Google Service Account
To programmatically load the GoogleCredentials from a Service Account JSON file, you can use the following:
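import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.aiplatform.v1.PredictionServiceSettings;
import org.springframework.ai.vertexai.embedding.VertexAiEmbeddingConnectionDetails;

// Load the service account key and scope it to the Google Cloud Platform APIs.
GoogleCredentials credentials = GoogleCredentials.fromStream(<INPUT_STREAM_TO_CREDENTIALS_JSON>)
    .createScoped("https://www.googleapis.com/auth/cloud-platform");
credentials.refreshIfExpired();

// `endpoint` is not defined in this snippet; it is assumed to hold the Vertex AI API endpoint as a String.
VertexAiEmbeddingConnectionDetails connectionDetails =
    VertexAiEmbeddingConnectionDetails.builder()
        .projectId(System.getenv(<VERTEX_AI_GEMINI_PROJECT_ID>))
        .location(System.getenv(<VERTEX_AI_GEMINI_LOCATION>))
        .apiEndpoint(endpoint)
        .predictionServiceSettings(
            PredictionServiceSettings.newBuilder()
                .setEndpoint(endpoint)
                .setCredentialsProvider(FixedCredentialsProvider.create(credentials))
                .build())
        .build();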