Chat Model API

The Chat Model API offers developers the ability to integrate AI-powered chat completion capabilities into their applications. It leverages pre-trained language models, such as GPT (Generative Pre-trained Transformer), to generate human-like responses to user inputs in natural language.

The API typically works by sending a prompt or partial conversation to the AI model, which then generates a completion or continuation of the conversation based on its training data and understanding of natural language patterns. The completed response is then returned to the application, which can present it to the user or use it for further processing.

The Spring AI Chat Model API is designed to be a simple and portable interface for interacting with various AI Models, allowing developers to switch between different models with minimal code changes. This design aligns with Spring’s philosophy of modularity and interchangeability.

With companion classes such as Prompt for input encapsulation and ChatResponse for output handling, the Chat Model API unifies communication with AI Models. It handles request preparation and response parsing, offering a direct and simplified API.

You can find more about the available implementations in the Available Implementations section, and a detailed comparison in the Chat Models Comparison section.

API Overview

This section provides a guide to the Spring AI Chat Model API interface and associated classes.

ChatModel

Here is the ChatModel interface definition:

public interface ChatModel extends Model<Prompt, ChatResponse> {

    default String call(String message) {...}

    @Override
    ChatResponse call(Prompt prompt);
}

The call() method with a String parameter simplifies initial use, avoiding the complexities of the more sophisticated Prompt and ChatResponse classes. In real-world applications, it is more common to use the call() method that takes a Prompt instance and returns a ChatResponse.
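
The relationship between the two call() overloads can be sketched with simplified stand-in types. Every type below (SketchPrompt, SketchResponse, SketchChatModel, EchoModel) is hypothetical and is not a Spring AI class. The sketch also assumes that the String overload wraps the text in a prompt, delegates, and unwraps the first result, which the listing above elides.

```java
import java.util.List;

class CallOverloadSketch {

    public static void main(String[] args) {
        SketchChatModel model = new EchoModel();
        // Convenience overload: plain String in, plain String out.
        System.out.println(model.call("Hello"));
    }
}

// Simplified stand-ins for illustration only; these are NOT the Spring AI types.
record SketchPrompt(List<String> messages) {}

record SketchResponse(List<String> results) {}

interface SketchChatModel {

    // Full-featured entry point: structured request in, structured response out.
    SketchResponse call(SketchPrompt prompt);

    // Convenience overload (assumed behavior): wrap the text, delegate, unwrap.
    default String call(String message) {
        SketchResponse response = call(new SketchPrompt(List.of(message)));
        return response.results().isEmpty() ? "" : response.results().get(0);
    }
}

// Hypothetical model that echoes the last message back.
class EchoModel implements SketchChatModel {

    @Override
    public SketchResponse call(SketchPrompt prompt) {
        List<String> messages = prompt.messages();
        String last = messages.isEmpty() ? "" : messages.get(messages.size() - 1);
        return new SketchResponse(List.of("echo: " + last));
    }
}
```

An implementation only has to provide the Prompt-based method; the String overload comes for free, which is why it suits quick experiments while the Prompt-based method carries the full request.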

StreamingChatModel

Here is the StreamingChatModel interface definition:

public interface StreamingChatModel extends StreamingModel<Prompt, ChatResponse> {

    default Flux<String> stream(String message) {...}

    @Override
    Flux<ChatResponse> stream(Prompt prompt);
}

The stream() method takes a String or Prompt parameter, like the call() method of ChatModel, but it streams the responses using the reactive Flux API.

Prompt

The Prompt is a ModelRequest that encapsulates a list of Message objects and optional model request options. The following listing shows a truncated version of the Prompt class, excluding constructors and other utility methods:

public class Prompt implements ModelRequest<List<Message>> {

    private final List<Message> messages;

    private ChatOptions modelOptions;

    @Override
    public ChatOptions getOptions() {...}

    @Override
    public List<Message> getInstructions() {...}

    // constructors and utility methods omitted
}

Message

The Message interface encapsulates the textual content of a prompt, a collection of metadata attributes, and a categorization known as MessageType.

The interface is defined as follows:

public interface Content {

	String getText();

	Map<String, Object> getMetadata();
}

public interface Message extends Content {

	MessageType getMessageType();
}

The multimodal message types also implement the MediaContent interface, which provides a list of Media content objects.

public interface MediaContent extends Content {

	Collection<Media> getMedia();

}

The Message interface has various implementations that correspond to the categories of messages that an AI model can process.

Chat completion endpoints distinguish between message categories based on conversational roles, and MessageType maps these roles.

For instance, OpenAI recognizes message categories for distinct conversational roles such as system, user, function, or assistant.

While the term MessageType might imply a specific message format, in this context it effectively designates the role a message plays in the dialogue.

For AI models that do not use specific roles, the UserMessage implementation acts as a standard category, typically representing user-generated inquiries or instructions. To understand the practical application and the relationship between Prompt and Message, especially in the context of these roles or message categories, see the detailed explanations in the Prompts section.
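
The idea that a message type designates a conversational role can be sketched as follows. SketchMessageType and SketchMessage are hypothetical stand-ins, not the Spring AI Message or MessageType types, and the role-prefixed rendering is only an illustrative format, not any provider's wire format.

```java
import java.util.List;
import java.util.Locale;

class MessageRoleSketch {

    // Renders each message as "<role>: <text>" so the role of every entry is visible.
    static List<String> render(List<SketchMessage> conversation) {
        return conversation.stream()
                .map(m -> m.type().name().toLowerCase(Locale.ROOT) + ": " + m.text())
                .toList();
    }

    public static void main(String[] args) {
        List<SketchMessage> conversation = List.of(
                new SketchMessage(SketchMessageType.SYSTEM, "You are a helpful assistant."),
                new SketchMessage(SketchMessageType.USER, "Tell me a joke."));
        render(conversation).forEach(System.out::println);
    }
}

// Hypothetical role categories, loosely modeled on the roles named in the text.
enum SketchMessageType { SYSTEM, USER, ASSISTANT }

// Hypothetical message: a role plus its textual content.
record SketchMessage(SketchMessageType type, String text) {}
```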

Chat Options

ChatOptions represents the options that can be passed to the AI model. It is an interface that extends ModelOptions and defines a small set of portable options. It is defined as follows:

public interface ChatOptions extends ModelOptions {

	String getModel();
	Float getFrequencyPenalty();
	Integer getMaxTokens();
	Float getPresencePenalty();
	List<String> getStopSequences();
	Float getTemperature();
	Integer getTopK();
	Float getTopP();
	ChatOptions copy();

}

Additionally, every model-specific ChatModel/StreamingChatModel implementation can have its own options that can be passed to the AI model. For example, the OpenAI Chat Completion model has its own options such as logitBias, seed, and user.

This is a powerful feature that allows developers to use model-specific options when starting the application and then override them at runtime using the Prompt request.

Spring AI provides a sophisticated system for configuring and using Chat Models. It allows for default configurations to be set at start-up, while also providing the flexibility to override these settings on a per-request basis. This approach enables developers to easily work with different AI models and adjust parameters as needed, all within a consistent interface provided by the Spring AI framework.

The following flow diagram illustrates how Spring AI handles the configuration and execution of Chat Models, combining start-up and runtime options:

Start-up Configuration - The ChatModel/StreamingChatModel is initialized with "Start-Up" Chat Options. These options are set during the ChatModel initialization and are meant to provide default configurations.
Runtime Configuration - For each request, the Prompt can contain Runtime Chat Options, which can override the start-up options.
Option Merging Process - The "Merge Options" step combines the start-up and runtime options. If runtime options are provided, they take precedence over the start-up options.
Input Processing - The "Convert Input" step transforms the input instructions into native, model-specific formats.
Output Processing - The "Convert Output" step transforms the model’s response into a standardized ChatResponse format.

The separation of start-up and runtime options allows for both global configurations and request-specific adjustments.
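
The precedence rule in the merge step can be sketched with a small stand-in. SketchOptions is a hypothetical record, not the Spring AI ChatOptions type, and treating null as "not set" is an assumption made for this illustration; it is not a description of how Spring AI implements the merge internally.

```java
class OptionMergeSketch {

    public static void main(String[] args) {
        // Defaults fixed when the (hypothetical) model is created.
        SketchOptions startUp = new SketchOptions("example-model", 0.7, 256);
        // Per-request options: only the temperature is overridden.
        SketchOptions runtime = new SketchOptions(null, 0.2, null);

        System.out.println(SketchOptions.merge(startUp, runtime));
    }
}

// Hypothetical, simplified options holder; a null field means "not set".
record SketchOptions(String model, Double temperature, Integer maxTokens) {

    // Runtime values take precedence when present; otherwise the start-up value is kept.
    static SketchOptions merge(SketchOptions startUp, SketchOptions runtime) {
        if (runtime == null) {
            return startUp;
        }
        return new SketchOptions(
                pick(runtime.model(), startUp.model()),
                pick(runtime.temperature(), startUp.temperature()),
                pick(runtime.maxTokens(), startUp.maxTokens()));
    }

    private static <T> T pick(T runtimeValue, T startUpValue) {
        return runtimeValue != null ? runtimeValue : startUpValue;
    }
}
```

In this run the merged options keep the start-up model and token limit and take the temperature of 0.2 from the request, which mirrors the global-default plus per-request-adjustment split described above.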

ChatResponse

The structure of the ChatResponse class is as follows:

public class ChatResponse implements ModelResponse<Generation> {

    private final ChatResponseMetadata chatResponseMetadata;
    private final List<Generation> generations;

    @Override
    public ChatResponseMetadata getMetadata() {...}

    @Override
    public List<Generation> getResults() {...}

    // other methods omitted
}

The ChatResponse class holds the AI Model’s output, with each Generation instance containing one of potentially multiple outputs resulting from a single prompt.

The ChatResponse class also carries ChatResponseMetadata, which holds metadata about the AI Model’s response.

Generation

Finally, the Generation class implements ModelResult to represent the model output (assistant message) and related metadata:

public class Generation implements ModelResult<AssistantMessage> {

    private final AssistantMessage assistantMessage;
    private ChatGenerationMetadata chatGenerationMetadata;

    @Override
    public AssistantMessage getOutput() {...}

    @Override
    public ChatGenerationMetadata getMetadata() {...}

    // other methods omitted
}
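
Reading a response follows the nesting described above: a response holds a list of generations, and each generation holds one assistant message. The sketch below uses hypothetical stand-in records (SketchChatResponse, SketchGeneration, SketchAssistantMessage), not the Spring AI classes, and the finishReason field is an assumed example of per-generation metadata.

```java
import java.util.List;

class ResponseReadingSketch {

    // Collects the text of every generation, in order.
    static List<String> texts(SketchChatResponse response) {
        return response.results().stream()
                .map(generation -> generation.output().text())
                .toList();
    }

    public static void main(String[] args) {
        SketchChatResponse response = new SketchChatResponse(List.of(
                new SketchGeneration(new SketchAssistantMessage("First answer"), "stop"),
                new SketchGeneration(new SketchAssistantMessage("Second answer"), "stop")));

        texts(response).forEach(System.out::println);
    }
}

// Simplified stand-ins for illustration only; these are NOT the Spring AI types.
record SketchAssistantMessage(String text) {}

record SketchGeneration(SketchAssistantMessage output, String finishReason) {}

record SketchChatResponse(List<SketchGeneration> results) {}
```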

Available Implementations

This diagram illustrates how the unified interfaces, ChatModel and StreamingChatModel, are used to interact with various AI chat models from different providers, allowing easy integration and switching between different AI services while maintaining a consistent API for the client application.

OpenAI Chat Completion (streaming, multi-modality & function-calling support)
Microsoft Azure Open AI Chat Completion (streaming & function-calling support)
Ollama Chat Completion (streaming, multi-modality & function-calling support)
Hugging Face Chat Completion (no streaming support)
Google Vertex AI Gemini Chat Completion (streaming, multi-modality & function-calling support)
Amazon Bedrock
Mistral AI Chat Completion (streaming & function-calling support)
Anthropic Chat Completion (streaming & function-calling support)

Find a detailed comparison of the available Chat Models in the Chat Models Comparison section.

Chat Model API

The Spring AI Chat Model API is built on top of the Spring AI Generic Model API and provides chat-specific abstractions and implementations. This allows easy integration and switching between different AI services while maintaining a consistent API for the client application. The following class diagram illustrates the main classes and interfaces of the Spring AI Chat Model API.