Chat Client API

The ChatClient offers a fluent API for communicating with an AI Model. It supports both a synchronous and streaming programming model.

See the Implementation Notes at the bottom of this document related to the combined use of imperative and reactive programming models in ChatClient.

The fluent API has methods for building up the constituent parts of a Prompt that is passed to the AI model as input. The Prompt contains the instructional text to guide the AI model’s output and behavior. From the API point of view, prompts consist of a collection of messages.

The AI model processes two main types of messages: user messages, which are direct inputs from the user, and system messages, which are generated by the system to guide the conversation.

These messages often contain placeholders that are substituted at runtime based on user input to customize the response of the AI model to the user input.

There are also Prompt options that can be specified, such as the name of the AI Model to use and the temperature setting that controls the randomness or creativity of the generated output.
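
As a preview of the fluent API described in the rest of this document, a minimal sketch of supplying such options for a single request might look like the following. The chatClient variable is assumed to be an already-built ChatClient, the OpenAiChatOptions builder calls mirror those used in later examples, and the model name and temperature values are illustrative:

String answer = chatClient.prompt()
    .user("Write a haiku about spring")
    // Per-request options: model name and temperature
    .options(OpenAiChatOptions.builder()
        .model("gpt-4")
        .temperature(0.7)
        .build())
    .call()
    .content();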

Creating a ChatClient

The ChatClient is created using a ChatClient.Builder object. You can obtain an autoconfigured ChatClient.Builder instance from the Spring Boot autoconfiguration for any ChatModel, or create one programmatically.

Using an autoconfigured ChatClient.Builder

In the simplest use case, Spring AI provides Spring Boot autoconfiguration, creating a prototype ChatClient.Builder bean for you to inject into your class. Here is a simple example of retrieving a String response to a simple user request.

@RestController
class MyController {

    private final ChatClient chatClient;

    public MyController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    @GetMapping("/ai")
    String generation(String userInput) {
        return this.chatClient.prompt()
            .user(userInput)
            .call()
            .content();
    }
}

In this simple example, the user input sets the contents of the user message. The call() method sends a request to the AI model, and the content() method returns the AI model’s response as a String.

Working with Multiple Chat Models

There are several scenarios where you might need to work with multiple chat models in a single application:

  • Using different models for different types of tasks (e.g., a powerful model for complex reasoning and a faster, cheaper model for simpler tasks)

  • Implementing fallback mechanisms when one model service is unavailable

  • A/B testing different models or configurations

  • Providing users with a choice of models based on their preferences

  • Combining specialized models (one for code generation, another for creative content, etc.)

By default, Spring AI autoconfigures a single ChatClient.Builder bean. However, you may need to work with multiple chat models in your application. Here’s how to handle this scenario:

In all cases, you need to disable the ChatClient.Builder autoconfiguration by setting the property spring.ai.chat.client.enabled=false.

This allows you to create multiple ChatClient instances manually.
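
For example, in application.properties:

spring.ai.chat.client.enabled=false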

Multiple ChatClients with a Single Model Type

This section covers a common use case where you need to create multiple ChatClient instances that all use the same underlying model type but with different configurations.

// Create ChatClient instances programmatically
ChatModel myChatModel = ... // already autoconfigured by Spring Boot
ChatClient chatClient = ChatClient.create(myChatModel);

// Or use the builder for more control
ChatClient.Builder builder = ChatClient.builder(myChatModel);
ChatClient customChatClient = builder
    .defaultSystem("You are a helpful assistant.")
    .build();

ChatClients for Different Model Types

When working with multiple AI models, you can define separate ChatClient beans for each model:

import org.springframework.ai.anthropic.AnthropicChatModel;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ChatClientConfig {

    @Bean
    public ChatClient openAiChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }

    @Bean
    public ChatClient anthropicChatClient(AnthropicChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

You can then inject these beans into your application components using the @Qualifier annotation:

@Configuration
public class ChatClientExample {

    @Bean
    CommandLineRunner cli(
            @Qualifier("openAiChatClient") ChatClient openAiChatClient,
            @Qualifier("anthropicChatClient") ChatClient anthropicChatClient) {

        return args -> {
            var scanner = new Scanner(System.in);
            ChatClient chat;

            // Model selection
            System.out.println("\nSelect your AI model:");
            System.out.println("1. OpenAI");
            System.out.println("2. Anthropic");
            System.out.print("Enter your choice (1 or 2): ");

            String choice = scanner.nextLine().trim();

            if (choice.equals("1")) {
                chat = openAiChatClient;
                System.out.println("Using OpenAI model");
            } else {
                chat = anthropicChatClient;
                System.out.println("Using Anthropic model");
            }

            // Use the selected chat client
            System.out.print("\nEnter your question: ");
            String input = scanner.nextLine();
            String response = chat.prompt(input).call().content();
            System.out.println("ASSISTANT: " + response);

            scanner.close();
        };
    }
}

Multiple OpenAI-Compatible API Endpoints

The OpenAiApi and OpenAiChatModel classes provide a mutate() method that allows you to create variations of existing instances with different properties. This is particularly useful when you need to work with multiple OpenAI-compatible APIs.

@Service
public class MultiModelService {

    private static final Logger logger = LoggerFactory.getLogger(MultiModelService.class);

    @Autowired
    private OpenAiChatModel baseChatModel;

    @Autowired
    private OpenAiApi baseOpenAiApi;

    public void multiClientFlow() {
        try {
            // Derive a new OpenAiApi for Groq (Llama3)
            OpenAiApi groqApi = baseOpenAiApi.mutate()
                .baseUrl("https://api.groq.com/openai")
                .apiKey(System.getenv("GROQ_API_KEY"))
                .build();

            // Derive a new OpenAiApi for OpenAI GPT-4
            OpenAiApi gpt4Api = baseOpenAiApi.mutate()
                .baseUrl("https://api.openai.com")
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();

            // Derive a new OpenAiChatModel for Groq
            OpenAiChatModel groqModel = baseChatModel.mutate()
                .openAiApi(groqApi)
                .defaultOptions(OpenAiChatOptions.builder().model("llama3-70b-8192").temperature(0.5).build())
                .build();

            // Derive a new OpenAiChatModel for GPT-4
            OpenAiChatModel gpt4Model = baseChatModel.mutate()
                .openAiApi(gpt4Api)
                .defaultOptions(OpenAiChatOptions.builder().model("gpt-4").temperature(0.7).build())
                .build();

            // Simple prompt for both models
            String prompt = "What is the capital of France?";

            String groqResponse = ChatClient.builder(groqModel).build().prompt(prompt).call().content();
            String gpt4Response = ChatClient.builder(gpt4Model).build().prompt(prompt).call().content();

            logger.info("Groq (Llama3) response: {}", groqResponse);
            logger.info("OpenAI GPT-4 response: {}", gpt4Response);
        }
        catch (Exception e) {
            logger.error("Error in multi-client flow", e);
        }
    }
}

ChatClient Fluent API

The ChatClient fluent API allows you to create a prompt in three distinct ways using an overloaded prompt method to initiate the fluent API:

  • prompt(): This method with no arguments lets you start using the fluent API, allowing you to build up user, system, and other parts of the prompt.

  • prompt(Prompt prompt): This method accepts a Prompt argument, letting you pass in a Prompt instance that you have created using the Prompt’s non-fluent APIs.

  • prompt(String content): This is a convenience method similar to the previous overload. It takes the user’s text content.
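
For illustration, a minimal sketch of the second and third overloads, assuming a chatClient built as shown earlier and the Prompt constructor that accepts a plain String:

// Pass a pre-built Prompt instance
String answer = chatClient.prompt(new Prompt("Tell me a joke"))
    .call()
    .content();

// Convenience overload that accepts the user text directly
String sameJoke = chatClient.prompt("Tell me a joke")
    .call()
    .content();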

ChatClient Responses

The ChatClient API offers several ways to format the response from the AI Model using the fluent API.

Returning a ChatResponse

The response from the AI model is a rich structure defined by the type ChatResponse. It includes metadata about how the response was generated and can also contain multiple responses, known as Generations, each with its own metadata. The metadata includes the number of tokens (each token is approximately 3/4 of a word) used to create the response. This information is important because hosted AI models charge based on the number of tokens used per request.

An example to return the ChatResponse object that contains the metadata is shown below by invoking chatResponse() after the call() method.

ChatResponse chatResponse = chatClient.prompt()
    .user("Tell me a joke")
    .call()
    .chatResponse();
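
From the returned ChatResponse you can then read the metadata. The following is a minimal sketch, assuming the getMetadata() and getUsage() accessors; the printed field is illustrative:

// Token usage reported by the model provider (useful for cost tracking)
var usage = chatResponse.getMetadata().getUsage();
System.out.println("Total tokens used: " + usage.getTotalTokens());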

Returning an Entity

You often want to return an entity class that is mapped from the returned String. The entity() method provides this functionality.

For example, given the Java record:

record ActorFilms(String actor, List<String> movies) {}

You can easily map the AI model’s output to this record using the entity() method, as shown below:

ActorFilms actorFilms = chatClient.prompt()
    .user("Generate the filmography for a random actor.")
    .call()
    .entity(ActorFilms.class);

There is also an overloaded entity method with the signature entity(ParameterizedTypeReference<T> type) that lets you specify types such as generic Lists:

List<ActorFilms> actorFilms = chatClient.prompt()
    .user("Generate the filmography of 5 movies for Tom Hanks and Bill Murray.")
    .call()
    .entity(new ParameterizedTypeReference<List<ActorFilms>>() {});

Streaming Responses

The stream() method lets you get an asynchronous response as shown below:

Flux<String> output = chatClient.prompt()
    .user("Tell me a joke")
    .stream()
    .content();

You can also stream the ChatResponse using the method Flux<ChatResponse> chatResponse().
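
For example:

Flux<ChatResponse> chatResponseFlux = chatClient.prompt()
    .user("Tell me a joke")
    .stream()
    .chatResponse();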

In the future, we will offer a convenience method that will let you return a Java entity with the reactive stream() method. In the meantime, you should use the Structured Output Converter to convert the aggregated response explicitly, as shown below. This also demonstrates the use of parameters in the fluent API, which is discussed in more detail in a later section of the documentation.

var converter = new BeanOutputConverter<>(new ParameterizedTypeReference<List<ActorFilms>>() {});

Flux<String> flux = this.chatClient.prompt()
    .user(u -> u.text("""
                        Generate the filmography for a random actor.
                        {format}
                      """)
            .param("format", converter.getFormat()))
    .stream()
    .content();

// Aggregate the streamed chunks into a single String before converting
String content = flux.collectList().block().stream().collect(Collectors.joining());

List<ActorFilms> actorFilms = converter.convert(content);

Prompt Templates

The ChatClient fluent API lets you provide user and system text as templates with variables that are replaced at runtime.

String answer = ChatClient.create(chatModel).prompt()
    .user(u -> u
            .text("Tell me the names of 5 movies whose soundtrack was composed by {composer}")
            .param("composer", "John Williams"))
    .call()
    .content();

Internally, the ChatClient uses the PromptTemplate class to handle the user and system text and replace the variables with the values provided at runtime relying on a given TemplateRenderer implementation. By default, Spring AI uses the StTemplateRenderer implementation, which is based on the open-source StringTemplate engine developed by Terence Parr.

Spring AI also provides a NoOpTemplateRenderer for cases where no template processing is desired.
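
For example, a minimal sketch that passes the prompt text through unchanged, assuming NoOpTemplateRenderer exposes a public no-argument constructor:

String answer = ChatClient.create(chatModel).prompt()
    .user("Return this text verbatim, including the braces: {not-a-variable}")
    .templateRenderer(new NoOpTemplateRenderer())
    .call()
    .content();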

The TemplateRenderer configured directly on the ChatClient (via .templateRenderer()) applies only to the prompt content defined directly in the ChatClient builder chain (e.g., via .user(), .system()). It does not affect templates used internally by Advisors like QuestionAnswerAdvisor, which have their own template customization mechanisms (see Custom Advisor Templates).

If you’d rather use a different template engine, you can provide a custom implementation of the TemplateRenderer interface directly to the ChatClient. You can also keep using the default StTemplateRenderer, but with a custom configuration.

For example, by default, template variables are identified by the {} syntax. If you’re planning to include JSON in your prompt, you might want to use a different syntax to avoid conflicts with JSON syntax. For example, you can use the < and > delimiters.

String answer = ChatClient.create(chatModel).prompt()
    .user(u -> u
            .text("Tell me the names of 5 movies whose soundtrack was composed by <composer>")
            .param("composer", "John Williams"))
    .templateRenderer(StTemplateRenderer.builder().startDelimiterToken('<').endDelimiterToken('>').build())
    .call()
    .content();

call() return values

After specifying the call() method on ChatClient, there are a few different options for the response type.

  • String content(): returns the String content of the response

  • ChatResponse chatResponse(): returns the ChatResponse object that contains multiple generations and also metadata about the response, for example how many tokens were used to create the response.

  • ChatClientResponse chatClientResponse(): returns a ChatClientResponse object that contains the ChatResponse object and the ChatClient execution context, giving you access to additional data used during the execution of advisors (e.g. the relevant documents retrieved in a RAG flow).

  • entity() to return a Java type

    • entity(ParameterizedTypeReference<T> type): used to return a Collection of entity types.

    • entity(Class<T> type): used to return a specific entity type.

    • entity(StructuredOutputConverter<T> structuredOutputConverter): used to specify an instance of a StructuredOutputConverter to convert a String to an entity type.

You can also invoke the stream() method instead of call().

stream() return values

After specifying the stream() method on ChatClient, there are a few options for the response type:

  • Flux<String> content(): Returns a Flux of the string being generated by the AI model.

  • Flux<ChatResponse> chatResponse(): Returns a Flux of the ChatResponse object, which contains additional metadata about the response.

  • Flux<ChatClientResponse> chatClientResponse(): returns a Flux of the ChatClientResponse object that contains the ChatResponse object and the ChatClient execution context, giving you access to additional data used during the execution of advisors (e.g. the relevant documents retrieved in a RAG flow).
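
For example, a minimal sketch of the last option:

Flux<ChatClientResponse> chatClientResponseFlux = chatClient.prompt()
    .user("Tell me a joke")
    .stream()
    .chatClientResponse();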

Using Defaults

Creating a ChatClient with a default system text in an @Configuration class simplifies runtime code. By setting defaults, you only need to specify the user text when calling ChatClient, eliminating the need to set a system text for each request in your runtime code path.

Default System Text

In the following example, we will configure the system text to always reply in a pirate’s voice. To avoid repeating the system text in runtime code, we will create a ChatClient instance in a @Configuration class.

@Configuration
class Config {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a friendly chat bot that answers questions in the voice of a Pirate")
                .build();
    }

}

and a @RestController to invoke it:

@RestController
class AIController {

	private final ChatClient chatClient;

	AIController(ChatClient chatClient) {
		this.chatClient = chatClient;
	}

	@GetMapping("/ai/simple")
	public Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
		return Map.of("completion", this.chatClient.prompt().user(message).call().content());
	}
}

When calling the application endpoint via curl, the result is:

❯ curl localhost:8080/ai/simple
{"completion":"Why did the pirate go to the comedy club? To hear some arrr-rated jokes! Arrr, matey!"}

Default System Text with parameters

In the following example, we will use a placeholder in the system text to specify the voice of the completion at runtime instead of design time.

@Configuration
class Config {

    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder.defaultSystem("You are a friendly chat bot that answers questions in the voice of a {voice}")
                .build();
    }

}
@RestController
class AIController {
	private final ChatClient chatClient;

	AIController(ChatClient chatClient) {
		this.chatClient = chatClient;
	}

	@GetMapping("/ai")
	Map<String, String> completion(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message, String voice) {
		return Map.of("completion",
				this.chatClient.prompt()
						.system(sp -> sp.param("voice", voice))
						.user(message)
						.call()
						.content());
	}

}

When calling the application endpoint via httpie, the result is:

http localhost:8080/ai voice=='Robert DeNiro'
{
    "completion": "You talkin' to me? Okay, here's a joke for ya: Why couldn't the bicycle stand up by itself? Because it was two tired! Classic, right?"
}

Other defaults

At the ChatClient.Builder level, you can specify the default prompt configuration.

  • defaultOptions(ChatOptions chatOptions): Pass in either portable options defined in the ChatOptions class or model-specific options such as those in OpenAiChatOptions. For more information on model-specific ChatOptions implementations, refer to the JavaDocs.

  • defaultFunction(String name, String description, java.util.function.Function<I, O> function): The name is used to refer to the function in user text. The description explains the function’s purpose and helps the AI model choose the correct function for an accurate response. The function argument is a Java function instance that the model will execute when necessary.

  • defaultFunctions(String… functionNames): The bean names of java.util.Function beans defined in the application context.

  • defaultUser(String text), defaultUser(Resource text), defaultUser(Consumer<UserSpec> userSpecConsumer): These methods let you define the user text. The Consumer<UserSpec> allows you to use a lambda to specify the user text and any default parameters.

  • defaultAdvisors(Advisor… advisor): Advisors allow modification of the data used to create the Prompt. The QuestionAnswerAdvisor implementation enables the pattern of Retrieval Augmented Generation by appending the prompt with context information related to the user text.

  • defaultAdvisors(Consumer<AdvisorSpec> advisorSpecConsumer): This method allows you to define a Consumer to configure multiple advisors using the AdvisorSpec. Advisors can modify the data used to create the final Prompt. The Consumer<AdvisorSpec> lets you specify a lambda to add advisors, such as QuestionAnswerAdvisor, which supports Retrieval Augmented Generation by appending the prompt with relevant context information based on the user text.

You can override these defaults at runtime using the corresponding methods without the default prefix.

  • options(ChatOptions chatOptions)

  • function(String name, String description, java.util.function.Function<I, O> function)

  • functions(String… functionNames)

  • user(String text), user(Resource text), user(Consumer<UserSpec> userSpecConsumer)

  • advisors(Advisor… advisor)

  • advisors(Consumer<AdvisorSpec> advisorSpecConsumer)
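
Putting these together, here is a minimal sketch of configuring defaults at build time and overriding one of them for a single request. The chatModel variable, model name, and temperature values are illustrative, and OpenAiChatOptions is used only because it already appears in earlier examples:

ChatClient chatClient = ChatClient.builder(chatModel)
    .defaultSystem("You are a helpful assistant.")
    .defaultOptions(OpenAiChatOptions.builder().model("gpt-4").temperature(0.7).build())
    .build();

// The per-request options() call overrides the defaults for this call only
String answer = chatClient.prompt()
    .user("Summarize the plot of Hamlet in one sentence.")
    .options(OpenAiChatOptions.builder().temperature(0.2).build())
    .call()
    .content();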

Advisors

The Advisors API provides a flexible and powerful way to intercept, modify, and enhance AI-driven interactions in your Spring applications.

A common pattern when calling an AI model with user text is to append or augment the prompt with contextual data.

This contextual data can be of different types. Common types include:

  • Your own data: This is data the AI model hasn’t been trained on. Even if the model has seen similar data, the appended contextual data takes precedence in generating the response.

  • Conversational history: The chat model’s API is stateless. If you tell the AI model your name, it won’t remember it in subsequent interactions. Conversational history must be sent with each request to ensure previous interactions are considered when generating a response.

Advisor Configuration in ChatClient

The ChatClient fluent API provides an AdvisorSpec interface for configuring advisors. This interface offers methods to add parameters, set multiple parameters at once, and add one or more advisors to the chain.

interface AdvisorSpec {
    AdvisorSpec param(String k, Object v);
    AdvisorSpec params(Map<String, Object> p);
    AdvisorSpec advisors(Advisor... advisors);
    AdvisorSpec advisors(List<Advisor> advisors);
}

The order in which advisors are added to the chain is crucial, as it determines the sequence of their execution. Each advisor modifies the prompt or the context in some way, and the changes made by one advisor are passed on to the next in the chain.

ChatClient.builder(chatModel)
    .build()
    .prompt()
    .advisors(
        MessageChatMemoryAdvisor.builder(chatMemory).build(),
        QuestionAnswerAdvisor.builder(vectorStore).build()
    )
    .user(userText)
    .call()
    .content();

In this configuration, the MessageChatMemoryAdvisor will be executed first, adding the conversation history to the prompt. Then, the QuestionAnswerAdvisor will perform its search based on the user’s question and the added conversation history, potentially providing more relevant results.

Retrieval Augmented Generation

Refer to the Retrieval Augmented Generation guide.

Logging

The SimpleLoggerAdvisor is an advisor that logs the request and response data of the ChatClient. This can be useful for debugging and monitoring your AI interactions.

Spring AI supports observability for LLM and vector store interactions. Refer to the Observability guide for more information.

To enable logging, add the SimpleLoggerAdvisor to the advisor chain when creating your ChatClient. It’s recommended to add it toward the end of the chain:

ChatResponse response = ChatClient.create(chatModel).prompt()
        .advisors(new SimpleLoggerAdvisor())
        .user("Tell me a joke?")
        .call()
        .chatResponse();

To see the logs, set the logging level for the advisor package to DEBUG:

logging.level.org.springframework.ai.chat.client.advisor=DEBUG

Add this to your application.properties or application.yaml file.

You can customize what data from AdvisedRequest and ChatResponse is logged by using the following constructor:

SimpleLoggerAdvisor(
    Function<AdvisedRequest, String> requestToString,
    Function<ChatResponse, String> responseToString
)

Example usage:

SimpleLoggerAdvisor customLogger = new SimpleLoggerAdvisor(
    request -> "Custom request: " + request.userText(),
    response -> "Custom response: " + response.getResult()
);

This allows you to tailor the logged information to your specific needs.

Be cautious about logging sensitive information in production environments.

Chat Memory

The interface ChatMemory represents a storage for chat conversation memory. It provides methods to add messages to a conversation, retrieve messages from a conversation, and clear the conversation history.

There is currently one built-in implementation: MessageWindowChatMemory.

MessageWindowChatMemory is a chat memory implementation that maintains a window of messages up to a specified maximum size (default: 20 messages). When the number of messages exceeds this limit, older messages are evicted, but system messages are preserved. If a new system message is added, all previous system messages are removed from memory. This ensures that the most recent context is always available for the conversation while keeping memory usage bounded.

The MessageWindowChatMemory is backed by the ChatMemoryRepository abstraction which provides storage implementations for the chat conversation memory. There are several implementations available, including the InMemoryChatMemoryRepository, JdbcChatMemoryRepository, CassandraChatMemoryRepository and Neo4jChatMemoryRepository.
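
A minimal sketch of wiring memory into a ChatClient, assuming the MessageWindowChatMemory builder and InMemoryChatMemoryRepository named above together with the MessageChatMemoryAdvisor shown earlier in this document (the window size is illustrative):

ChatMemory chatMemory = MessageWindowChatMemory.builder()
    .chatMemoryRepository(new InMemoryChatMemoryRepository())
    .maxMessages(20) // window size; older, non-system messages are evicted beyond this
    .build();

ChatClient chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
    .build();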

For more details and usage examples, see the Chat Memory documentation.

Implementation Notes

The combined use of imperative and reactive programming models in ChatClient is a unique aspect of the API. Often an application will be either reactive or imperative, but not both.

  • When customizing the HTTP client interactions of a Model implementation, both the RestClient and the WebClient must be configured.

Due to a bug in Spring Boot 3.4, the "spring.http.client.factory=jdk" property must be set. Otherwise, it’s set to "reactor" by default, which breaks certain AI workflows like the ImageModel.

  • Streaming is only supported via the Reactive stack. Imperative applications must include the Reactive stack for this reason (e.g. spring-boot-starter-webflux).

  • Non-streaming is only supported via the Servlet stack. Reactive applications must include the Servlet stack for this reason (e.g. spring-boot-starter-web) and expect some calls to be blocking.

  • Tool calling is imperative, leading to blocking workflows. This also results in partial/interrupted Micrometer observations (e.g. the ChatClient spans and the tool calling spans are not connected, with the first one remaining incomplete for that reason).

  • The built-in advisors perform blocking operations for standard calls, and non-blocking operations for streaming calls. The Reactor Scheduler used for the advisor streaming calls can be configured via the Builder on each Advisor class.