Advisors API

The Spring AI Advisors API provides a flexible and powerful way to intercept, modify, and enhance AI-driven interactions in your Spring applications. By leveraging the Advisors API, developers can create more sophisticated, reusable, and maintainable AI components.

The key benefits include encapsulating recurring Generative AI patterns, transforming data sent to and from Large Language Models (LLMs), and providing portability across various models and use cases.

You can configure existing advisors using the ChatClient API as shown in the following example:

var chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(
        MessageChatMemoryAdvisor.builder(chatMemory).build(), // chat-memory advisor
        QuestionAnswerAdvisor.builder(vectorStore).build() // RAG advisor
    )
    .build();

var conversationId = "678";

String response = chatClient.prompt()
    // Set advisor parameters at runtime
    .advisors(advisor -> advisor.param(ChatMemory.CONVERSATION_ID, conversationId))
    .user(userText)
    .call()
    .content();

It is recommended to register the advisors at build time using the builder’s defaultAdvisors() method.

Advisors also participate in the Observability stack, so you can view metrics and traces related to their execution.

Core Components

The API consists of CallAroundAdvisor and CallAroundAdvisorChain for non-streaming scenarios, and StreamAroundAdvisor and StreamAroundAdvisorChain for streaming scenarios. It also includes AdvisedRequest, which represents the unsealed Prompt request, and AdvisedResponse, which represents the Chat Completion response. Both hold an advise-context to share state across the advisor chain.

[Figure: Advisors API classes]

The aroundCall() and aroundStream() methods are the key advisor methods. They typically examine the unsealed Prompt data, customize and augment it, invoke the next entity in the advisor chain (via nextAroundCall() or nextAroundStream()), optionally block the request, examine the chat completion response, and throw exceptions to indicate processing errors.

In addition, the getOrder() method determines the advisor’s position in the chain, while getName() provides a unique advisor name.

The Advisor Chain, created by the Spring AI framework, allows sequential invocation of multiple advisors ordered by their getOrder() values. The lower values are executed first. The last advisor, added automatically, sends the request to the LLM.

The following flow diagram illustrates the interaction between the advisor chain and the Chat Model:

[Figure: Advisors flow]

  1. The Spring AI framework creates an AdvisedRequest from the user’s Prompt along with an empty AdvisorContext object.

  2. Each advisor in the chain processes the request, potentially modifying it. Alternatively, it can choose to block the request by not making the call to invoke the next entity. In the latter case, the advisor is responsible for filling out the response.

  3. The final advisor, provided by the framework, sends the request to the Chat Model.

  4. The Chat Model’s response is then passed back through the advisor chain and converted into an AdvisedResponse. The latter includes the shared AdvisorContext instance.

  5. Each advisor can process or modify the response.

  6. The final AdvisedResponse is returned to the client by extracting the ChatCompletion.
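
The blocking behaviour in this flow can be sketched without any Spring AI types. In the hypothetical example below, BlockingAdvisorSketch, its aroundCall method, and the "forbidden" rule are assumptions made purely for illustration. The request and response are plain strings, and a Function stands in for the rest of the chain.

```java
import java.util.function.Function;

// Illustrative stand-in only; this is NOT the Spring AI CallAroundAdvisor interface.
class BlockingAdvisorSketch {

    // nextInChain models "the next entity in the chain": request in, response out.
    // Hypothetical rule: requests containing "forbidden" never reach the model.
    static String aroundCall(String request, Function<String, String> nextInChain) {
        if (request.contains("forbidden")) {
            // Blocking: the next entity is not invoked, so this advisor
            // is responsible for supplying the response itself.
            return "Request blocked by advisor";
        }
        // Otherwise the advisor may modify the request before passing it on...
        String response = nextInChain.apply(request.trim());
        // ...and may inspect or modify the response on the way back.
        return response;
    }

    public static void main(String[] args) {
        Function<String, String> model = req -> "model answer to: " + req;
        System.out.println(aroundCall("  hello  ", model));
        System.out.println(aroundCall("forbidden topic", model));
    }
}
```

In a real advisor the same decision would be made inside aroundCall() or aroundStream(), and the blocking branch would have to build an AdvisedResponse rather than a string.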

Advisor Order

The execution order of advisors in the chain is determined by the getOrder() method. Key points to understand:

  • Advisors with lower order values are executed first.

  • The advisor chain operates as a stack:

    • The first advisor in the chain is the first to process the request.

    • It is also the last to process the response.

  • To control execution order:

    • Set the order close to Ordered.HIGHEST_PRECEDENCE to ensure an advisor is executed first in the chain (first for request processing, last for response processing).

    • Set the order close to Ordered.LOWEST_PRECEDENCE to ensure an advisor is executed last in the chain (last for request processing, first for response processing).

  • Higher values are interpreted as lower priority.

  • If multiple advisors have the same order value, their execution order is not guaranteed.

The seeming contradiction between order and execution sequence is due to the stack-like nature of the advisor chain:

  • An advisor with the highest precedence (lowest order value) is added to the top of the stack.

  • It will be the first to process the request as the stack unwinds.

  • It will be the last to process the response as the stack rewinds.
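
The stack-like behaviour described above can be illustrated with a small, self-contained sketch. It deliberately avoids the Spring AI types: SketchAdvisor, AdvisorOrderSketch, and the trace strings are hypothetical names introduced only for this illustration, and the recursive invoke method stands in for the call to the next entity in the chain.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified stand-in for an advisor; NOT the Spring AI Advisor interface.
interface SketchAdvisor {
    String name();
    int order();
}

class AdvisorOrderSketch {

    // Returns the sequence in which advisors see the request and the response.
    static List<String> run(List<SketchAdvisor> advisors) {
        List<SketchAdvisor> sorted = new ArrayList<>(advisors);
        // Lower order value = higher precedence = earlier on the request path.
        sorted.sort(Comparator.comparingInt(SketchAdvisor::order));
        List<String> trace = new ArrayList<>();
        invoke(sorted, 0, trace);
        return trace;
    }

    private static void invoke(List<SketchAdvisor> sorted, int index, List<String> trace) {
        if (index == sorted.size()) {
            trace.add("model"); // end of the chain: the model call
            return;
        }
        SketchAdvisor current = sorted.get(index);
        trace.add("request:" + current.name());  // work done before calling the next entity
        invoke(sorted, index + 1, trace);         // models calling the next entity in the chain
        trace.add("response:" + current.name()); // work done after the next entity returns
    }

    static SketchAdvisor advisor(String name, int order) {
        return new SketchAdvisor() {
            public String name() { return name; }
            public int order() { return order; }
        };
    }

    public static void main(String[] args) {
        // prints [request:A, request:B, model, response:B, response:A]
        System.out.println(run(List.of(advisor("B", 10), advisor("A", -10))));
    }
}
```

Advisor A has the lower order value, so it sees the request first and the response last, which is the apparent contradiction the bullets above explain.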

As a reminder, here are the semantics of the Spring Ordered interface:

public interface Ordered {

    /**
     * Constant for the highest precedence value.
     * @see java.lang.Integer#MIN_VALUE
     */
    int HIGHEST_PRECEDENCE = Integer.MIN_VALUE;

    /**
     * Constant for the lowest precedence value.
     * @see java.lang.Integer#MAX_VALUE
     */
    int LOWEST_PRECEDENCE = Integer.MAX_VALUE;

    /**
     * Get the order value of this object.
     * <p>Higher values are interpreted as lower priority. As a consequence,
     * the object with the lowest value has the highest priority (somewhat
     * analogous to Servlet {@code load-on-startup} values).
     * <p>Same order values will result in arbitrary sort positions for the
     * affected objects.
     * @return the order value
     * @see #HIGHEST_PRECEDENCE
     * @see #LOWEST_PRECEDENCE
     */
    int getOrder();
}

For use cases that need to be first in the chain on both the input and output sides:

  1. Use separate advisors for each side.

  2. Configure them with different order values.

  3. Use the advisor context to share state between them.

API Overview

The main Advisor interfaces are located in the package org.springframework.ai.chat.client.advisor.api. Here are the key interfaces you’ll encounter when creating your own advisor:

public interface Advisor extends Ordered {

	String getName();

}

The two sub-interfaces for synchronous and reactive advisors are:

public interface CallAroundAdvisor extends Advisor {

	/**
	 * Around advice that wraps the ChatModel#call(Prompt) method.
	 * @param advisedRequest the advised request
	 * @param chain the advisor chain
	 * @return the response
	 */
	AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain);

}

and

public interface StreamAroundAdvisor extends Advisor {

	/**
	 * Around advice that wraps the invocation of the advised request.
	 * @param advisedRequest the advised request
	 * @param chain the chain of advisors to execute
	 * @return the result of the advised request
	 */
	Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain);

}

To continue the chain from within your advisor implementation, use CallAroundAdvisorChain and StreamAroundAdvisorChain. The interfaces are:

public interface CallAroundAdvisorChain {

	AdvisedResponse nextAroundCall(AdvisedRequest advisedRequest);

}

and

public interface StreamAroundAdvisorChain {

	Flux<AdvisedResponse> nextAroundStream(AdvisedRequest advisedRequest);

}

Implementing an Advisor

To create an advisor, implement either CallAroundAdvisor or StreamAroundAdvisor (or both). The key method to implement is aroundCall() for non-streaming advisors or aroundStream() for streaming advisors.

Examples

The following hands-on examples illustrate how to implement advisors for observing and augmenting use cases.

Logging Advisor

We can implement a simple logging advisor that logs the AdvisedRequest before and the AdvisedResponse after the call to the next advisor in the chain. Note that the advisor only observes the request and response and does not modify them. This implementation supports both non-streaming and streaming scenarios.

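import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import reactor.core.publisher.Flux;

import org.springframework.ai.chat.client.advisor.api.AdvisedRequest;
import org.springframework.ai.chat.client.advisor.api.AdvisedResponse;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisorChain;
import org.springframework.ai.chat.client.advisor.api.StreamAroundAdvisor;
import org.springframework.ai.chat.client.advisor.api.StreamAroundAdvisorChain;
import org.springframework.ai.chat.model.MessageAggregator;

public class SimpleLoggerAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {

	private static final Logger logger = LoggerFactory.getLogger(SimpleLoggerAdvisor.class);

	@Override
	public String getName() { // (1)
		return this.getClass().getSimpleName();
	}

	@Override
	public int getOrder() { // (2)
		return 0;
	}

	@Override
	public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {

		logger.debug("BEFORE: {}", advisedRequest);

		AdvisedResponse advisedResponse = chain.nextAroundCall(advisedRequest);

		logger.debug("AFTER: {}", advisedResponse);

		return advisedResponse;
	}

	@Override
	public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {

		logger.debug("BEFORE: {}", advisedRequest);

		Flux<AdvisedResponse> advisedResponses = chain.nextAroundStream(advisedRequest);

		// Log once, after the streamed chunks have been aggregated. (3)
		return new MessageAggregator().aggregateAdvisedResponse(advisedResponses,
				advisedResponse -> logger.debug("AFTER: {}", advisedResponse));
	}
}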
1 Provides a unique name for the advisor.
2 You can control the order of execution by setting the order value. Lower values execute first.
3 The MessageAggregator is a utility class that aggregates the Flux responses into a single AdvisedResponse. This can be useful for logging or other processing that observes the entire response rather than individual items in the stream. Note that you cannot alter the response in the MessageAggregator, as it is a read-only operation.

Re-Reading (Re2) Advisor

The "Re-Reading Improves Reasoning in Large Language Models" article introduces a technique called Re-Reading (Re2) that improves the reasoning capabilities of Large Language Models. The Re2 technique requires augmenting the input prompt like this:

{Input_Query}
Read the question again: {Input_Query}
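
As a concrete illustration of the template, the following minimal sketch builds the augmented prompt for a sample query using plain string concatenation. Re2PromptSketch and augment are hypothetical names used only here, and the sample question is made up.

```java
// Illustrative only: plain string handling, no Spring AI prompt templating.
class Re2PromptSketch {

    // Builds the Re2 prompt: the query, followed by the query repeated
    // after the "Read the question again:" marker.
    static String augment(String inputQuery) {
        return inputQuery + "\nRead the question again: " + inputQuery;
    }

    public static void main(String[] args) {
        System.out.println(augment("What is 2 + 2?"));
    }
}
```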

Implementing an advisor that applies the Re2 technique to the user’s input query can be done like this:

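import java.util.HashMap;
import java.util.Map;

import reactor.core.publisher.Flux;

import org.springframework.ai.chat.client.advisor.api.AdvisedRequest;
import org.springframework.ai.chat.client.advisor.api.AdvisedResponse;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisor;
import org.springframework.ai.chat.client.advisor.api.CallAroundAdvisorChain;
import org.springframework.ai.chat.client.advisor.api.StreamAroundAdvisor;
import org.springframework.ai.chat.client.advisor.api.StreamAroundAdvisorChain;

public class ReReadingAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {

	private AdvisedRequest before(AdvisedRequest advisedRequest) { // (1)

		// Copy the existing user parameters and add the original query as a template variable.
		Map<String, Object> advisedUserParams = new HashMap<>(advisedRequest.userParams());
		advisedUserParams.put("re2_input_query", advisedRequest.userText());

		return AdvisedRequest.from(advisedRequest)
			.userText("""
			    {re2_input_query}
			    Read the question again: {re2_input_query}
			    """)
			.userParams(advisedUserParams)
			.build();
	}

	@Override
	public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) { // (2)
		return chain.nextAroundCall(this.before(advisedRequest));
	}

	@Override
	public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) { // (3)
		return chain.nextAroundStream(this.before(advisedRequest));
	}

	@Override
	public int getOrder() { // (4)
		return 0;
	}

	@Override
	public String getName() { // (5)
		return this.getClass().getSimpleName();
	}
}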
1 The before method augments the user’s input query by applying the Re-Reading technique.
2 The aroundCall method intercepts the non-streaming request and applies the Re-Reading technique.
3 The aroundStream method intercepts the streaming request and applies the Re-Reading technique.
4 You can control the order of execution by setting the order value. Lower values execute first.
5 Provides a unique name for the advisor.

Spring AI Built-in Advisors

The Spring AI framework provides several built-in advisors to enhance your AI interactions. Here’s an overview of the available advisors:

Chat Memory Advisors

These advisors manage conversation history in a chat memory store:

  • MessageChatMemoryAdvisor: Retrieves memory and adds it as a collection of messages to the prompt. This approach maintains the structure of the conversation history. Note that not all AI models support this approach.

  • PromptChatMemoryAdvisor: Retrieves memory and incorporates it into the prompt’s system text.

  • VectorStoreChatMemoryAdvisor: Retrieves memory from a VectorStore and adds it to the prompt’s system text. This advisor is useful for efficiently searching and retrieving relevant information from large datasets.

Question Answering Advisor
  • QuestionAnswerAdvisor: Uses a vector store to provide question-answering capabilities, implementing the RAG (Retrieval-Augmented Generation) pattern.

Content Safety Advisor
  • SafeGuardAdvisor: A simple advisor designed to prevent the model from generating harmful or inappropriate content.

Streaming vs Non-Streaming

[Figure: Advisors non-streaming vs. streaming]

  • Non-streaming advisors work with complete requests and responses.

  • Streaming advisors handle requests and responses as continuous streams, using reactive programming concepts (e.g., Flux for responses).

@Override
public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {

    return Mono.just(advisedRequest)
            .publishOn(Schedulers.boundedElastic())
            .map(request -> {
                // This can be executed by blocking and non-blocking threads.
                // Advisor logic that runs before the next entity in the chain goes here.
                return request;
            })
            .flatMapMany(request -> chain.nextAroundStream(request))
            .map(response -> {
                // Advisor logic that runs after the next entity in the chain goes here.
                return response;
            });
}

Best Practices

  1. Keep advisors focused on specific tasks for better modularity.

  2. Use the adviseContext to share state between advisors when necessary.

  3. Implement both streaming and non-streaming versions of your advisor for maximum flexibility.

  4. Carefully consider the order of advisors in your chain to ensure proper data flow.

Backward Compatibility

The AdvisedRequest class has been moved to a new package.

Breaking API Changes

The Spring AI Advisor Chain underwent significant changes from version 1.0 M2 to 1.0 M3. Here are the key modifications:

Advisor Interfaces

  • In 1.0 M2, there were separate RequestAdvisor and ResponseAdvisor interfaces.

    • RequestAdvisor was invoked before the ChatModel.call and ChatModel.stream methods.

    • ResponseAdvisor was called after these methods.

  • In 1.0 M3, these interfaces have been replaced with:

    • CallAroundAdvisor

    • StreamAroundAdvisor

  • The StreamResponseMode, previously part of ResponseAdvisor, has been removed.

Context Map Handling

  • In 1.0 M2:

    • The context map was a separate method argument.

    • The map was mutable and passed along the chain.

  • In 1.0 M3:

    • The context map is now part of the AdvisedRequest and AdvisedResponse records.

    • The map is immutable.

    • To update the context, use the updateContext method, which creates a new unmodifiable map with the updated contents.

Example of updating the context in 1.0 M3:

@Override
public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {

    // updateContext returns a new request; keep the result in a local variable.
    AdvisedRequest updatedRequest = advisedRequest.updateContext(context -> {
        context.put("aroundCallBefore" + getName(), "AROUND_CALL_BEFORE " + getName());  // Add a key-value pair
        context.put("lastBefore", getName());  // Add another key-value pair
        return context;
    });

    // Method implementation continues...
}
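
The copy-on-update semantics of the 1.0 M3 context map can be modelled in a few lines of plain Java. The sketch below is not the Spring AI implementation: ContextHolderSketch is a hypothetical class that only mimics the described behaviour, in which the update function receives a mutable copy and the result is stored as a new unmodifiable map while the original holder stays unchanged.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a request that carries an immutable context map.
final class ContextHolderSketch {

    private final Map<String, Object> context;

    ContextHolderSketch(Map<String, Object> context) {
        // Defensive copy wrapped as unmodifiable: callers cannot mutate it later.
        this.context = Collections.unmodifiableMap(new HashMap<>(context));
    }

    Map<String, Object> context() {
        return context;
    }

    // Hands a mutable copy to the caller and returns a NEW holder; this one is unchanged.
    ContextHolderSketch updateContext(Function<Map<String, Object>, Map<String, Object>> transform) {
        return new ContextHolderSketch(transform.apply(new HashMap<>(this.context)));
    }

    public static void main(String[] args) {
        ContextHolderSketch original = new ContextHolderSketch(Map.<String, Object>of("a", 1));
        ContextHolderSketch updated = original.updateContext(ctx -> {
            ctx.put("lastBefore", "MyAdvisor");
            return ctx;
        });
        System.out.println(original.context().containsKey("lastBefore")); // false
        System.out.println(updated.context().get("lastBefore"));          // MyAdvisor
    }
}
```

Because every update yields a new object, an advisor has to pass the returned instance on to the next entity in the chain; changes made to a discarded copy are not visible downstream.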