Prompts
Prompts are the inputs that guide an AI model to generate specific outputs. The design and phrasing of these prompts significantly influence the model’s responses.
At the lowest level of interaction with AI models in Spring AI, handling prompts is somewhat similar to managing the "View" in Spring MVC. This involves creating extensive text with placeholders for dynamic content. These placeholders are then replaced based on user requests or other code in the application. Another analogy is a SQL statement that contains placeholders for certain expressions.
As Spring AI evolves, it will introduce higher levels of abstraction for interacting with AI models. The foundational classes described in this section can be likened to JDBC in terms of their role and functionality. The ChatClient class, for instance, is analogous to the core JDBC library in the JDK. Building upon this, Spring AI can provide helper classes similar to JdbcTemplate and Spring Data Repositories, and eventually more advanced constructs like ChatEngines and Agents that consider past interactions with the model.
The structure of prompts has evolved over time within the AI field. Initially, prompts were simple strings. Over time, they grew to include placeholders for specific inputs, like "USER:", which the AI model recognizes. OpenAI has introduced even more structure to prompts by categorizing multiple message strings into distinct roles before they are processed by the AI model.
API Overview
Prompt
It is common to use the call method of ChatClient, which takes a Prompt instance and returns a ChatResponse.
The Prompt class functions as a container for an organized series of Message objects, with each one forming a segment of the overall prompt. Every Message embodies a unique role within the prompt, differing in its content and intent. These roles can encompass a variety of elements, from user inquiries to AI-generated responses or relevant background information. This arrangement enables intricate and detailed interactions with AI models, as the prompt is constructed from multiple messages, each assigned a specific role to play in the dialogue.
Below is a truncated version of the Prompt class, with constructors and utility methods omitted for brevity:
public class Prompt {

    private final List<Message> messages;

    // constructors and utility methods omitted
}
Message
The Message interface encapsulates a textual message, a collection of attributes as a Map, a categorization known as MessageType, and a list of media objects for models that are multimodal. The interface is defined as follows:
public interface Message {

    String getContent();

    List<Media> getMedia();

    Map<String, Object> getProperties();

    MessageType getMessageType();
}
Various implementations of the Message interface correspond to different categories of messages that an AI model can process. Some models, like those from OpenAI, distinguish between message categories based on conversational roles. These roles are effectively mapped by the MessageType, as discussed below.
Roles
Prompts in AI have evolved from basic, straightforward text to more organized and complex formats with specific roles and structures.
Initially, prompts were simple strings – just lines of text. Over time, this evolved to include specific placeholders within these strings, like “USER:”, which the AI model could recognize and respond to accordingly. This was a step towards more structured prompts.
OpenAI then introduced an even more organized approach. In their model, prompts are not merely single strings but a series of messages. Each message, while still in text form, is assigned a specific role. These roles categorize the messages, clarifying the context and purpose of each segment of the prompt for the AI model. This structured approach enhances the nuance and effectiveness of communication with the AI, as each part of the prompt plays a distinct and defined role in the interaction.
The primary roles are:
- System Role: Guides the AI’s behavior and response style, setting parameters or rules for how the AI interprets and replies to the input. It’s akin to providing instructions to the AI before initiating a conversation.
- User Role: Represents the user’s input – their questions, commands, or statements to the AI. This role is fundamental as it forms the basis of the AI’s response.
- Assistant Role: The AI’s response to the user’s input. More than just an answer or reaction, it’s crucial for maintaining the flow of the conversation. By tracking the AI’s previous responses (its 'Assistant Role' messages), the system ensures coherent and contextually relevant interactions.
- Function Role: This role deals with specific tasks or operations during the conversation. While the System Role sets the AI’s overall behavior, the Function Role focuses on carrying out certain actions or commands the user asks for. It’s like a special feature in the AI, used when needed to perform specific functions such as calculations, fetching data, or other tasks beyond just talking. This role allows the AI to offer practical help in addition to conversational responses.
Roles are represented as an enumeration in Spring AI, as shown below:
public enum MessageType {

    USER("user"),

    ASSISTANT("assistant"),

    SYSTEM("system"),

    FUNCTION("function");

    private final String value;

    MessageType(String value) {
        this.value = value;
    }

    public String getValue() {
        return value;
    }

    public static MessageType fromValue(String value) {
        for (MessageType messageType : MessageType.values()) {
            if (messageType.getValue().equals(value)) {
                return messageType;
            }
        }
        throw new IllegalArgumentException("Invalid MessageType value: " + value);
    }
}
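To make the relationship between role-tagged messages and the older "USER:"-style strings concrete, here is a minimal, self-contained sketch. The Role enum, the ChatMessage record, and the flatten helper are hypothetical stand-ins invented for illustration; they are not the Spring AI types described above.

```java
import java.util.List;
import java.util.stream.Collectors;

public class RoleSketch {

    // Hypothetical stand-ins for illustration only; not the Spring AI MessageType or Message.
    public enum Role { SYSTEM, USER, ASSISTANT }

    public record ChatMessage(Role role, String content) { }

    // Flattens role-tagged messages into the older single-string style with "ROLE:" markers.
    public static String flatten(List<ChatMessage> messages) {
        return messages.stream()
                .map(m -> m.role().name() + ": " + m.content())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Earlier assistant replies stay in the list so later turns keep their context.
        List<ChatMessage> conversation = List.of(
                new ChatMessage(Role.SYSTEM, "You are a concise assistant."),
                new ChatMessage(Role.USER, "What is a token?"),
                new ChatMessage(Role.ASSISTANT, "A token is a chunk of text the model processes."),
                new ChatMessage(Role.USER, "How many tokens are in a word?"));
        System.out.println(flatten(conversation));
    }
}
```

Keeping the role as a separate field, rather than as a prefix inside the text, lets a client map each message to whatever wire format a given model expects.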
PromptTemplate
A key component for prompt templating in Spring AI is the PromptTemplate class. This class uses the StringTemplate engine, developed by Terence Parr, for constructing and managing prompts. The PromptTemplate class is designed to facilitate the creation of structured prompts that are then sent to the AI model for processing.
public class PromptTemplate implements PromptTemplateActions, PromptTemplateMessageActions {

    // Other methods to be discussed later
}
The interfaces implemented by this class support different aspects of prompt creation:
PromptTemplateStringActions focuses on creating and rendering prompt strings, representing the most basic form of prompt generation.
PromptTemplateMessageActions is tailored for prompt creation through the generation and manipulation of Message objects.
PromptTemplateActions is designed to return the Prompt object, which can be passed to ChatClient for generating a response.
While these interfaces might not be used extensively in many projects, they show the different approaches to prompt creation.
The implemented interfaces are:
public interface PromptTemplateStringActions {

    String render();

    String render(Map<String, Object> model);
}
The method String render(): Renders a prompt template into a final string format without external input, suitable for templates without placeholders or dynamic content.
The method String render(Map<String, Object> model): Enhances rendering functionality to include dynamic content. It uses a Map<String, Object> where map keys are placeholder names in the prompt template, and values are the dynamic content to be inserted.
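The idea behind rendering with a model map can be sketched in plain Java. This is only an illustration of placeholder substitution under simplifying assumptions: it uses a regular expression rather than the StringTemplate engine, and the class name, the render helper, and the choice to throw on a missing key are all hypothetical, not the behavior of PromptTemplate.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {

    // Matches {name}-style placeholders, the same shape used in the templates in this section.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Hypothetical helper: replaces each {key} with the string form of model.get(key).
    public static String render(String template, Map<String, Object> model) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            if (!model.containsKey(key)) {
                // Assumption of this sketch: a missing value is treated as an error.
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            String value = String.valueOf(model.get(key));
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Tell me a {adjective} joke about {topic}";
        // prints: Tell me a funny joke about cows
        System.out.println(render(template, Map.of("adjective", "funny", "topic", "cows")));
    }
}
```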
public interface PromptTemplateMessageActions {

    Message createMessage();

    Message createMessage(Map<String, Object> model);
}
The method Message createMessage(): Creates a Message object without additional data, used for static or predefined message content.
The method Message createMessage(Map<String, Object> model): Extends message creation to integrate dynamic content, accepting a Map<String, Object> where each entry represents a placeholder in the message template and its corresponding dynamic value.
public interface PromptTemplateActions extends PromptTemplateStringActions {

    Prompt create();

    Prompt create(Map<String, Object> model);
}
The method Prompt create(): Generates a Prompt object without external data inputs, ideal for static or predefined prompts.
The method Prompt create(Map<String, Object> model): Expands prompt creation capabilities to include dynamic content, taking a Map<String, Object> where each map entry is a placeholder in the prompt template and its associated dynamic value.
Example Usage
A simple example taken from the AI Workshop on PromptTemplates is shown below.
PromptTemplate promptTemplate = new PromptTemplate("Tell me a {adjective} joke about {topic}");
Prompt prompt = promptTemplate.create(Map.of("adjective", adjective, "topic", topic));
return chatClient.call(prompt).getResult();
Another example taken from the AI Workshop on Roles is shown below.
// The user message carries the request itself.
String userText = """
    Tell me about three famous pirates from the Golden Age of Piracy and what they did.
    Write at least a sentence for each pirate.
    """;

Message userMessage = new UserMessage(userText);

// The system message sets the assistant's name and voice through placeholders.
String systemText = """
    You are a helpful AI assistant that helps people find information.
    Your name is {name}
    You should reply to the user's request with your name and also in the style of a {voice}.
    """;

SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(systemText);
Message systemMessage = systemPromptTemplate.createMessage(Map.of("name", name, "voice", voice));

// Combine both messages into one prompt and send it to the model.
Prompt prompt = new Prompt(List.of(userMessage, systemMessage));

List<Generation> response = chatClient.call(prompt).getResults();
This shows how you can build up the Prompt instance by using the SystemPromptTemplate to create a Message with the system role, passing in placeholder values. The message with the role user is then combined with the message with the role system to form the prompt. The prompt is then passed to the ChatClient to get a generative response.
Using resources instead of raw Strings
Spring AI supports the org.springframework.core.io.Resource abstraction, so you can put prompt data in a file that can be used directly in PromptTemplates. For example, you can define a field in your Spring-managed component to retrieve the Resource.
@Value("classpath:/prompts/system-message.st")
private Resource systemResource;
Then pass that resource to the SystemPromptTemplate directly.
SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(systemResource);
Prompt Engineering
In generative AI, the creation of prompts is a crucial task for developers. The quality and structure of these prompts significantly influence the effectiveness of the AI’s output. Investing time and effort in designing thoughtful prompts can greatly improve the results from the AI.
Sharing and discussing prompts is a common practice in the AI community. This collaborative approach not only creates a shared learning environment but also leads to the identification and use of highly effective prompts.
Research in this area often involves analyzing and comparing different prompts to assess their effectiveness in various situations. For example, one notable study found that starting a prompt with "Take a deep breath and work on this problem step by step" markedly improved problem-solving efficiency. This highlights the impact that well-chosen language can have on generative AI systems' performance.
Working out the most effective use of prompts, particularly with the rapid advancement of AI technologies, is a continuous challenge. You should recognize the importance of prompt engineering and consider using insights from the community and research to improve prompt creation strategies.
Creating effective prompts
When developing prompts, it’s important to integrate several key components to ensure clarity and effectiveness:
- Instructions: Offer clear and direct instructions to the AI, similar to how you would communicate with a person. This clarity is essential for helping the AI 'understand' what is expected.
- External Context: Include relevant background information or specific guidance for the AI’s response when necessary. This 'external context' frames the prompt and aids the AI in grasping the overall scenario.
- User Input: This is the straightforward part - the user’s direct request or question forming the core of the prompt.
- Output Indicator: This aspect can be tricky. It involves specifying the desired format for the AI’s response, such as JSON. However, be aware that the AI might not always adhere strictly to this format. For instance, it might prepend a phrase like "here is your JSON" before the actual JSON data, or sometimes generate a JSON-like structure that is not accurate.
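Because a model may wrap the requested JSON in extra phrasing, a client often needs to isolate the JSON before parsing it. The following is a minimal sketch of one defensive approach, assuming the response contains a single top-level JSON object; the class and method names are hypothetical, and the result still has to be validated by a real JSON parser since the model may produce malformed JSON.

```java
public class JsonExtractSketch {

    // Hypothetical helper: returns the text from the first '{' to the last '}', or null if absent.
    public static String extractJsonObject(String response) {
        int start = response.indexOf('{');
        int end = response.lastIndexOf('}');
        if (start < 0 || end < start) {
            return null;
        }
        return response.substring(start, end + 1);
    }

    public static void main(String[] args) {
        String response = "Here is your JSON: {\"name\": \"Blackbeard\"} Let me know if you need more.";
        // prints: {"name": "Blackbeard"}
        System.out.println(extractJsonObject(response));
    }
}
```

This only strips surrounding chatter; it does not repair a JSON-like structure that is itself inaccurate.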
Providing the AI with examples of the anticipated question and answer format can be highly beneficial when crafting prompts. This practice helps the AI 'understand' the structure and intent of your query, leading to more precise and relevant responses. While this documentation does not delve deeply into these techniques, they provide a starting point for further exploration in AI prompt engineering.
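As a rough illustration of supplying example question-and-answer pairs, the sketch below assembles a prompt string from an instruction, a few worked examples, and the new input. The "Review:" and "Sentiment:" labels, the class name, and the buildPrompt helper are assumptions chosen for this example, not a format required by any model or by Spring AI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FewShotSketch {

    // Hypothetical helper: lists each example as an input/answer pair, then leaves the last answer open.
    public static String buildPrompt(String instruction, Map<String, String> examples, String input) {
        StringBuilder sb = new StringBuilder(instruction).append("\n\n");
        for (Map.Entry<String, String> example : examples.entrySet()) {
            sb.append("Review: ").append(example.getKey()).append('\n');
            sb.append("Sentiment: ").append(example.getValue()).append("\n\n");
        }
        // The trailing "Sentiment:" invites the model to continue in the same format.
        sb.append("Review: ").append(input).append('\n').append("Sentiment:");
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the examples in insertion order.
        Map<String, String> examples = new LinkedHashMap<>();
        examples.put("Great battery life.", "positive");
        examples.put("The screen cracked in a week.", "negative");
        System.out.println(buildPrompt(
                "Classify the sentiment of each review as positive or negative.",
                examples,
                "Slow and buggy."));
    }
}
```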
The following is a list of resources for further investigation.
Simple Techniques
- Text Summarization: Reduces extensive text into concise summaries, capturing key points and main ideas while omitting less critical details.
- Question Answering: Focuses on deriving specific answers from provided text, based on user-posed questions. It’s about pinpointing and extracting relevant information in response to queries.
- Text Classification: Systematically categorizes text into predefined categories or groups, analyzing the text and assigning it to the most fitting category based on its content.
- Conversation: Creates interactive dialogues where the AI can engage in back-and-forth communication with users, simulating a natural conversation flow.
- Code Generation: Generates functional code snippets based on specific user requirements or descriptions, translating natural language instructions into executable code.
Advanced Techniques
- Zero-shot, Few-shot Learning: Enables the model to make accurate predictions or responses with minimal to no prior examples of the specific problem type, understanding and acting on new tasks using learned generalizations.
- Chain-of-Thought: Prompts the model to write out intermediate reasoning steps before giving its final answer. Working through a problem step by step tends to improve results on multi-step tasks such as arithmetic and logic problems.
- ReAct (Reason + Act): In this method, the AI first analyzes (reasons about) the input, then determines the most appropriate course of action or response. It combines understanding with decision-making.
Microsoft Guidance
- Framework for Prompt Creation and Optimization: Microsoft offers a structured approach to developing and refining prompts. This framework guides users in creating effective prompts that elicit the desired responses from AI models, optimizing the interaction for clarity and efficiency.
Tokens
Tokens are essential in how AI models process text, acting as a bridge that converts words (as we understand them) into a format that AI models can process. This conversion occurs in two stages: words are transformed into tokens upon input, and these tokens are then converted back into words in the output.
Tokenization, the process of breaking down text into tokens, is fundamental to how AI models comprehend and process language. The AI model works with this tokenized format to understand and respond to prompts.
To better understand tokens, think of them as portions of words. Typically, a token represents about three-quarters of a word. For instance, the complete works of Shakespeare, totaling roughly 900,000 words, would translate to around 1.2 million tokens.
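The Shakespeare figure follows directly from the three-quarters rule of thumb: 900,000 words divided by 0.75 words per token gives 1,200,000 tokens. A small sketch of that arithmetic is below; the class and method names are hypothetical, and the ratio is only an approximation because real tokenizers are model-specific.

```java
public class TokenEstimateSketch {

    // Rule of thumb from the text: one token is about three-quarters of a word.
    static final double WORDS_PER_TOKEN = 0.75;

    // Hypothetical helper: estimates a token count from a word count.
    public static long estimateTokens(long wordCount) {
        return Math.round(wordCount / WORDS_PER_TOKEN);
    }

    public static void main(String[] args) {
        // prints: 1200000
        System.out.println(estimateTokens(900_000));
    }
}
```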
Experiment with the OpenAI Tokenizer UI to see how words are converted into tokens.
Tokens have practical implications beyond their technical role in AI processing, especially regarding billing and model capabilities:
- Billing: AI model services often bill based on token usage. Both the input (prompt) and the output (response) contribute to the total token count, making shorter prompts more cost-effective.
- Model Limits: Different AI models have varying token limits, defining their "context window" – the maximum amount of information they can process at a time. For example, GPT-3 and Meta Llama 2 have limits of about 4K tokens, Claude 2 has a limit of 100K tokens, and some research models can handle up to 1 million tokens.
- Context Window: A model’s token limit determines its context window. Inputs exceeding this limit are not processed by the model. It’s crucial to send only the minimal effective set of information for processing. For example, when inquiring about "Hamlet," there’s no need to include tokens from all of Shakespeare’s other works.
- Response Metadata: The metadata of a response from an AI model includes the number of tokens used, a vital piece of information for managing usage and costs.
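One way to stay within a context window is to estimate the token cost of each message and keep only the most recent ones that fit a budget. The sketch below illustrates that idea under loud assumptions: the token estimate uses the three-quarters-of-a-word heuristic instead of a real tokenizer, and the class, the helpers, and the keep-newest-first policy are hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ContextBudgetSketch {

    // Rough heuristic only: counts whitespace-separated words and divides by 0.75.
    public static int estimateTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int words = trimmed.split("\\s+").length;
        return (int) Math.ceil(words / 0.75);
    }

    // Walks from the newest message backwards, keeping messages until the budget would be exceeded.
    public static List<String> fitToBudget(List<String> messages, int tokenBudget) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (int i = messages.size() - 1; i >= 0; i--) {
            int cost = estimateTokens(messages.get(i));
            if (used + cost > tokenBudget) {
                break;
            }
            kept.add(0, messages.get(i)); // insert at the front to preserve the original order
            used += cost;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> history = List.of("one two three", "four five six", "seven eight nine");
        // Each message is estimated at 4 tokens, so a budget of 8 keeps the last two.
        System.out.println(fitToBudget(history, 8));
    }
}
```

For billing or hard limits, the token counts reported in the response metadata are the authoritative figures; an estimate like this is only useful for deciding what to send.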