Image Model API

The Spring Image Model API is designed to be a simple and portable interface for interacting with various AI models specialized in image generation, allowing developers to switch between different image-related models with minimal code changes. This design aligns with Spring’s philosophy of modularity and interchangeability, so developers can quickly adapt their applications to different image-related AI capabilities.

Additionally, with the support of companion classes such as ImagePrompt for input encapsulation and ImageResponse for output handling, the Image Model API unifies communication with AI models dedicated to image generation. It handles request preparation and response parsing, so calling code works with a direct, simplified API for image generation.

The Spring Image Model API is built on top of the Spring AI Generic Model API, providing image-specific abstractions and implementations.

API Overview

This section provides a guide to the Spring Image Model API interface and associated classes.

Image Model

Here is the ImageModel interface definition:

@FunctionalInterface
public interface ImageModel extends Model<ImagePrompt, ImageResponse> {

	ImageResponse call(ImagePrompt request);

}
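Because ImageModel declares a single abstract method, it can be implemented with a lambda, which is convenient for tests. The following is a minimal sketch rather than library code: the record types are simplified stand-ins for the real Spring AI classes, and `fakeModel` and the `example.invalid` URLs are hypothetical names introduced only for illustration.

```java
import java.util.List;

// Simplified stand-ins for illustration only; the real Spring AI types carry more state.
class ImageModelSketch {

	record ImageMessage(String text, Float weight) {}

	record ImagePrompt(List<ImageMessage> instructions) {}

	record ImageResponse(List<String> urls) {}

	// Same shape as the interface above: one abstract method, so a lambda can implement it.
	@FunctionalInterface
	interface ImageModel {
		ImageResponse call(ImagePrompt request);
	}

	// Hypothetical fake model: returns one placeholder URL per instruction, no network call.
	static ImageModel fakeModel() {
		return prompt -> new ImageResponse(
				prompt.instructions().stream()
						.map(m -> "https://example.invalid/" + m.text().replace(' ', '-') + ".png")
						.toList());
	}

	public static void main(String[] args) {
		ImageModel model = fakeModel();
		ImageResponse response = model.call(
				new ImagePrompt(List.of(new ImageMessage("a red bicycle", 1.0f))));
		System.out.println(response.urls());
	}
}
```

A stub like this lets application code that depends only on the ImageModel contract be exercised without contacting a provider.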

ImagePrompt

The ImagePrompt is a ModelRequest that encapsulates a list of ImageMessage objects and optional model request options. The following listing shows a truncated version of the ImagePrompt class, excluding constructors and other utility methods:

public class ImagePrompt implements ModelRequest<List<ImageMessage>> {

	private final List<ImageMessage> messages;

	private ImageOptions imageModelOptions;

	@Override
	public List<ImageMessage> getInstructions() {...}

	@Override
	public ImageOptions getOptions() {...}

	// constructors and utility methods omitted
}

ImageMessage

The ImageMessage class encapsulates the text to use and the weight that the text should have in influencing the generated image. For models that support weights, a weight can be positive or negative.

public class ImageMessage {

	private String text;

	private Float weight;

	public String getText() {...}

	public Float getWeight() {...}

	// constructors and utility methods omitted
}
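To make the role of the weight concrete, the sketch below labels each message by the sign of its weight. It is an illustration under assumptions: the record is a stand-in for the class above, `describe` is a hypothetical helper, and treating a null weight as "no weight given" is a convention chosen for this example, not documented library behavior.

```java
import java.util.List;

class WeightedPromptSketch {

	// Stand-in mirroring the two fields shown above; not the library class itself.
	record ImageMessage(String text, Float weight) {}

	// Assumed convention for this sketch: a null weight means "no weight given".
	static String describe(ImageMessage m) {
		if (m.weight() == null) {
			return m.text() + " (unweighted)";
		}
		String direction = m.weight() < 0 ? "steer away" : "steer toward";
		return m.text() + " (" + direction + ", " + m.weight() + ")";
	}

	public static void main(String[] args) {
		List<ImageMessage> messages = List.of(
				new ImageMessage("a lighthouse at dusk", 1.0f),
				new ImageMessage("blurry, low contrast", -0.5f),
				new ImageMessage("oil painting", null));
		messages.forEach(m -> System.out.println(describe(m)));
	}
}
```

How a given provider actually interprets positive and negative weights depends on the model, so check the provider's documentation before relying on a particular scale.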

ImageOptions

ImageOptions represents the options that can be passed to the image generation model. The interface extends ModelOptions and defines a few portable options that can be passed to the AI model.

The ImageOptions interface is defined as follows:

public interface ImageOptions extends ModelOptions {

	Integer getN();

	String getModel();

	Integer getWidth();

	Integer getHeight();

	String getResponseFormat(); // OpenAI: url or base64; Stability AI: byte[] or base64

}

Additionally, every model-specific ImageModel implementation can have its own options that can be passed to the AI model. For example, the OpenAI image generation model has its own options, such as quality and style.

This is a powerful feature that allows developers to set model-specific options when starting the application and then override them at runtime using the ImagePrompt.
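The startup-defaults-then-runtime-override idea can be sketched as a simple merge. This is not the library's implementation: `Options` is a stand-in holding three of the portable fields, and the rule "a non-null runtime value wins" is an assumption made for this example.

```java
class OptionsMergeSketch {

	// Stand-in with three of the portable fields listed above; null means "not set".
	record Options(Integer n, Integer width, Integer height) {}

	// Assumed merge rule for this sketch: a non-null runtime value wins,
	// otherwise the startup default is kept.
	static Options merge(Options defaults, Options runtime) {
		if (runtime == null) {
			return defaults;
		}
		return new Options(
				runtime.n() != null ? runtime.n() : defaults.n(),
				runtime.width() != null ? runtime.width() : defaults.width(),
				runtime.height() != null ? runtime.height() : defaults.height());
	}

	public static void main(String[] args) {
		Options startupDefaults = new Options(1, 1024, 1024);
		Options perRequest = new Options(4, null, null); // only the image count is overridden
		System.out.println(merge(startupDefaults, perRequest));
	}
}
```

Keeping unset fields as null is what makes a partial override possible: a request that only changes the image count leaves the configured width and height untouched.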

ImageResponse

The structure of the ImageResponse class is as follows:

public class ImageResponse implements ModelResponse<ImageGeneration> {

	private final ImageResponseMetadata imageResponseMetadata;

	private final List<ImageGeneration> imageGenerations;

	@Override
	public ImageGeneration getResult() {
		// get the first result
	}

	@Override
	public List<ImageGeneration> getResults() {...}

	@Override
	public ImageResponseMetadata getMetadata() {...}

	// other methods omitted

}

The ImageResponse class holds the AI model’s output. Each ImageGeneration instance contains one of potentially multiple outputs resulting from a single prompt.

The ImageResponse class also carries an ImageResponseMetadata object holding metadata about the AI model’s response.

ImageGeneration

Finally, the ImageGeneration class implements ModelResult to represent the output image and the related metadata about this result:

public class ImageGeneration implements ModelResult<Image> {

	private ImageGenerationMetadata imageGenerationMetadata;

	private Image image;

	@Override
	public Image getOutput() {...}

	@Override
	public ImageGenerationMetadata getMetadata() {...}

	// other methods omitted

}
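The relationship between a response, its generations, and each output image can be sketched as follows. The records are stand-ins, not the library classes: the `url` field on `Image`, the `urls` helper, and returning null for an empty response are all assumptions of this example.

```java
import java.util.List;

class ResponseWalkSketch {

	// Stand-ins only. The url field on Image is an assumption of this sketch.
	record Image(String url) {}

	record ImageGeneration(Image output) {}

	record ImageResponse(List<ImageGeneration> results) {
		// Mirrors the "get the first result" comment in the ImageResponse listing;
		// returning null for an empty response is this sketch's choice.
		ImageGeneration result() {
			return results.isEmpty() ? null : results.get(0);
		}
	}

	// Collects the URL of every generated image, in response order.
	static List<String> urls(ImageResponse response) {
		return response.results().stream().map(g -> g.output().url()).toList();
	}

	public static void main(String[] args) {
		ImageResponse response = new ImageResponse(List.of(
				new ImageGeneration(new Image("https://example.invalid/1.png")),
				new ImageGeneration(new Image("https://example.invalid/2.png"))));
		System.out.println(response.result().output().url());
		System.out.println(urls(response));
	}
}
```

When a prompt requests several images, iterating over all results rather than reading only the first one avoids silently discarding the extra outputs.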

Available Implementations

ImageModel implementations are provided for the following model providers:

API Docs

You can find the Javadoc here.

Feedback and Contributions

The project’s GitHub Discussions page is a great place to send feedback.