VertexAI Gemini Chat

The Vertex AI Gemini API allows developers to build generative AI applications using the Gemini models. The Vertex AI Gemini API supports multimodal prompts as input and text or code as output. A multimodal model is a model that is capable of processing information from multiple modalities, including images, videos, and text. For example, you can send the model a photo of a plate of cookies and ask it to give you a recipe for those cookies.

Gemini is a family of generative AI models developed by Google DeepMind and designed for multimodal use cases. The Gemini API gives you access to the Gemini 1.0 Pro Vision and Gemini 1.0 Pro models. For specifications of the Vertex AI Gemini API models, see the Model information page and the Gemini API Reference.

Prerequisites

Set up your Java Development Environment. Authenticate by running the following command. Replace PROJECT_ID with your Google Cloud project ID and ACCOUNT with your Google Cloud username.

gcloud config set project PROJECT_ID &&
gcloud auth application-default login ACCOUNT

Auto-configuration

Spring AI provides Spring Boot auto-configuration for the VertexAI Gemini Chat Client. To enable it, add the following dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-gemini-spring-boot-starter</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-gemini-spring-boot-starter'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Chat Properties

The prefix spring.ai.vertex.ai.gemini is used as the property prefix that lets you connect to VertexAI.

| Property | Description | Default |
| --- | --- | --- |
| spring.ai.vertex.ai.gemini.projectId | Google Cloud Platform project ID | - |
| spring.ai.vertex.ai.gemini.location | Region | - |
| spring.ai.vertex.ai.gemini.credentialsUri | URI to Vertex AI Gemini credentials. When provided, it is used to create a GoogleCredentials instance to authenticate with VertexAI. | - |

The prefix spring.ai.vertex.ai.gemini.chat is the property prefix that lets you configure the chat client implementation for VertexAI Gemini Chat.

| Property | Description | Default |
| --- | --- | --- |
| spring.ai.vertex.ai.gemini.chat.options.model | The Vertex AI Gemini Chat model to use | gemini-pro-vision |
| spring.ai.vertex.ai.gemini.chat.options.temperature | Controls the randomness of the output. Values can range over [0.0, 1.0], inclusive. A value closer to 1.0 produces responses that are more varied, while a value closer to 0.0 typically results in less surprising responses from the model. This value specifies the default used by the backend when calling the model. | 0.8 |
| spring.ai.vertex.ai.gemini.chat.options.topK | The maximum number of tokens to consider when sampling. The model uses combined top-k and nucleus sampling. Top-k sampling considers the set of topK most probable tokens. | - |
| spring.ai.vertex.ai.gemini.chat.options.topP | The maximum cumulative probability of tokens to consider when sampling. The model uses combined top-k and nucleus sampling. Nucleus sampling considers the smallest set of tokens whose probability sum is at least topP. | - |
| spring.ai.vertex.ai.gemini.chat.options.candidateCount | The number of generated response messages to return. This value must be between [1, 8], inclusive. Defaults to 1. | - |
| spring.ai.vertex.ai.gemini.chat.options.maxOutputTokens | The maximum number of tokens to generate. | - |
| spring.ai.vertex.ai.gemini.chat.options.frequencyPenalty |  | - |
| spring.ai.vertex.ai.gemini.chat.options.presencePenalty |  | - |
| spring.ai.vertex.ai.gemini.chat.options.functions | List of functions, identified by their names, to enable for function calling in a single prompt request. Functions with those names must exist in the functionCallbacks registry. | - |
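
For example, a minimal application.properties fragment that sets a few of these options (the values are illustrative only; the kebab-case keys rely on Spring Boot's relaxed binding, just like the project-id sample later on this page):

spring.ai.vertex.ai.gemini.chat.options.model=gemini-pro-vision
spring.ai.vertex.ai.gemini.chat.options.temperature=0.5
spring.ai.vertex.ai.gemini.chat.options.max-output-tokens=256
spring.ai.vertex.ai.gemini.chat.options.top-p=0.9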

All properties prefixed with spring.ai.vertex.ai.gemini.chat.options can be overridden at runtime by adding request-specific Runtime Options to the Prompt call.

Runtime options

VertexAiGeminiChatOptions.java provides the model configurations, such as the temperature, the topK, and so on.

On start-up, the default options can be configured with the VertexAiGeminiChatClient(api, options) constructor or the spring.ai.vertex.ai.gemini.chat.options.* properties.

At runtime, you can override the default options by adding new, request-specific options to the Prompt call. For example, to override the default temperature for a specific request:

ChatResponse response = chatClient.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        VertexAiGeminiChatOptions.builder()
            .withTemperature(0.4)
            .build()
    ));

In addition to the model-specific VertexAiGeminiChatOptions, you can use a portable ChatOptions instance, created with ChatOptionsBuilder#builder().
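
For illustration, a minimal sketch using the portable builder (assuming the ChatOptionsBuilder API of this Spring AI version, with a Float-valued temperature):

ChatResponse response = chatClient.call(
    new Prompt(
        "Generate the names of 5 famous pirates.",
        // Portable options: only generic settings, no Gemini-specific ones.
        ChatOptionsBuilder.builder()
            .withTemperature(0.4f)
            .build()
    ));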

Function Calling

You can register custom Java functions with the VertexAiGeminiChatClient and have the Gemini Pro model intelligently choose to output a JSON object containing arguments to call one or many of the registered functions. This is a powerful technique to connect the LLM capabilities with external tools and APIs. Read more about Vertex AI Gemini Function Calling.
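
As a rough sketch of how this can be wired up (MockWeatherService, its request/response records, and the "CurrentWeather" name are made-up placeholders; the FunctionCallbackWrapper builder and the withFunction option come from Spring AI's function-calling support and are assumed to match this version):

// Hypothetical function the model can choose to call; names and types are placeholders.
public static class MockWeatherService
        implements Function<MockWeatherService.Request, MockWeatherService.Response> {

    public record Request(String location) {}
    public record Response(double temperatureCelsius) {}

    @Override
    public Response apply(Request request) {
        return new Response(22.0); // canned value for illustration
    }
}

// Register the function as a FunctionCallback bean; the name is how prompts refer to it.
@Bean
public FunctionCallback weatherFunctionInfo() {
    return FunctionCallbackWrapper.builder(new MockWeatherService())
        .withName("CurrentWeather")
        .withDescription("Get the current weather for a location")
        .build();
}

// Enable the registered function for a single request via the chat options.
ChatResponse response = chatClient.call(
    new Prompt("What is the current weather in Paris?",
        VertexAiGeminiChatOptions.builder()
            .withFunction("CurrentWeather")
            .build()));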

Multimodal

Multimodality refers to a model’s ability to simultaneously understand and process information from various sources, including text, images, audio, and other data formats. This paradigm represents a significant advancement in AI models.

Google’s Gemini AI models support this capability by comprehending and integrating text, code, audio, images, and video. For more details, refer to the blog post Introducing Gemini.

Spring AI’s Message interface supports multimodal AI models by introducing the Media type. This type contains data and information about media attachments in messages, using Spring’s org.springframework.util.MimeType and a java.lang.Object for the raw media data.

Below is a simple code example extracted from VertexAiGeminiChatClientIT.java, demonstrating the combination of user text with an image.

byte[] data = new ClassPathResource("/vertex-test.png").getContentAsByteArray();

var userMessage = new UserMessage("Explain what do you see on this picture?",
        List.of(new Media(MimeTypeUtils.IMAGE_PNG, data)));

ChatResponse response = chatClient.call(new Prompt(List.of(userMessage)));

Sample Controller

Create a new Spring Boot project and add the spring-ai-vertex-ai-gemini-spring-boot-starter to your pom (or gradle) dependencies.

src/main/resources 目录下添加一个 application.properties 文件,以启用并配置 VertexAi Chat 客户端:

Add an application.properties file, under the src/main/resources directory, to enable and configure the VertexAi Gemini Chat client:

spring.ai.vertex.ai.gemini.project-id=PROJECT_ID
spring.ai.vertex.ai.gemini.location=LOCATION
spring.ai.vertex.ai.gemini.chat.options.model=gemini-pro-vision
spring.ai.vertex.ai.gemini.chat.options.temperature=0.5

api-key 替换为你的 VertexAI 凭据。

Replace PROJECT_ID and LOCATION with your Google Cloud project ID and region.

This will create a VertexAiGeminiChatClient implementation that you can inject into your class. Here is an example of a simple @Controller class that uses the chat client for text generations.

@RestController
public class ChatController {

    private final VertexAiGeminiChatClient chatClient;

    @Autowired
    public ChatController(VertexAiGeminiChatClient chatClient) {
        this.chatClient = chatClient;
    }

    @GetMapping("/ai/generate")
    public Map generate(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        return Map.of("generation", chatClient.call(message));
    }

    @GetMapping("/ai/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        Prompt prompt = new Prompt(new UserMessage(message));
        return chatClient.stream(prompt);
    }
}

Manual Configuration

The VertexAiGeminiChatClient implements the ChatClient and uses the VertexAI to connect to the Vertex AI Gemini service.

Add the spring-ai-vertex-ai-gemini dependency to your project’s Maven pom.xml file:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-vertex-ai-gemini</artifactId>
</dependency>

or to your Gradle build.gradle build file.

dependencies {
    implementation 'org.springframework.ai:spring-ai-vertex-ai-gemini'
}

Refer to the Dependency Management section to add the Spring AI BOM to your build file.

Next, create a VertexAiGeminiChatClient and use it for text generations:

VertexAI vertexApi = new VertexAI(projectId, location);

var chatClient = new VertexAiGeminiChatClient(vertexApi,
    VertexAiGeminiChatOptions.builder()
        .withTemperature(0.4)
        .build());

ChatResponse response = chatClient.call(
    new Prompt("Generate the names of 5 famous pirates."));

The VertexAiGeminiChatOptions provides the configuration information for the chat requests. The VertexAiGeminiChatOptions.Builder is a fluent options builder.
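
For instance, a sketch that sets a few of the options documented in the chat properties table (the values are illustrative; the setter names are assumed to mirror the property names, and the temperature is written as a float literal on the assumption of a Float-valued setter):

var options = VertexAiGeminiChatOptions.builder()
    .withTemperature(0.4f)        // assumed Float-valued setter
    .withMaxOutputTokens(500)     // cap on the number of generated tokens
    .build();

ChatResponse response = chatClient.call(
    new Prompt("Generate the names of 5 famous pirates.", options));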