OpenAI Transcriptions
Spring AI supports OpenAI’s Transcription model.
Prerequisites
You will need to create an API key with OpenAI to access ChatGPT models. Create an account at the OpenAI signup page and generate the token on the API Keys page. The Spring AI project defines a configuration property named spring.ai.openai.api-key that you should set to the value of the API Key obtained from openai.com. Exporting an environment variable is one way to set that configuration property:
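With Spring Boot’s relaxed binding, the spring.ai.openai.api-key property maps to the SPRING_AI_OPENAI_API_KEY environment variable (the placeholder below is illustrative):

export SPRING_AI_OPENAI_API_KEY=<INSERT KEY HERE>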
Auto-configuration
Spring AI provides Spring Boot auto-configuration for the OpenAI Audio Transcription Client. To enable it, add the following dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-openai-spring-boot-starter'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Transcription Properties
The prefix spring.ai.openai.audio.transcription is used as the property prefix that lets you configure the OpenAI transcription client.
| Property | Description | Default |
|---|---|---|
| spring.ai.openai.audio.transcription.options.model | ID of the model to use. Only whisper-1 (which is powered by our open source Whisper V2 model) is currently available. | whisper-1 |
| spring.ai.openai.audio.transcription.options.response-format | The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt. | json |
| spring.ai.openai.audio.transcription.options.prompt | An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language. | |
| spring.ai.openai.audio.transcription.options.language | The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency. | |
| spring.ai.openai.audio.transcription.options.temperature | The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | 0 |
| spring.ai.openai.audio.transcription.options.timestamp_granularities | The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: word, or segment. Note: there is no additional latency for segment timestamps, but generating word timestamps incurs additional latency. | segment |
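As a quick sketch, these defaults could be set in application.properties via the standard Spring Boot property mechanism (the values shown are illustrative):

spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.audio.transcription.options.model=whisper-1
spring.ai.openai.audio.transcription.options.response-format=json
spring.ai.openai.audio.transcription.options.language=en
spring.ai.openai.audio.transcription.options.temperature=0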
Runtime Options
The OpenAiAudioTranscriptionOptions class provides the options to use when making a transcription. On start-up, the options specified by spring.ai.openai.audio.transcription are used, but you can override these at runtime.
For example:
OpenAiAudioApi.TranscriptResponseFormat responseFormat = OpenAiAudioApi.TranscriptResponseFormat.VTT;

OpenAiAudioTranscriptionOptions transcriptionOptions = OpenAiAudioTranscriptionOptions.builder()
    .withLanguage("en")
    .withPrompt("Ask not this, but ask that")
    .withTemperature(0f)
    .withResponseFormat(responseFormat)
    .build();

AudioTranscriptionPrompt transcriptionRequest = new AudioTranscriptionPrompt(audioFile, transcriptionOptions);
AudioTranscriptionResponse response = openAiTranscriptionClient.call(transcriptionRequest);
Manual Configuration
Add the spring-ai-openai dependency to your project’s Maven pom.xml file:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai</artifactId>
</dependency>
or to your Gradle build.gradle build file:
dependencies {
    implementation 'org.springframework.ai:spring-ai-openai'
}
Refer to the Dependency Management section to add the Spring AI BOM to your build file.
Next, create an OpenAiAudioTranscriptionClient and use it to transcribe an audio file:
var openAiAudioApi = new OpenAiAudioApi(System.getenv("OPENAI_API_KEY"));

var openAiAudioTranscriptionClient = new OpenAiAudioTranscriptionClient(openAiAudioApi);

var transcriptionOptions = OpenAiAudioTranscriptionOptions.builder()
    .withResponseFormat(TranscriptResponseFormat.TEXT)
    .withTemperature(0f)
    .build();

var audioFile = new FileSystemResource("/path/to/your/resource/speech/jfk.flac");

AudioTranscriptionPrompt transcriptionRequest = new AudioTranscriptionPrompt(audioFile, transcriptionOptions);
AudioTranscriptionResponse response = openAiAudioTranscriptionClient.call(transcriptionRequest);
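The transcribed text can then be read from the response. The getResult().getOutput() accessors below are our assumption about the response API and may differ across Spring AI versions:

// Assumption: AudioTranscriptionResponse exposes the transcript via getResult().getOutput()
String transcript = response.getResult().getOutput();
System.out.println(transcript);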
Example Code
- The OpenAiTranscriptionClientIT.java test provides some general examples of how to use the library.