Single Step Batch Job Starter

With the starter described in this document, you can create a Spring Batch Job with a single Step through configuration. The starter provides autoconfiguration for the supported ItemReader and ItemWriter implementations as well as for a complete single-step Spring Batch Job.

This section goes into how to develop a Spring Batch Job with a single Step by using the starter included in Spring Cloud Task. This starter lets you use configuration to define an ItemReader, an ItemWriter, or a full single-step Spring Batch Job. For more about Spring Batch and its capabilities, see the Spring Batch documentation.

To obtain the starter for Maven, add the following to your build:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-single-step-batch-job</artifactId>
    <version>2.3.0</version>
</dependency>

To obtain the starter for Gradle, add the following to your build:

compile "org.springframework.cloud:spring-cloud-starter-single-step-batch-job:2.3.0"

Defining a Job

You can use the starter to define as little as an ItemReader or an ItemWriter or as much as a full Job. In this section, we describe the properties you must define to configure a Job.

Properties

To begin, the starter provides a set of properties that let you configure the basics of a Job with one Step:

Table 1. Job Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.jobName | String | null | The name of the job. |
| spring.batch.job.stepName | String | null | The name of the step. |
| spring.batch.job.chunkSize | Integer | null | The number of items to be processed per transaction. |

With the above properties configured, you have a job with a single, chunk-based step. This chunk-based step reads, processes, and writes Map<String, Object> instances as the items. However, the step does not yet do anything. You need to configure an ItemReader, an optional ItemProcessor, and an ItemWriter to give it something to do. To configure one of these, you can either use properties to configure one of the options for which autoconfiguration is provided, or you can configure your own by using the standard Spring configuration mechanisms.

If you configure your own, the input and output types must match the others in the step. The ItemReader implementations and ItemWriter implementations in this starter all use a Map<String, Object> as the input and the output item.
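
As a concrete illustration, the following application.properties fragment (a minimal sketch; the job name, step name, and chunk size shown here are arbitrary placeholders) sets the Job-level properties from Table 1:

# placeholder values; replace with names that make sense for your job
spring.batch.job.jobName=fooJob
spring.batch.job.stepName=fooStep
spring.batch.job.chunkSize=100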

Autoconfiguration for ItemReader Implementations

This starter provides autoconfiguration for four different ItemReader implementations: AmqpItemReader, FlatFileItemReader, JdbcCursorItemReader, and KafkaItemReader. In this section, we outline how to configure each of these by using the provided autoconfiguration.

AmqpItemReader

You can read from a queue or topic with AMQP by using the AmqpItemReader. The autoconfiguration for this ItemReader implementation is dependent upon two sets of configuration. The first is the configuration of an AmqpTemplate. You can either configure this yourself or use the autoconfiguration provided by Spring Boot. See the Spring Boot AMQP documentation. Once you have configured the AmqpTemplate, you can enable the batch capabilities to support it by setting the following properties:

Table 2. AmqpItemReader Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.amqpitemreader.enabled | boolean | false | If true, the autoconfiguration will execute. |
| spring.batch.job.amqpitemreader.jsonConverterEnabled | boolean | true | Indicates if the Jackson2JsonMessageConverter should be registered to parse messages. |

For more information, see the AmqpItemReader documentation.
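
As a minimal sketch, assuming the AmqpTemplate is already provided by Spring Boot's AMQP autoconfiguration, enabling the reader through application.properties could look like this:

# enables the AmqpItemReader autoconfiguration
spring.batch.job.amqpitemreader.enabled=true
# registers the Jackson2JsonMessageConverter (this is already the default)
spring.batch.job.amqpitemreader.jsonConverterEnabled=true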

FlatFileItemReader

FlatFileItemReader lets you read from flat files (such as CSVs and other file formats). To read from a file, you can provide some components yourself through normal Spring configuration (LineTokenizer, RecordSeparatorPolicy, FieldSetMapper, LineMapper, or SkippedLinesCallback). You can also use the following properties to configure the reader:

Table 3. FlatFileItemReader Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.flatfileitemreader.saveState | boolean | true | Determines if the state should be saved for restarts. |
| spring.batch.job.flatfileitemreader.name | String | null | Name used to provide unique keys in the ExecutionContext. |
| spring.batch.job.flatfileitemreader.maxItemcount | int | Integer.MAX_VALUE | Maximum number of items to be read from the file. |
| spring.batch.job.flatfileitemreader.currentItemCount | int | 0 | Number of items that have already been read. Used on restarts. |
| spring.batch.job.flatfileitemreader.comments | List<String> | empty List | A list of Strings that indicate commented lines (lines to be ignored) in the file. |
| spring.batch.job.flatfileitemreader.resource | Resource | null | The resource to be read. |
| spring.batch.job.flatfileitemreader.strict | boolean | true | If set to true, the reader throws an exception if the resource is not found. |
| spring.batch.job.flatfileitemreader.encoding | String | FlatFileItemReader.DEFAULT_CHARSET | Encoding to be used when reading the file. |
| spring.batch.job.flatfileitemreader.linesToSkip | int | 0 | Indicates the number of lines to skip at the start of a file. |
| spring.batch.job.flatfileitemreader.delimited | boolean | false | Indicates whether the file is a delimited file (CSV and other formats). Only one of this property or spring.batch.job.flatfileitemreader.fixedLength can be true at the same time. |
| spring.batch.job.flatfileitemreader.delimiter | String | DelimitedLineTokenizer.DELIMITER_COMMA | If reading a delimited file, indicates the delimiter to parse on. |
| spring.batch.job.flatfileitemreader.quoteCharacter | char | DelimitedLineTokenizer.DEFAULT_QUOTE_CHARACTER | Used to determine the character used to quote values. |
| spring.batch.job.flatfileitemreader.includedFields | List<Integer> | empty list | A list of indices to determine which fields in a record to include in the item. |
| spring.batch.job.flatfileitemreader.fixedLength | boolean | false | Indicates if a file's records are parsed by column numbers. Only one of this property or spring.batch.job.flatfileitemreader.delimited can be true at the same time. |
| spring.batch.job.flatfileitemreader.ranges | List<Range> | empty list | List of column ranges by which to parse a fixed width record. See the Range documentation. |
| spring.batch.job.flatfileitemreader.names | String[] | null | List of names for each field parsed from a record. These names are the keys in the Map<String, Object> in the items returned from this ItemReader. |
| spring.batch.job.flatfileitemreader.parsingStrict | boolean | true | If set to true, the mapping fails if the fields cannot be mapped. |
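
As a sketch, the following application.properties fragment configures the reader for a comma-delimited CSV; the reader name, file location, and field names are placeholders chosen for illustration:

spring.batch.job.flatfileitemreader.name=fooReader
# placeholder input file on the classpath
spring.batch.job.flatfileitemreader.resource=classpath:input.csv
spring.batch.job.flatfileitemreader.delimited=true
# these names become the keys in the Map<String, Object> items returned by the reader
spring.batch.job.flatfileitemreader.names=item_name,item_value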

JdbcCursorItemReader

The JdbcCursorItemReader runs a query against a relational database and iterates over the resulting cursor (ResultSet) to provide the resulting items. This autoconfiguration lets you provide a PreparedStatementSetter, a RowMapper, or both. You can also use the following properties to configure a JdbcCursorItemReader:

Table 4. JdbcCursorItemReader Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.jdbccursoritemreader.saveState | boolean | true | Determines whether the state should be saved for restarts. |
| spring.batch.job.jdbccursoritemreader.name | String | null | Name used to provide unique keys in the ExecutionContext. |
| spring.batch.job.jdbccursoritemreader.maxItemcount | int | Integer.MAX_VALUE | Maximum number of items to be read. |
| spring.batch.job.jdbccursoritemreader.currentItemCount | int | 0 | Number of items that have already been read. Used on restarts. |
| spring.batch.job.jdbccursoritemreader.fetchSize | int |  | A hint to the driver to indicate how many records to retrieve per call to the database system. For best performance, you usually want to set it to match the chunk size. |
| spring.batch.job.jdbccursoritemreader.maxRows | int |  | Maximum number of items to read from the database. |
| spring.batch.job.jdbccursoritemreader.queryTimeout | int |  | Number of milliseconds for the query to timeout. |
| spring.batch.job.jdbccursoritemreader.ignoreWarnings | boolean | true | Determines whether the reader should ignore SQL warnings when processing. |
| spring.batch.job.jdbccursoritemreader.verifyCursorPosition | boolean | true | Indicates whether the cursor's position should be verified after each read to verify that the RowMapper did not advance the cursor. |
| spring.batch.job.jdbccursoritemreader.driverSupportsAbsolute | boolean | false | Indicates whether the driver supports absolute positioning of a cursor. |
| spring.batch.job.jdbccursoritemreader.useSharedExtendedConnection | boolean | false | Indicates whether the connection is shared with other processing (and is therefore part of a transaction). |
| spring.batch.job.jdbccursoritemreader.sql | String | null | SQL query from which to read. |
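
As a sketch, the following application.properties fragment configures the reader with a query; the reader name and the table and column names are placeholders:

spring.batch.job.jdbccursoritemreader.name=fooReader
# placeholder query against a hypothetical item table
spring.batch.job.jdbccursoritemreader.sql=SELECT item_name, item_value FROM item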

You can also specify a JDBC DataSource specifically for the reader by using the following properties:

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.jdbccursoritemreader.datasource.enable | boolean | false | Determines whether the JdbcCursorItemReader DataSource should be enabled. |
| jdbccursoritemreader.datasource.url | String | null | JDBC URL of the database. |
| jdbccursoritemreader.datasource.username | String | null | Login username of the database. |
| jdbccursoritemreader.datasource.password | String | null | Login password of the database. |
| jdbccursoritemreader.datasource.driver-class-name | String | null | Fully qualified name of the JDBC driver. |

If the jdbccursoritemreader.datasource properties are not specified, the JdbcCursorItemReader uses the default DataSource.
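
A sketch of a reader-specific DataSource, using the property keys exactly as listed in the table above (the URL, credentials, and driver shown are placeholders):

spring.batch.job.jdbccursoritemreader.datasource.enable=true
# placeholder connection settings
jdbccursoritemreader.datasource.url=jdbc:postgresql://localhost:5432/foo
jdbccursoritemreader.datasource.username=fooUser
jdbccursoritemreader.datasource.password=fooPassword
jdbccursoritemreader.datasource.driver-class-name=org.postgresql.Driver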

KafkaItemReader

Ingesting a partition of data from a Kafka topic is useful and exactly what the KafkaItemReader can do. To configure a KafkaItemReader, two pieces of configuration are required. First, configuring Kafka with Spring Boot’s Kafka autoconfiguration is required (see the Spring Boot Kafka documentation). Once you have configured the Kafka properties from Spring Boot, you can configure the KafkaItemReader itself by setting the following properties:

Table 5. KafkaItemReader Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.kafkaitemreader.name | String | null | Name used to provide unique keys in the ExecutionContext. |
| spring.batch.job.kafkaitemreader.topic | String | null | Name of the topic from which to read. |
| spring.batch.job.kafkaitemreader.partitions | List<Integer> | empty list | List of partition indices from which to read. |
| spring.batch.job.kafkaitemreader.pollTimeOutInSeconds | long | 30 | Timeout for the poll() operations. |
| spring.batch.job.kafkaitemreader.saveState | boolean | true | Determines whether the state should be saved for restarts. |
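
As a sketch, assuming the Kafka connection itself is configured through Spring Boot's spring.kafka.* properties, the reader could be configured as follows (the reader name, topic, and partition list are placeholders):

spring.batch.job.kafkaitemreader.name=fooReader
# placeholder topic and partitions
spring.batch.job.kafkaitemreader.topic=fooTopic
spring.batch.job.kafkaitemreader.partitions=0,1
spring.batch.job.kafkaitemreader.pollTimeOutInSeconds=30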

Native Compilation

The advantage of Single Step Batch Processing is that it lets you dynamically select which reader and writer beans to use at runtime when you use the JVM. However, when you use native compilation, you must determine the reader and writer at build time instead of at runtime. The following example shows how:

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <id>process-aot</id>
            <goals>
                <goal>process-aot</goal>
            </goals>
            <configuration>
                <jvmArguments>
                    -Dspring.batch.job.flatfileitemreader.name=fooReader
                    -Dspring.batch.job.flatfileitemwriter.name=fooWriter
                </jvmArguments>
            </configuration>
        </execution>
    </executions>
</plugin>

ItemProcessor Configuration

The single-step batch job autoconfiguration accepts an ItemProcessor if one is available within the ApplicationContext. If one is found of the correct type (ItemProcessor<Map<String, Object>, Map<String, Object>>), it is autowired into the step.
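
As a minimal sketch (the configuration class, bean name, and the upper-casing logic are purely illustrative), such a processor could be declared like this:

import java.util.Map;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProcessorConfiguration {

    // Illustrative processor: upper-cases every String value in the item.
    // The autoconfiguration wires it into the step because it matches
    // ItemProcessor<Map<String, Object>, Map<String, Object>>.
    @Bean
    public ItemProcessor<Map<String, Object>, Map<String, Object>> itemProcessor() {
        return item -> {
            item.replaceAll((key, value) ->
                    value instanceof String ? ((String) value).toUpperCase() : value);
            return item;
        };
    }
}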

Autoconfiguration for ItemWriter Implementations

This starter provides autoconfiguration for ItemWriter implementations that match the supported ItemReader implementations: AmqpItemWriter, FlatFileItemWriter, JdbcBatchItemWriter, and KafkaItemWriter. This section covers how to use autoconfiguration to configure a supported ItemWriter.

AmqpItemWriter

To write to a RabbitMQ queue, you need two sets of configuration. First, you need an AmqpTemplate. The easiest way to get this is by using Spring Boot’s RabbitMQ autoconfiguration. See the Spring Boot AMQP documentation.

Once you have configured the AmqpTemplate, you can configure the AmqpItemWriter by setting the following properties:

Table 6. AmqpItemWriter Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.amqpitemwriter.enabled | boolean | false | If true, the autoconfiguration runs. |
| spring.batch.job.amqpitemwriter.jsonConverterEnabled | boolean | true | Indicates whether Jackson2JsonMessageConverter should be registered to convert messages. |
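
As a minimal sketch, assuming the AmqpTemplate is already provided by Spring Boot's AMQP autoconfiguration, the writer can be enabled through application.properties:

# enables the AmqpItemWriter autoconfiguration
spring.batch.job.amqpitemwriter.enabled=true
# registers the Jackson2JsonMessageConverter (this is already the default)
spring.batch.job.amqpitemwriter.jsonConverterEnabled=true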

FlatFileItemWriter

To write a file as the output of the step, you can configure FlatFileItemWriter. The autoconfiguration accepts components that have been explicitly configured (such as a LineAggregator, FieldExtractor, FlatFileHeaderCallback, or FlatFileFooterCallback) as well as components that can be configured by setting the following properties:

Table 7. FlatFileItemWriter Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.flatfileitemwriter.resource | Resource | null | The resource to be written to. |
| spring.batch.job.flatfileitemwriter.delimited | boolean | false | Indicates whether the output file is a delimited file. If true, spring.batch.job.flatfileitemwriter.formatted must be false. |
| spring.batch.job.flatfileitemwriter.formatted | boolean | false | Indicates whether the output file is a formatted file. If true, spring.batch.job.flatfileitemwriter.delimited must be false. |
| spring.batch.job.flatfileitemwriter.format | String | null | The format used to generate the output for a formatted file. The formatting is performed by using String.format. |
| spring.batch.job.flatfileitemwriter.locale | Locale | Locale.getDefault() | The Locale to be used when generating the file. |
| spring.batch.job.flatfileitemwriter.maximumLength | int | 0 | Max length of the record. If 0, the size is unbounded. |
| spring.batch.job.flatfileitemwriter.minimumLength | int | 0 | The minimum record length. |
| spring.batch.job.flatfileitemwriter.delimiter | String | , | The String used to delimit fields in a delimited file. |
| spring.batch.job.flatfileitemwriter.encoding | String | FlatFileItemReader.DEFAULT_CHARSET | Encoding to use when writing the file. |
| spring.batch.job.flatfileitemwriter.forceSync | boolean | false | Indicates whether a file should be force-synced to the disk on flush. |
| spring.batch.job.flatfileitemwriter.names | String[] | null | List of names for each field parsed from a record. These names are the keys in the Map<String, Object> for the items received by this ItemWriter. |
| spring.batch.job.flatfileitemwriter.append | boolean | false | Indicates whether a file should be appended to if the output file is found. |
| spring.batch.job.flatfileitemwriter.lineSeparator | String | FlatFileItemWriter.DEFAULT_LINE_SEPARATOR | What String to use to separate lines in the output file. |
| spring.batch.job.flatfileitemwriter.name | String | null | Name used to provide unique keys in the ExecutionContext. |
| spring.batch.job.flatfileitemwriter.saveState | boolean | true | Determines whether the state should be saved for restarts. |
| spring.batch.job.flatfileitemwriter.shouldDeleteIfEmpty | boolean | false | If set to true, an empty file (there is no output) is deleted when the job completes. |
| spring.batch.job.flatfileitemwriter.shouldDeleteIfExists | boolean | true | If set to true and a file is found where the output file should be, it is deleted before the step begins. |
| spring.batch.job.flatfileitemwriter.transactional | boolean | FlatFileItemWriter.DEFAULT_TRANSACTIONAL | Indicates whether the writer should delay writing to the file buffer if a transaction is active (so that output is written only on commit). |
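
As a sketch, the following application.properties fragment writes a comma-delimited output file; the writer name, output location, and field names are placeholders:

spring.batch.job.flatfileitemwriter.name=fooWriter
# placeholder output location
spring.batch.job.flatfileitemwriter.resource=file:./output.csv
spring.batch.job.flatfileitemwriter.delimited=true
# these names are the keys of each Map<String, Object> item received by the writer
spring.batch.job.flatfileitemwriter.names=item_name,item_value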

JdbcBatchItemWriter

To write the output of a step to a relational database, this starter provides the ability to autoconfigure a JdbcBatchItemWriter. The autoconfiguration lets you provide your own ItemPreparedStatementSetter or ItemSqlParameterSourceProvider and configuration options by setting the following properties:

Table 8. JdbcBatchItemWriter Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.jdbcbatchitemwriter.name | String | null | Name used to provide unique keys in the ExecutionContext. |
| spring.batch.job.jdbcbatchitemwriter.sql | String | null | The SQL used to insert each item. |
| spring.batch.job.jdbcbatchitemwriter.assertUpdates | boolean | true | Whether to verify that every insert results in the update of at least one record. |
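
As a sketch (the table name, column names, and the use of named parameters that match the item Map keys are assumptions made for illustration), the writer could be configured as follows:

spring.batch.job.jdbcbatchitemwriter.name=fooWriter
# placeholder insert statement; named parameters are assumed to bind against the Map keys
spring.batch.job.jdbcbatchitemwriter.sql=INSERT INTO item (item_name, item_value) VALUES (:item_name, :item_value)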

You can also specify a JDBC DataSource specifically for the writer by using the following properties:

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.jdbcbatchitemwriter.datasource.enable | boolean | false | Determines whether the JdbcBatchItemWriter DataSource should be enabled. |
| jdbcbatchitemwriter.datasource.url | String | null | JDBC URL of the database. |
| jdbcbatchitemwriter.datasource.username | String | null | Login username of the database. |
| jdbcbatchitemwriter.datasource.password | String | null | Login password of the database. |
| jdbcbatchitemwriter.datasource.driver-class-name | String | null | Fully qualified name of the JDBC driver. |

If the jdbcbatchitemwriter.datasource properties are not specified, the JdbcBatchItemWriter uses the default DataSource.

KafkaItemWriter

To write step output to a Kafka topic, you need KafkaItemWriter. This starter provides autoconfiguration for a KafkaItemWriter by using facilities from two places. The first is Spring Boot’s Kafka autoconfiguration. (See the Spring Boot Kafka documentation.) Second, this starter lets you configure two properties on the writer.

Table 9. KafkaItemWriter Properties

| Property | Type | Default Value | Description |
|---|---|---|---|
| spring.batch.job.kafkaitemwriter.topic | String | null | The Kafka topic to which to write. |
| spring.batch.job.kafkaitemwriter.delete | boolean | false | Whether the items being passed to the writer are all to be sent as delete events to the topic. |

For more about the configuration options for the KafkaItemWriter, see the KafkaItemWriter documentation.
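
As a sketch, assuming the Kafka connection is configured through Spring Boot's spring.kafka.* properties, the writer only needs the topic (the name shown is a placeholder):

# placeholder topic name
spring.batch.job.kafkaitemwriter.topic=fooTopic
spring.batch.job.kafkaitemwriter.delete=false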

Spring AOT

When using Spring AOT with the Single Step Batch Starter, you must set the reader and writer name properties at compile time (unless you create beans for the reader and/or writer). To do so, include the name of the reader and writer that you wish to use as an argument or environment variable in the Spring Boot Maven plugin or Gradle plugin. For example, if you wish to enable the FlatFileItemReader and FlatFileItemWriter in Maven, it would look like this:

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <id>process-aot</id>
            <goals>
                <goal>process-aot</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <arguments>
            <argument>--spring.batch.job.flatfileitemreader.name=foobar</argument>
            <argument>--spring.batch.job.flatfileitemwriter.name=fooWriter</argument>
        </arguments>
    </configuration>
</plugin>