Batch
- Associating a job execution with the task in which it was executed so that one can be traced back to the other
- Remote partitioning through Spring Cloud Deployer, which launches and configures Spring Boot Uber-jars for remote batch work
- Notes on deploying partitioned applications to the Kubernetes platform
- Batch informational messages and batch job exit code handling
This section goes into more detail about Spring Cloud Task’s integration with Spring Batch. Tracking the association between a job execution and the task in which it was executed as well as remote partitioning through Spring Cloud Deployer are covered in this section.
Associating a Job Execution to the Task in which It Was Executed
Spring Boot provides facilities for the execution of batch jobs within a Spring Boot Uber-jar. Spring Boot’s support of this functionality lets a developer execute multiple batch jobs within that execution. Spring Cloud Task provides the ability to associate the execution of a job (a job execution) with a task’s execution so that one can be traced back to the other.
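For reference, the sketch below shows the kind of application this applies to: a Spring Boot Uber-jar that is both a task (via @EnableTask) and a Spring Batch application with a Job bean in its context. The class name, the job1/step1 bean names, and the Spring Batch 4-style builder factories are illustrative assumptions rather than part of the original documentation.

@EnableTask
@EnableBatchProcessing
@SpringBootApplication
public class BatchTaskApplication {

    // With spring-cloud-task-batch on the classpath, the listener described below
    // is auto-configured and attached to this job, recording the link between the
    // job execution and the task execution.
    @Bean
    public Job job1(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
        return jobBuilderFactory.get("job1")
                .start(stepBuilderFactory.get("step1")
                        .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED)
                        .build())
                .build();
    }

    public static void main(String[] args) {
        SpringApplication.run(BatchTaskApplication.class, args);
    }
}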
Spring Cloud Task achieves this functionality by using the TaskBatchExecutionListener. By default, this listener is auto-configured in any context that has both a Spring Batch Job configured (by having a bean of type Job defined in the context) and the spring-cloud-task-batch jar on the classpath. The listener is injected into all jobs that meet those conditions.
Overriding the TaskBatchExecutionListener
To prevent the listener from being injected into any batch jobs within the current context, you can disable the autoconfiguration by using standard Spring Boot mechanisms.
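For example, a minimal sketch of disabling it is to exclude the auto-configuration class on the application class. TaskBatchAutoConfiguration is assumed here to be the auto-configuration class provided by spring-cloud-task-batch; verify the class name against the version you use.

// Excludes the batch/task listener auto-configuration for the whole application.
@SpringBootApplication(exclude = TaskBatchAutoConfiguration.class)
public class NoListenerApplication {

    public static void main(String[] args) {
        SpringApplication.run(NoListenerApplication.class, args);
    }
}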
To only have the listener injected into particular jobs within the context, override the batchTaskExecutionListenerBeanPostProcessor and provide a list of job bean IDs, as shown in the following example:
public static TaskBatchExecutionListenerBeanPostProcessor batchTaskExecutionListenerBeanPostProcessor() {
    TaskBatchExecutionListenerBeanPostProcessor postProcessor =
            new TaskBatchExecutionListenerBeanPostProcessor();

    postProcessor.setJobNames(Arrays.asList(new String[] {"job1", "job2"}));

    return postProcessor;
}
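Because the bean produced here is a BeanPostProcessor, declaring the factory method static (as in the preceding example) lets Spring register the post-processor without prematurely initializing its enclosing configuration class.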
You can find a sample batch application in the samples module of the Spring Cloud Task Project, here.
Remote Partitioning
Spring Cloud Deployer provides facilities for launching Spring Boot-based applications on most cloud infrastructures. The DeployerPartitionHandler and DeployerStepExecutionHandler delegate the launching of worker step executions to Spring Cloud Deployer.
To configure the DeployerPartitionHandler, you must provide a Resource representing the Spring Boot Uber-jar to be executed, a TaskLauncher, and a JobExplorer. You can configure any environment properties as well as the maximum number of workers to be executing at once, the interval to poll for results (defaults to 10 seconds), and a timeout (defaults to -1, or no timeout). The following example shows how configuring this PartitionHandler might look:
@Bean
public PartitionHandler partitionHandler(TaskLauncher taskLauncher,
        JobExplorer jobExplorer) throws Exception {

    MavenProperties mavenProperties = new MavenProperties();
    mavenProperties.setRemoteRepositories(new HashMap<>(Collections.singletonMap("springRepo",
            new MavenProperties.RemoteRepository(repository))));

    Resource resource =
            MavenResource.parse(String.format("%s:%s:%s",
                    "io.spring.cloud",
                    "partitioned-batch-job",
                    "1.1.0.RELEASE"), mavenProperties);

    DeployerPartitionHandler partitionHandler =
            new DeployerPartitionHandler(taskLauncher, jobExplorer, resource, "workerStep");

    List<String> commandLineArgs = new ArrayList<>(3);
    commandLineArgs.add("--spring.profiles.active=worker");
    commandLineArgs.add("--spring.cloud.task.initialize.enable=false");
    commandLineArgs.add("--spring.batch.initializer.enabled=false");

    partitionHandler.setCommandLineArgsProvider(
            new PassThroughCommandLineArgsProvider(commandLineArgs));
    partitionHandler.setEnvironmentVariablesProvider(new NoOpEnvironmentVariablesProvider());
    partitionHandler.setMaxWorkers(2);
    partitionHandler.setApplicationName("PartitionedBatchJobTask");

    return partitionHandler;
}
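For context, the sketch below shows one way the manager step might reference this PartitionHandler. The managerStep name, the injected Partitioner bean, and the Spring Batch 4-style StepBuilderFactory are illustrative assumptions:

@Bean
public Step managerStep(StepBuilderFactory stepBuilderFactory,
        PartitionHandler partitionHandler, Partitioner partitioner) {
    // Splits the step into partitions via the Partitioner and hands each partition's
    // StepExecution to the DeployerPartitionHandler to launch remotely.
    return stepBuilderFactory.get("managerStep")
            .partitioner("workerStep", partitioner)
            .partitionHandler(partitionHandler)
            .build();
}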
When passing environment variables to partitions, each partition may be on a different machine with different environment settings. Consequently, you should pass only those environment variables that are required.
Notice in the example above that we have set the maximum number of workers to 2. Setting the maximum number of workers establishes the maximum number of partitions that should be running at one time.
The Resource to be executed is expected to be a Spring Boot Uber-jar with a DeployerStepExecutionHandler configured as a CommandLineRunner in the current context. The repository enumerated in the preceding example should be the remote repository in which the Spring Boot Uber-jar is located. Both the manager and worker are expected to have visibility into the same data store being used as the job repository and task repository. Once the underlying infrastructure has bootstrapped the Spring Boot jar and Spring Boot has launched the DeployerStepExecutionHandler, the step handler executes the requested Step. The following example shows how to configure the DeployerStepExecutionHandler:
@Bean
public DeployerStepExecutionHandler stepExecutionHandler(JobExplorer jobExplorer) {
    DeployerStepExecutionHandler handler =
            new DeployerStepExecutionHandler(this.context, jobExplorer, this.jobRepository);

    return handler;
}
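On the worker side, the Step that the handler is asked to execute must also be present in the context. The sketch below shows one way that might look; the worker profile matches the --spring.profiles.active=worker argument passed by the manager earlier, while the tasklet body and the StepBuilderFactory style are illustrative assumptions:

@Bean
@Profile("worker")
public Step workerStep(StepBuilderFactory stepBuilderFactory) {
    // The step name must match the step name passed to the DeployerPartitionHandler
    // ("workerStep" in the earlier example).
    return stepBuilderFactory.get("workerStep")
            .tasklet((contribution, chunkContext) -> {
                // Each remote worker executes this tasklet for its partition.
                return RepeatStatus.FINISHED;
            })
            .build();
}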
You can find a sample remote partition application in the samples module of the Spring Cloud Task project, here.
Asynchronously launch remote batch partitions
By default, batch partitions are launched sequentially. However, in some cases this may affect performance, as each launch blocks until the resource (for example, a pod in Kubernetes) is provisioned. In these cases, you can provide a ThreadPoolTaskExecutor to the DeployerPartitionHandler. The remote batch partitions are then launched based on the configuration of the ThreadPoolTaskExecutor. For example:
@Bean
public ThreadPoolTaskExecutor threadPoolTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(4);
    executor.setThreadNamePrefix("default_task_executor_thread");
    executor.setWaitForTasksToCompleteOnShutdown(true);
    executor.initialize();
    return executor;
}

@Bean
public PartitionHandler partitionHandler(TaskLauncher taskLauncher, JobExplorer jobExplorer,
        TaskRepository taskRepository, ThreadPoolTaskExecutor executor) throws Exception {
    Resource resource = this.resourceLoader
            .getResource("maven://io.spring.cloud:partitioned-batch-job:2.2.0.BUILD-SNAPSHOT");

    DeployerPartitionHandler partitionHandler =
            new DeployerPartitionHandler(taskLauncher, jobExplorer, resource,
                    "workerStep", taskRepository, executor);
    ...
}
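Note that setWaitForTasksToCompleteOnShutdown(true) in the executor configuration above tells the executor to let already-submitted partition launches complete, rather than interrupting them, when it is shut down.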
We need to close the context, since the use of a ThreadPoolTaskExecutor leaves the context open.
Notes on Developing a Batch-partitioned application for the Kubernetes Platform
- When deploying partitioned apps on the Kubernetes platform, you must use the following dependency for the Spring Cloud Kubernetes Deployer:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-deployer-kubernetes</artifactId>
</dependency>
- The application name for the task application and its partitions must follow the regex pattern [a-z0-9]([-a-z0-9]*[a-z0-9]). Otherwise, an exception is thrown.
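For example, a name such as partitionedbatchjobtask satisfies this pattern, whereas a mixed-case name such as PartitionedBatchJobTask (used with setApplicationName in the earlier, non-Kubernetes example) does not.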
Batch Informational Messages
Spring Cloud Task provides the ability for batch jobs to emit informational messages. The “Spring Batch Events” section covers this feature in detail.
Batch Job Exit Codes
As discussed earlier, Spring Cloud Task applications support the ability to record the exit code of a task execution. However, in cases where you run a Spring Batch Job within a task, regardless of how the Batch Job Execution completes, the result of the task is always zero when using the default Batch/Boot behavior. Keep in mind that a task is a boot application and that the exit code returned from the task is the same as a boot application. To override this behavior and allow the task to return an exit code other than zero when a batch job returns a BatchStatus of FAILED, set spring.cloud.task.batch.fail-on-job-failure to true. Then the exit code can be 1 (the default) or be based on the specified ExitCodeGenerator.
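For example, launching the task with --spring.cloud.task.batch.fail-on-job-failure=true on the command line (or setting the equivalent entry in application.properties) enables this behavior.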
This functionality uses a new ApplicationRunner that replaces the one provided by Spring Boot. By default, it is configured with the same order. However, if you want to customize the order in which the ApplicationRunner is run, you can set its order by setting the spring.cloud.task.batch.applicationRunnerOrder property. To have your task return the exit code based on the result of the batch job execution, you need to write your own CommandLineRunner.
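The sketch below shows what such a runner might look like; it is not the framework's own implementation. It launches a single job and exposes a non-zero exit code when the job fails. The class name, the run.id parameter, and the mapping of FAILED to exit code 1 are illustrative assumptions:

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.ExitCodeGenerator;
import org.springframework.stereotype.Component;

@Component
public class JobExitCodeRunner implements CommandLineRunner, ExitCodeGenerator {

    private final JobLauncher jobLauncher;
    private final Job job;
    private int exitCode = 0;

    public JobExitCodeRunner(JobLauncher jobLauncher, Job job) {
        this.jobLauncher = jobLauncher;
        this.job = job;
    }

    @Override
    public void run(String... args) throws Exception {
        // Launch the job with a unique parameter so that each task execution
        // produces a new job instance.
        JobExecution execution = this.jobLauncher.run(this.job,
                new JobParametersBuilder()
                        .addLong("run.id", System.currentTimeMillis())
                        .toJobParameters());

        // Map a failed job execution to a non-zero exit code.
        this.exitCode = execution.getStatus() == BatchStatus.FAILED ? 1 : 0;
    }

    @Override
    public int getExitCode() {
        return this.exitCode;
    }
}

The exit code produced by an ExitCodeGenerator takes effect when the application's main method calls System.exit(SpringApplication.exit(context)). When launching the job yourself in this way, you would also typically set spring.batch.job.enabled=false so that Spring Boot's default job launching does not run the job a second time.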