Spring Batch Introduction
Many applications within the enterprise domain require bulk processing to perform business operations in mission-critical environments. These business operations include:
- Automated, complex processing of large volumes of information that is most efficiently processed without user interaction. These operations typically include time-based events (such as month-end calculations, notices, or correspondence).
- Periodic application of complex business rules processed repetitively across very large data sets (for example, insurance benefit determination or rate adjustments).
- Integration of information that is received from internal and external systems that typically requires formatting, validation, and processing in a transactional manner into the system of record. Batch processing is used to process billions of transactions every day for enterprises.
Spring Batch is a lightweight, comprehensive batch framework designed to enable the development of robust batch applications that are vital for the daily operations of enterprise systems. Spring Batch builds upon the characteristics of the Spring Framework that people have come to expect (productivity, POJO-based development approach, and general ease of use), while making it easy for developers to access and use more advanced enterprise services when necessary. Spring Batch is not a scheduling framework. There are many good enterprise schedulers (such as Quartz, Tivoli, Control-M, and others) available in both the commercial and open source spaces. Spring Batch is intended to work in conjunction with a scheduler rather than replace a scheduler.
Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging and tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced technical services and features that enable extremely high-volume and high performance batch jobs through optimization and partitioning techniques. You can use Spring Batch in both simple use cases (such as reading a file into a database or running a stored procedure) and complex, high volume use cases (such as moving high volumes of data between databases, transforming it, and so on). High-volume batch jobs can use the framework in a highly scalable manner to process significant volumes of information.
Background
While open source software projects and associated communities have focused greater attention on web-based and microservices-based architecture frameworks, there has been a notable lack of focus on reusable architecture frameworks to accommodate Java-based batch processing needs, despite continued needs to handle such processing within enterprise IT environments. The lack of a standard, reusable batch architecture has resulted in the proliferation of many one-off, in-house solutions developed within client enterprise IT functions.
SpringSource (now VMware) and Accenture collaborated to change this. Accenture’s hands-on industry and technical experience in implementing batch architectures, SpringSource’s depth of technical experience, and Spring’s proven programming model together made a natural and powerful partnership to create high-quality, market-relevant software aimed at filling an important gap in enterprise Java. Both companies worked with a number of clients who were solving similar problems by developing Spring-based batch architecture solutions. This input provided some useful additional detail and real-life constraints that helped to ensure the solution can be applied to the real-world problems posed by clients.
Accenture contributed previously proprietary batch processing architecture frameworks to the Spring Batch project, along with committer resources to drive support, enhancements, and the existing feature set. Accenture’s contribution was based upon decades of experience in building batch architectures with the last several generations of platforms: COBOL on mainframes, C++ on Unix, and, now, Java anywhere.
The collaborative effort between Accenture and SpringSource aimed to promote the standardization of software processing approaches, frameworks, and tools enterprise users can consistently use when creating batch applications. Companies and government agencies desiring to deliver standard, proven solutions to their enterprise IT environments can benefit from Spring Batch.
Usage Scenarios
A typical batch program generally:
- Reads a large number of records from a database, file, or queue.
- Processes the data in some fashion.
- Writes back data in a modified form.
Spring Batch automates this basic batch iteration, providing the capability to process similar transactions as a set, typically in an offline environment without any user interaction. Batch jobs are part of most IT projects, and Spring Batch is the only open source framework that provides a robust, enterprise-scale solution.
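The read-process-write loop described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Spring Batch's API: the class `ReadProcessWriteSketch`, the method `run`, and the chunk size of 2 are hypothetical names and values chosen for the example, and an in-memory list stands in for the database, file, or queue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of the read-process-write loop; not Spring Batch's API.
class ReadProcessWriteSketch {

    // Reads items one at a time, transforms each one, and hands the results
    // to the writer in groups of at most chunkSize. Returns the number written.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int written = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                writer.accept(new ArrayList<>(chunk)); // pass a copy so the buffer can be reused
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) { // write the final, possibly smaller, chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> input = List.of("alice", "bob", "carol");
        List<List<String>> sink = new ArrayList<>();
        Function<String, String> processor = String::toUpperCase;
        Consumer<List<String>> writer = sink::add;

        int count = run(input.iterator(), processor, writer, 2);
        // prints: 3 records written in 2 chunks: [[ALICE, BOB], [CAROL]]
        System.out.println(count + " records written in " + sink.size() + " chunks: " + sink);
    }
}
```

Grouping the writes is the point of the sketch: the reader and processor work on one record at a time, while the writer receives a set of similar records to handle together.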
Business Scenarios
Spring Batch supports the following business scenarios:
- Commit the batch process periodically.
- Concurrent batch processing: parallel processing of a job.
- Staged, enterprise message-driven processing.
- Massively parallel batch processing.
- Manual or scheduled restart after failure.
- Sequential processing of dependent steps (with extensions to workflow-driven batches).
- Partial processing: skip records (for example, on rollback).
- Whole-batch transaction, for cases with a small batch size or existing stored procedures or scripts.
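Three of the scenarios above (periodic commit, skipping records, and restart after failure) interact, and a small plain-Java sketch may make that concrete. It is an illustration under stated assumptions, not Spring Batch's API or its actual implementation: `CommitSkipRestartSketch`, `Checkpoint`, `commitInterval`, and `skipLimit` are hypothetical names, and in-memory lists stand in for the input source and the transactional target.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of periodic commit, skip, and restart; not Spring Batch's API.
class CommitSkipRestartSketch {

    // Stand-in for persisted job state. It is updated only when a chunk commits.
    static final class Checkpoint {
        int nextIndex = 0; // first input record that has not been committed yet
        int skipCount = 0; // skipped records belonging to committed chunks
    }

    // Parses records starting at checkpoint.nextIndex and commits every
    // commitInterval parsed values. Unparseable records are skipped until
    // skipLimit is exceeded, at which point uncommitted work is discarded.
    static void run(List<String> records, List<Integer> committed,
                    Checkpoint checkpoint, int commitInterval, int skipLimit) {
        List<Integer> pending = new ArrayList<>();
        int skipsInChunk = 0;
        for (int i = checkpoint.nextIndex; i < records.size(); i++) {
            try {
                pending.add(Integer.parseInt(records.get(i).trim()));
            } catch (NumberFormatException e) {
                skipsInChunk++;
                if (checkpoint.skipCount + skipsInChunk > skipLimit) {
                    // Simulated rollback: pending values are never committed.
                    throw new IllegalStateException("Skip limit exceeded at record " + i, e);
                }
            }
            if (pending.size() == commitInterval) {
                committed.addAll(pending); // periodic commit
                pending.clear();
                checkpoint.nextIndex = i + 1;
                checkpoint.skipCount += skipsInChunk;
                skipsInChunk = 0;
            }
        }
        committed.addAll(pending); // commit the final partial chunk
        checkpoint.nextIndex = records.size();
        checkpoint.skipCount += skipsInChunk;
    }

    public static void main(String[] args) {
        List<String> records = List.of("1", "2", "x", "4", "5");
        List<Integer> committed = new ArrayList<>();
        Checkpoint checkpoint = new Checkpoint();
        try {
            run(records, committed, checkpoint, 2, 0); // no skips allowed: fails on "x"
        } catch (IllegalStateException e) {
            // prints: First run failed after committing [1, 2]
            System.out.println("First run failed after committing " + committed);
        }
        run(records, committed, checkpoint, 2, 1); // restart resumes at the checkpoint
        // prints: After restart: [1, 2, 4, 5], skipped=1
        System.out.println("After restart: " + committed + ", skipped=" + checkpoint.skipCount);
    }
}
```

Because the checkpoint advances only at commit time, the restarted run neither repeats the records committed before the failure nor loses the ones that were still pending.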
Technical Objectives
Spring Batch has the following technical objectives:
- Let batch developers use the Spring programming model: Concentrate on business logic and let the framework take care of the infrastructure.
- Provide clear separation of concerns between the infrastructure, the batch execution environment, and the batch application.
- Provide common, core execution services as interfaces that all projects can implement.
- Provide simple and default implementations of the core execution interfaces that can be used “out of the box”.
- Make it easy to configure, customize, and extend services, by using the Spring framework in all layers.
- All existing core services should be easy to replace or extend, without any impact to the infrastructure layer.
- Provide a simple deployment model, with the architecture JARs completely separate from the application, built by using Maven.
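The objectives about interfaces, default implementations, and replaceable services describe a general design pattern, which the following plain-Java sketch illustrates. All names here (`ReplaceableServiceSketch`, `RecordWriter`, `InMemoryRecordWriter`, `Exporter`) are hypothetical and are not Spring Batch types; the sketch only assumes that application code depends on an interface rather than on a concrete class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "interface plus default implementation"; not Spring Batch's API.
class ReplaceableServiceSketch {

    // A core service expressed as an interface that any project can implement.
    interface RecordWriter<T> {
        void write(List<? extends T> records);
    }

    // A simple default implementation usable "out of the box": keeps records in memory.
    static final class InMemoryRecordWriter<T> implements RecordWriter<T> {
        final List<T> stored = new ArrayList<>();

        @Override
        public void write(List<? extends T> records) {
            stored.addAll(records);
        }
    }

    // Application code depends only on the interface, so the writer can be swapped freely.
    static final class Exporter {
        private final RecordWriter<String> writer;

        Exporter(RecordWriter<String> writer) {
            this.writer = writer;
        }

        void export(List<String> names) {
            writer.write(names);
        }
    }

    public static void main(String[] args) {
        InMemoryRecordWriter<String> defaultWriter = new InMemoryRecordWriter<>();
        new Exporter(defaultWriter).export(List.of("a", "b"));
        System.out.println(defaultWriter.stored); // prints: [a, b]

        // A custom replacement supplied as a lambda; Exporter itself is unchanged.
        StringBuilder log = new StringBuilder();
        new Exporter(records -> log.append(records.size()).append(" records;"))
                .export(List.of("c"));
        System.out.println(log); // prints: 1 records;
    }
}
```

In a Spring application the choice of implementation would typically be made in configuration rather than in `main`, which is what keeps the application code separate from the infrastructure.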