Multi-File Input

It is a common requirement to process multiple files within a single Step. Assuming the files all have the same formatting, the MultiResourceItemReader supports this type of input for both XML and flat file processing. Consider the following files in a directory:

file-1.txt  file-2.txt  ignored.txt

file-1.txt and file-2.txt are formatted the same and, for business reasons, should be processed together. The MultiResourceItemReader can be used to read in both files by using wildcards.
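To illustrate which of the three files a wildcard would select, here is a minimal sketch using only the JDK. It assumes the pattern `file-*.txt` and uses `java.nio` glob matching as a stand-in; the class and method names are hypothetical, and the reader itself resolves patterns through Spring's resource loading rather than through this code.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

public class WildcardMatchSketch {

    // Returns the file names that match the glob pattern, in input order.
    // Stand-in only: this is JDK glob matching, not Spring's pattern resolution.
    static List<String> matching(String glob, List<String> fileNames) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return fileNames.stream()
                .filter(name -> matcher.matches(Paths.get(name)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> directory = List.of("file-1.txt", "file-2.txt", "ignored.txt");
        // ignored.txt does not start with "file-", so it is left out.
        System.out.println(matching("file-*.txt", directory)); // [file-1.txt, file-2.txt]
    }
}
```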

Java

The following example shows how to read files with wildcards in Java:

Java Configuration
@Bean
public MultiResourceItemReader<Foo> multiResourceReader() {
	// flatFileItemReader() and resources() are defined elsewhere in the configuration class
	return new MultiResourceItemReaderBuilder<Foo>()
			.delegate(flatFileItemReader())
			.resources(resources())
			.build();
}

XML

The following example shows how to read files with wildcards in XML:

XML Configuration
<bean id="multiResourceReader" class="org.spr...MultiResourceItemReader">
    <property name="resources" value="classpath:data/input/file-*.txt" />
    <property name="delegate" ref="flatFileItemReader" />
</bean>

The referenced delegate is a simple FlatFileItemReader. The above configuration reads input from both files and handles rollback and restart scenarios. As with any ItemReader, adding extra input (in this case, a file) between runs can cause problems on restart. Batch jobs should therefore work in their own dedicated directories until they complete successfully.
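The restart risk can be sketched in plain Java. This is an illustration under a stated assumption, not the reader's actual implementation: it assumes restart state is remembered as a position in the ordered list of input files. The class, method, and the extra file name `file-0.txt` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RestartIndexSketch {

    // Assumption for illustration: the saved restart state is an index
    // into the sorted list of input file names.
    static String resourceAt(List<String> fileNames, int savedIndex) {
        List<String> ordered = new ArrayList<>(fileNames);
        Collections.sort(ordered);
        return ordered.get(savedIndex);
    }

    public static void main(String[] args) {
        // First run: the job stops while reading the second file (index 1).
        List<String> firstRun = List.of("file-1.txt", "file-2.txt");
        int savedIndex = 1;
        System.out.println(resourceAt(firstRun, savedIndex)); // file-2.txt

        // Before the restart, a hypothetical file-0.txt lands in the same directory.
        List<String> restart = List.of("file-0.txt", "file-1.txt", "file-2.txt");
        // The same saved index now points at a different file.
        System.out.println(resourceAt(restart, savedIndex)); // file-1.txt
    }
}
```

Giving each job its own input directory keeps the file set fixed between the failed run and the restart, which avoids this kind of shift.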

Input resources are ordered by the comparator set through MultiResourceItemReader#setComparator(Comparator), so that resource ordering is preserved between job runs in restart scenarios.
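Why an explicit comparator matters can be shown with a small JDK-only sketch. It assumes file names stand in for resources and that the order in which files are discovered may differ between runs; the class, method, and `BY_NAME` constant are hypothetical. With the real reader, a comparator over resources would be supplied through setComparator instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ResourceOrderSketch {

    // Hypothetical comparator: order inputs by file name.
    static final Comparator<String> BY_NAME = Comparator.naturalOrder();

    // Returns a sorted copy, so the processing order does not depend
    // on the order in which the files happened to be discovered.
    static List<String> ordered(List<String> discovered) {
        List<String> copy = new ArrayList<>(discovered);
        copy.sort(BY_NAME);
        return copy;
    }

    public static void main(String[] args) {
        List<String> runOne = List.of("file-2.txt", "file-1.txt");
        List<String> runTwo = List.of("file-1.txt", "file-2.txt");
        // Both runs process the files in the same sequence after sorting.
        System.out.println(ordered(runOne).equals(ordered(runTwo))); // true
    }
}
```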