JDBC Batch Operations
如果你对同一准备语句批量执行多个调用,大多数 JDBC 驱动程序都会提供更好的性能。通过将更新分组到批处理中,你可以限制往返于数据库的次数。
Most JDBC drivers provide improved performance if you batch multiple calls to the same prepared statement. By grouping updates into batches, you limit the number of round trips to the database.
Basic Batch Operations with JdbcTemplate
你可以通过实现特殊接口 BatchPreparedStatementSetter
的两种方法并将其作为 batchUpdate
方法调用的第二个参数,来完成 JdbcTemplate
批处理。你可以使用 getBatchSize
方法来提供当前批处理的大小。你可以使用 setValues
方法来设置准备语句的参数值。这个方法被调用了 getBatchSize
调用中你指定过的次数。以下示例基于列表中的条目更新 t_actor
表,并使用整个列表作为批处理:
You accomplish JdbcTemplate
batch processing by implementing two methods of a special interface,
BatchPreparedStatementSetter
, and passing that implementation in as the second parameter
in your batchUpdate
method call. You can use the getBatchSize
method to provide the size of
the current batch. You can use the setValues
method to set the values for the parameters of
the prepared statement. This method is called the number of times that you specified in the
getBatchSize
call. The following example updates the t_actor
table based on entries in a list,
and the entire list is used as the batch:
-
Java
-
Kotlin
public class JdbcActorDao implements ActorDao {
private JdbcTemplate jdbcTemplate;
public void setDataSource(DataSource dataSource) {
this.jdbcTemplate = new JdbcTemplate(dataSource);
}
public int[] batchUpdate(final List<Actor> actors) {
return this.jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?",
new BatchPreparedStatementSetter() {
public void setValues(PreparedStatement ps, int i) throws SQLException {
Actor actor = actors.get(i);
ps.setString(1, actor.getFirstName());
ps.setString(2, actor.getLastName());
ps.setLong(3, actor.getId().longValue());
}
public int getBatchSize() {
return actors.size();
}
});
}
// ... additional methods
}
class JdbcActorDao(dataSource: DataSource) : ActorDao {
private val jdbcTemplate = JdbcTemplate(dataSource)
fun batchUpdate(actors: List<Actor>): IntArray {
return jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?",
object: BatchPreparedStatementSetter {
override fun setValues(ps: PreparedStatement, i: Int) {
ps.setString(1, actors[i].firstName)
ps.setString(2, actors[i].lastName)
ps.setLong(3, actors[i].id)
}
override fun getBatchSize() = actors.size
})
}
// ... additional methods
}
如果你处理一个更新流或从一个文件读取,你可能有一个首选的批处理大小,但最后一个批处理可能没有那么多的条目。在这种情况下,你可以使用 InterruptibleBatchPreparedStatementSetter
接口,它允许你在输入源耗尽后中断批处理。isBatchExhausted
方法让你发出批处理结束的信号。
If you process a stream of updates or reading from a file, you might have a
preferred batch size, but the last batch might not have that number of entries. In this
case, you can use the InterruptibleBatchPreparedStatementSetter
interface, which lets
you interrupt a batch once the input source is exhausted. The isBatchExhausted
method
lets you signal the end of the batch.
Batch Operations with a List of Objects
JdbcTemplate
和 NamedParameterJdbcTemplate
都提供了一种提供批处理更新的替代方法。你不需要实现一个特殊的批处理接口,而是提供所有参数值作为调用中的一个列表。框架循环遍历这些值并使用一个内部准备语句设置器。API 会根据你是否使用命名参数而有所不同。对于命名参数,你提供一个 SqlParameterSource
数组,批处理的每个成员一个条目。你可以使用 SqlParameterSourceUtils.createBatch
方便方法来创建此数组,它可以传递一个 Bean 式对象的数组(带有与参数相对应的 getter 方法)、String
键控的 Map
实例(包含相应参数作为值)、或两者结合。
Both the JdbcTemplate
and the NamedParameterJdbcTemplate
provides an alternate way
of providing the batch update. Instead of implementing a special batch interface, you
provide all parameter values in the call as a list. The framework loops over these
values and uses an internal prepared statement setter. The API varies, depending on
whether you use named parameters. For the named parameters, you provide an array of
SqlParameterSource
, one entry for each member of the batch. You can use the
SqlParameterSourceUtils.createBatch
convenience methods to create this array, passing
in an array of bean-style objects (with getter methods corresponding to parameters),
String
-keyed Map
instances (containing the corresponding parameters as values), or a mix of both.
以下示例演示使用命名参数的批处理更新:
The following example shows a batch update using named parameters:
-
Java
-
Kotlin
public class JdbcActorDao implements ActorDao {
private NamedParameterTemplate namedParameterJdbcTemplate;
public void setDataSource(DataSource dataSource) {
this.namedParameterJdbcTemplate = new NamedParameterJdbcTemplate(dataSource);
}
public int[] batchUpdate(List<Actor> actors) {
return this.namedParameterJdbcTemplate.batchUpdate(
"update t_actor set first_name = :firstName, last_name = :lastName where id = :id",
SqlParameterSourceUtils.createBatch(actors));
}
// ... additional methods
}
class JdbcActorDao(dataSource: DataSource) : ActorDao {
private val namedParameterJdbcTemplate = NamedParameterJdbcTemplate(dataSource)
fun batchUpdate(actors: List<Actor>): IntArray {
return this.namedParameterJdbcTemplate.batchUpdate(
"update t_actor set first_name = :firstName, last_name = :lastName where id = :id",
SqlParameterSourceUtils.createBatch(actors));
}
// ... additional methods
}
对于使用经典 ?
占位符的 SQL 语句,你传递一个包含更新值的 Object 数组的列表。这个 Object 数组必须为 SQL 语句中的每个占位符有一个条目,并且它们必须按在 SQL 语句中定义的顺序排列。
For an SQL statement that uses the classic ?
placeholders, you pass in a list
containing an object array with the update values. This object array must have one entry
for each placeholder in the SQL statement, and they must be in the same order as they are
defined in the SQL statement.
以下示例与前一个示例相同,只不过它使用经典的 JDBC ?
占位符:
The following example is the same as the preceding example, except that it uses classic
JDBC ?
placeholders:
-
Java
-
Kotlin
public class JdbcActorDao implements ActorDao {
private JdbcTemplate jdbcTemplate;
public void setDataSource(DataSource dataSource) {
this.jdbcTemplate = new JdbcTemplate(dataSource);
}
public int[] batchUpdate(final List<Actor> actors) {
List<Object[]> batch = new ArrayList<>();
for (Actor actor : actors) {
Object[] values = new Object[] {
actor.getFirstName(), actor.getLastName(), actor.getId()};
batch.add(values);
}
return this.jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?",
batch);
}
// ... additional methods
}
class JdbcActorDao(dataSource: DataSource) : ActorDao {
private val jdbcTemplate = JdbcTemplate(dataSource)
fun batchUpdate(actors: List<Actor>): IntArray {
val batch = mutableListOf<Array<Any>>()
for (actor in actors) {
batch.add(arrayOf(actor.firstName, actor.lastName, actor.id))
}
return jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?", batch)
}
// ... additional methods
}
我们之前描述的所有批处理更新方法都返回一个包含每个批处理条目的受影响行数的 int
数组。此计数由 JDBC 驱动程序报告。如果计数不可用,JDBC 驱动程序返回 -2
。
All of the batch update methods that we described earlier return an int
array
containing the number of affected rows for each batch entry. This count is reported by
the JDBC driver. If the count is not available, the JDBC driver returns a value of -2
.
在这样的情况下,在底层`PreparedStatement`上自动设置值时,每个值的相应JDBC类型需要从给定的Java类型中推演出。虽然这通常能很好地发挥作用,但可能会出现问题(例如使用Map所包含的`null`值),在这种情况下,Spring默认调用`ParameterMetaData.getParameterType`,而这可能会给你的JDBC驱动增加工作量。如果你遇到性能问题(据Oracle 12c、JBoss和PostgreSQL报告),则应当使用最近的驱动程序版本并考虑将`spring.jdbc.getParameterType.ignore`属性设置为`true`(作为JVM系统属性或通过xref:appendix.adoc#appendix-spring-properties[`SpringProperties`机制)。 In such a scenario, with automatic setting of values on an underlying 或者,您可以考虑显式地指定 JDBC 类型,通过 Alternatively, you might consider specifying the corresponding JDBC types explicitly,
either through a |
Batch Operations with Multiple Batches
前面批处理更新的示例处理的批处理很大,你希望将它们分成几个较小的批处理。你可以通过使用前面提到的方法多次调用 batchUpdate
方法来实现此操作,但现在有一个更方便的方法。除了 SQL 语句之外,此方法还接收一个包含参数的对象 Collection
、每个批处理要执行的更新次数,以及一个 ParameterizedPreparedStatementSetter
来设置准备 statement 的参数值。框架循环遍历提供的值,并将更新调用分解为指定大小的批处理。
The preceding example of a batch update deals with batches that are so large that you want to
break them up into several smaller batches. You can do this with the methods
mentioned earlier by making multiple calls to the batchUpdate
method, but there is now a
more convenient method. This method takes, in addition to the SQL statement, a
Collection
of objects that contain the parameters, the number of updates to make for each
batch, and a ParameterizedPreparedStatementSetter
to set the values for the parameters
of the prepared statement. The framework loops over the provided values and breaks the
update calls into batches of the size specified.
以下示例展示了使用 100 的批量大小的批量更新:
The following example shows a batch update that uses a batch size of 100:
-
Java
-
Kotlin
public class JdbcActorDao implements ActorDao {
private JdbcTemplate jdbcTemplate;
public void setDataSource(DataSource dataSource) {
this.jdbcTemplate = new JdbcTemplate(dataSource);
}
public int[][] batchUpdate(final Collection<Actor> actors) {
int[][] updateCounts = jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?",
actors,
100,
(PreparedStatement ps, Actor actor) -> {
ps.setString(1, actor.getFirstName());
ps.setString(2, actor.getLastName());
ps.setLong(3, actor.getId().longValue());
});
return updateCounts;
}
// ... additional methods
}
class JdbcActorDao(dataSource: DataSource) : ActorDao {
private val jdbcTemplate = JdbcTemplate(dataSource)
fun batchUpdate(actors: List<Actor>): Array<IntArray> {
return jdbcTemplate.batchUpdate(
"update t_actor set first_name = ?, last_name = ? where id = ?",
actors, 100) { ps, argument ->
ps.setString(1, argument.firstName)
ps.setString(2, argument.lastName)
ps.setLong(3, argument.id)
}
}
// ... additional methods
}
该调用对此批量更新方法返回的是一个 int
数组,它包括一个针对每次更新返回受影响行数的数组的数组条目。顶层数组长度指示运行的批量的数量,而第二层数组长度指示该批量的更新数量。如果提供的更新对象的总数量根据批量大小,每个批量的更新数量应是所有批量的批量大小(只有最后一个可能少于)。每次更新语句的更新计数是由 JDBC 驱动程序报告的那个。如果计数不可用,JDBC 驱动程序会返回 -2 值。
The batch update method for this call returns an array of int
arrays that contains an
array entry for each batch with an array of the number of affected rows for each update.
The top-level array’s length indicates the number of batches run, and the second level
array’s length indicates the number of updates in that batch. The number of updates in
each batch should be the batch size provided for all batches (except that the last one
that might be less), depending on the total number of update objects provided. The update
count for each update statement is the one reported by the JDBC driver. If the count is
not available, the JDBC driver returns a value of -2
.