Hi all,
I'm running out of options to try to improve batch insert/update/delete to an MS SQL 2005 server (at least trying to use JPA).
Challenge: I've tried JTDS 1.2.8, MS SQL JDBC version 3.0 (the JDBC4 one), different variances of the configuration settings, and nothing seems to optimize the SQL statement sent to the database server (it just does a commit with XXXX many individual inserts instead of batches of 20). Expectation for a collection of 100 (without deletes) to instead create 5 insert statements that have 20 sets of data values each, or if only have 10 entities with inserts, to only have a single INSERT with 10 data values.
Options: I would prefer to keep this as JPA-like as possible(portability/upgrading), however if there are options that easily use the existing configurations/setup through JDBC or Hibernate to improve batch insert/update/delete performance/load to the database server that would also be great. Also have access to SQL 2008 or SQL 2012 if there are other options going there.
Non-options: we would prefer to avoid a heavy dependency spring solution, unless we could get by with only one or two jars with easy integration into the non-spring solution.
Concern: We've reached a point in volume where MS SQL will occasionally start Deadlocking related to the high volume of individual insert statements. Volumes are breaching 500,000 (i.e. the javaSE is multithreaded and pushing 800-1200 statements/sec according SQL Server Activity Monitor and 30%-80% through starts deadlocking).
================details==============
The persistence.xml has these configurations, as that seems to be the many options you need to have enabled for batching to work:
Code:
<property name="hibernate.show_sql" value="true"/>
<property name="hibernate.jdbc.fetch_size" value="20"/>
<property name="hibernate.max_fetch_depth" value="5"/>
<property name="hibernate.jdbc.batch_size" value="20"/>
<property name="hibernate.order_inserts" value="true"/>
<property name="hibernate.order_updates" value="true"/>
<property name="hibernate.cache.use_second_level_cache" value="false"/>
<property name="hibernate.cache.use_query_cache" value="false"/>
<property name="hibernate.jdbc.batch_versioned_data" value="true"/>
<!-- note, the batch_versioned_data is unknown, but changing the value does not change. We do not have optimistic locked/versioned data -->
The entity is setup for TableGenerator (no autoid on the table):
Code:
@TableGenerator(name="SEQUENCE_GENERATOR",table="sequence_generator",
pkColumnName="entity_id",
pkColumnValue="mytable",
valueColumnName="entity_sequence",
initialValue=0, allocationSize=50)
...
@Id
@GeneratedValue(strategy=GenerationType.TABLE, generator="SEQUENCE_GENERATOR")
@Column(name="ID")
private Integer id;
The code used to attempt to handle batch inserts (javaSE):
Code:
public void saveBatch(List<MyEntity> batch){ //batches are currently sized between20 and 200 related to an appropriate business processing grouping.
EntityManager em= emf.createEntityManager();
em.setFlushMode(FlushModeType.COMMIT);
em.getTransaction().begin();
for (Iterator iterator = batch.iterator(); iterator.hasNext();) {
MyEntity myEntity = (MyEntity ) iterator.next();
//purge/remove if AUTO_DELETED
if( myEntity.getStatusId() == MyStatus.AUTO_DELETED){
if(myEntity.getId() != null){
MyEntity merge = em.merge(myEntity); //make sure attached/managed
em.remove(merge); //remove managed version as deleted.
}
}else{
em.merge(myEntity ); //update or insert
}
}
System.out.println(".....should not see SQL until after this........");
em.getTransaction().commit(); //commit handles flush
I've confirmed that the JPA/hibernate stack is getting enabled:
Code:
org.hibernate.cfg.SettingsFactory - JDBC batch size: 20
org.hibernate.cfg.SettingsFactory - JDBC result set fetch size: 20
org.hibernate.cfg.SettingsFactory - Order SQL updates by primary key: enabled
org.hibernate.cfg.SettingsFactory - Order SQL inserts for batching: enabled
org.hibernate.cfg.SettingsFactory - Second-level cache: disabled
org.hibernate.cfg.SettingsFactory - Query cache: disabled
I do not want to copy the sql output as the above is anonymized and the SQL would take a while to anonymize, but I have both hibernate show_sql on, as well as JDBC logger (I've been going back and forth between p6spy, log4jdbc, and jdbcdslog) to verify the SQL going out.
I've seen a lot of examples with MySQL and the rewriteBatch connection flag, but not much on MS SQL.
Thanks for any assistance!
(Hibernate 3.6.10)
EDIT: I'm not necessarily looking for the fastest solution, just an easy to understand/maintainable solution of equivalent (or better) performance that doesn't knock the database server over ;-)
-D