Cameron McKenzie wrote:
Are you using executeUpdate or executeDelete for the batches? What method are you using on the session to do these batch processes? How many records are we talking about?
Sometimes ETL tools are a better batch solution than Hibernate.
Hello Cameron,
My batch opens a session and iterates over a collection of 10,000 elements to generate an Excel report. Each object is linked to a few others, but the graph isn't very deep (I don't have memory issues).
I don't update the database; I only read object values (all the elements are retrieved at the very beginning of the batch process).
There is no problem with fewer than 1,000 entities, but beyond that the performance degrades in steps (around 1400, 1900, 2300, ...) and gets exponentially worse.
My analysis is that the performance loss comes from the implementation of the org.hibernate.engine.StatefulPersistenceContext class (and from my misuse of it):
- The huge number of objects stored in the Maps of this class slows down the lookups Hibernate performs to check whether an object is already in the cache.
- When the initial capacity of these collections is reached, they must be resized, and that operation is expensive when they already hold a large number of objects.
So the solution is to clear this cache explicitly, either all at once with session.clear() or more selectively with session.evict().
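To give an idea, here is roughly the shape of the loop after the fix (ReportEntity and the interval of 500 are placeholder names, not the real ones from my batch):

List entities = session.createQuery("from ReportEntity").list();
int processed = 0;
for (Object element : entities) {
    ReportEntity entity = (ReportEntity) element;
    // ... read the values needed for the Excel report ...
    if (++processed % 500 == 0) {
        // empty the StatefulPersistenceContext maps before they grow too large
        session.clear();
    }
    // more selective alternative: session.evict(entity);
}

Of course, once clear() has run, the remaining entities in the list are detached, so I must not rely on lazy loading after that point.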
One last detail: I know it isn't really clean, but the batch doesn't work with the POJOs but with DTOs, which is why I can call session.clear() without any risk: all the data has already been transferred from the POJOs to the DTOs.
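For each element, the mapping happens while the entity is still attached, something like this (ReportDto and its getters/setters are hypothetical names, just to show the idea):

private ReportDto toDto(ReportEntity entity) {
    ReportDto dto = new ReportDto();
    dto.setLabel(entity.getLabel());                     // plain values copied out of the POJO
    dto.setCustomerName(entity.getCustomer().getName()); // linked objects read while still attached
    return dto;
}

Once the DTO is filled, the entity can be evicted (or the whole session cleared) without losing anything the report needs.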
Ektor.