I'm currently using OJB as my O-R bridge, but I want to switch to Hibernate because of its richer feature set.
Before switching, I decided to do a little performance comparison, and the results surprised me. I hope someone can find a bug in my Hibernate example or point to some relevant information for tuning my code.
The test uses 2 tables: a Basket table and a BasketItem table. One Basket can have many items (similar to the Blog/BlogItem example in the documentation).
In my test I have 100 Baskets with 1000 BasketItems in each Basket, for a total of 100,000 BasketItem rows.
I iterate through all Basket and BasketItem objects and measure the elapsed time for 3 different methods:
Method 1: Using plain JDBC with the following pseudocode:
Code:
Baskets = select * from Basket
foreach basket in Baskets {
    Items = select * from BasketItem where basketId = basket.id
    foreach item in Items {
        doSomethingWithItem(item)
    }
}
This results in 101 select statements (1 for the baskets plus 1 per basket) and takes about 8 seconds to finish. CPU and memory usage are moderate (measured by watching the Windows Task Manager).
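To make the statement and row counts concrete, here is a minimal sketch in plain Java that models the loop above with an in-memory stand-in for the database. No real JDBC calls are made, and the class and method names (SelectCountSketch, selectBaskets, selectItems) are hypothetical; it only counts the simulated selects and the items visited.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the access pattern; not real JDBC code.
public class SelectCountSketch {
    static int selects = 0;       // simulated SELECT statements issued
    static long itemsVisited = 0; // BasketItem rows touched

    // Stands in for "select * from Basket": returns the basket ids.
    static List<Integer> selectBaskets(int basketCount) {
        selects++;
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= basketCount; id++) {
            ids.add(id);
        }
        return ids;
    }

    // Stands in for "select * from BasketItem where basketId = ?".
    static int[] selectItems(int basketId, int itemsPerBasket) {
        selects++;
        return new int[itemsPerBasket];
    }

    // Runs the nested loop and updates both counters.
    static void run(int basketCount, int itemsPerBasket) {
        selects = 0;
        itemsVisited = 0;
        for (int basketId : selectBaskets(basketCount)) {
            for (int item : selectItems(basketId, itemsPerBasket)) {
                itemsVisited++; // placeholder for doSomethingWithItem(item)
            }
        }
    }

    public static void main(String[] args) {
        run(100, 1000);
        // prints "101 selects, 100000 items"
        System.out.println(selects + " selects, " + itemsVisited + " items");
    }
}
```

With 100 baskets of 1000 items each, the model issues 1 + 100 = 101 selects and visits 100 x 1000 = 100,000 items, which matches the numbers in the test.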
Method 2: Using OJB with the following code:
Code:
Query query = new QueryByCriteria(Basket.class, null);
Iterator i = broker.getIteratorByQuery(query);
foreach basket in i:
    foreach item in basket.basketItems: doSomethingWithItem(item)
This takes about 13 seconds to finish. Memory use is about 120 MB, roughly 3 times that of the plain JDBC code, and CPU usage is slightly higher, but this is still acceptable.
Method 3: Using Hibernate (I'm using Hibernate 2.1).
I've tried different setups here, all with miserable results...
I'm using a List as the 1-m collection, and I've tried the following:
First try: Use outer-join and fetch. This was a total disaster (no big surprise, as this is not encouraged in the Best Practices section of the documentation). It took about 120 seconds to finish and used about 540 MB of memory and 100% CPU. The problem with fetch is that it's not possible to get an iterator, so I have to read all objects into memory first (at least I haven't been able to get an iterator for fetch queries without reading everything into memory).
Second try: Use session.iterate("From BASKET in class Basket").
This is much better, but it still takes about 22 seconds to finish and, worse, it uses about 500 MB of memory and 90% CPU.
I've experimented with different settings for the lazy, cascade and batch-size parameters in the mapping file, but none of them gave a significant performance improvement.
Does anyone have a tip for improving performance? Iterating over large 1-m collections must be a common use case.
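For anyone who wants to reproduce this kind of comparison, below is a rough sketch of how elapsed time and heap use could be captured in-process rather than read off the Task Manager. It is not the code behind the numbers above: MeasureSketch, Result and measure are hypothetical names, and the workload in main is only a stand-in for one of the three iteration methods.

```java
// Hypothetical measurement helper; uses only java.lang.
public class MeasureSketch {
    // Holds the outcome of one measured run.
    static final class Result {
        final long elapsedMillis;
        final long heapDeltaBytes;

        Result(long elapsedMillis, long heapDeltaBytes) {
            this.elapsedMillis = elapsedMillis;
            this.heapDeltaBytes = heapDeltaBytes;
        }
    }

    // Heap currently in use by the JVM, in bytes.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the work once and records wall-clock time and heap growth.
    static Result measure(Runnable work) {
        System.gc(); // only a hint to the JVM, so the baseline is approximate
        long heapBefore = usedHeap();
        long start = System.nanoTime();
        work.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long heapAfter = usedHeap();
        return new Result(elapsedMillis, heapAfter - heapBefore);
    }

    public static void main(String[] args) {
        Result r = measure(() -> {
            // Stand-in workload; replace with the iteration under test.
            long[] data = new long[100_000];
            for (int i = 0; i < data.length; i++) {
                data[i] = i;
            }
        });
        System.out.println("elapsed ms: " + r.elapsedMillis
                + ", heap delta bytes: " + r.heapDeltaBytes);
    }
}
```

Note that this reports the heap difference between the start and end of the run, not the peak, and that heap use is not the same figure as the process memory shown by the Task Manager, so the numbers are only roughly comparable.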