
All times are UTC - 5 hours [ DST ]



 Post subject: Optimizing the persistence of collections
Posted: Sat Apr 10, 2004 12:23 pm
Newbie

Joined: Thu Apr 08, 2004 10:24 am
Posts: 19
Location: Raleigh, NC
Greetings!

I posted this previously in the newbie forum but thought it would be more appropriate here.

I am using Hibernate 2.1.2 and Oracle9i.

I have a program that creates and inserts a large collection of objects during a single transaction. Each persisted object has a bidirectional relationship with another object that is also persisted. This relationship breaks Hibernate's ability to perform batched insertions. For example:


Classes:
Code:
public class Foo extends Persistent
{
   private Bar bar;

   public Bar getBar() { return bar; }
   public void setBar(Bar bar) { this.bar = bar; }
}

public class Bar extends Persistent
{
   private Foo foo;

   public Foo getFoo() { return foo; }
   public void setFoo(Foo foo) { this.foo = foo; }
}

Hibernate Config:
Code:
<hibernate-mapping>
   <class name="Foo" table="Foo">
      <id name="id" column="id">
         <generator class="myUUIDGenerator"/>
      </id>
      ...
      <many-to-one
         name="bar" class="Bar"
         cascade="all"
         column="bar"
      />   
   </class>

   <class name="Bar" table="Bar">
      <id name="id" column="id">
         <generator class="myUUIDGenerator"/>
      </id>
      ...
      <many-to-one
         name="foo" class="Foo"
         cascade="all"
         column="foo"
      />   
   </class>
</hibernate-mapping>

Client (pseudocode):
Code:
public class Client
{
   public static void main(String[] args)
   {
      Collection objects = new ArrayList();
      for (int i = 0; i < 5000; i++)
      {
         Foo foo = new Foo();
         Bar bar = new Bar();
         foo.setBar(bar);
         bar.setFoo(foo);
         objects.add(foo);
      }
     
      myService.create(objects);
   }
}


Service (pseudocode):
Code:
public class MyService
{
   public void create(Collection objects)
   {
      // start hibernate transaction
      for (Iterator iter = objects.iterator(); iter.hasNext(); )
      {
         hibernateSession.saveOrUpdate(iter.next());
      }
      // commit hibernate transaction
   }
}


When I execute this code, Hibernate (bless it) ignores my batch size. This is because the save/update cascade setting causes two different INSERT statements to be issued on each iteration of the loop (INSERT INTO FOO and INSERT INTO BAR), and the batcher executes the pending batch whenever the SQL changes:

Hibernate: BatcherImpl::prepareBatchStatement()
Code:
public PreparedStatement prepareBatchStatement(String sql) throws SQLException, HibernateException {
   if ( !sql.equals(batchUpdateSQL) ) {
      batchUpdate=prepareStatement(sql); // calls executeBatch()
      batchUpdateSQL=sql;
   }
   return batchUpdate;
}



The (perceived?) inability to use batching leads to an order-of-magnitude degradation in performance when a network is involved.
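
To make the effect concrete, here is a minimal sketch that counts how many batches a batcher of this shape would send. It is not Hibernate code: ToyBatcher, countRoundTrips and the SQL strings are hypothetical, it assumes one network round trip per executed batch, and it ignores the configured batch size limit.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFlushDemo {
    /** Toy stand-in for a batcher (hypothetical, not Hibernate's BatcherImpl). */
    static class ToyBatcher {
        private String currentSql;   // SQL of the batch currently being accumulated
        private int pending;         // statements queued in the open batch
        int roundTrips;              // batches executed so far (assumed one round trip each)

        void add(String sql) {
            if (!sql.equals(currentSql)) {
                flush();             // SQL changed: the open batch is executed first
                currentSql = sql;
            }
            pending++;
        }

        void flush() {
            if (pending > 0) {
                roundTrips++;
                pending = 0;
            }
        }
    }

    /** Feeds the statements through a fresh ToyBatcher and returns the batch count. */
    static int countRoundTrips(List<String> statements) {
        ToyBatcher batcher = new ToyBatcher();
        for (String sql : statements) {
            batcher.add(sql);
        }
        batcher.flush();
        return batcher.roundTrips;
    }

    public static void main(String[] args) {
        List<String> interleaved = new ArrayList<>();
        List<String> grouped = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            interleaved.add("INSERT INTO FOO");   // cascade-style ordering: Foo, Bar, Foo, Bar, ...
            interleaved.add("INSERT INTO BAR");
        }
        for (int i = 0; i < 5000; i++) grouped.add("INSERT INTO FOO");
        for (int i = 0; i < 5000; i++) grouped.add("INSERT INTO BAR");

        System.out.println("interleaved: " + countRoundTrips(interleaved));
        System.out.println("grouped: " + countRoundTrips(grouped));
    }
}
```

Under these assumptions the interleaved ordering prints 10000 and the grouped ordering prints 2. A real batcher would also execute whenever the batch size is reached, so the grouped figure would be higher in practice.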

I can force Hibernate to use batching if I change my cascade settings to 'delete' instead of 'all' and make the following modification to my service class:
Code:
public class MyService
{
   public void create(Collection objects)
   {
      // start hibernate transaction
      for (Iterator iter = objects.iterator(); iter.hasNext(); )
      {
         Foo foo = (Foo)iter.next();
         hibernateSession.saveOrUpdate(foo);
      }
      for (Iterator iter = objects.iterator(); iter.hasNext(); )
      {
         Foo foo = (Foo)iter.next();
         Bar bar = foo.getBar();
         hibernateSession.saveOrUpdate(bar);
      }
      // commit hibernate transaction
   }
}


Execution time for my test class drops from 45 seconds to 3 seconds.

Now, would it be possible (if it has not already been done) to optimize Hibernate for the bulk insertion of related/joined POJOs/tables? For example, the Hibernate Session object could be extended to provide a saveOrUpdate(Collection objects) method. The implementation of this method could persist the objects type by type in a single transaction (i.e. persist all of the Foos, then persist all of the Bars). This would allow the use of batching and would substantially improve performance.
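
The type-by-type idea can be sketched outside Hibernate as a small helper. This is only an illustration under assumptions: TypeGroupedSaver, groupByType, saveAll and the Saver callback are hypothetical names, Saver stands in for Session.saveOrUpdate(Object), and the caller is assumed to pass every entity explicitly (the helper does not walk associations, so cascading saves would have to be switched off).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TypeGroupedSaver {
    /** Hypothetical callback standing in for Session.saveOrUpdate(Object). */
    interface Saver {
        void saveOrUpdate(Object entity);
    }

    /** Groups entities by concrete class, keeping classes in first-seen order. */
    static Map<Class<?>, List<Object>> groupByType(Collection<?> entities) {
        Map<Class<?>, List<Object>> groups = new LinkedHashMap<>();
        for (Object entity : entities) {
            groups.computeIfAbsent(entity.getClass(), k -> new ArrayList<>()).add(entity);
        }
        return groups;
    }

    /** Saves every instance of one class before moving on to the next class. */
    static void saveAll(Collection<?> entities, Saver saver) {
        for (List<Object> sameType : groupByType(entities).values()) {
            for (Object entity : sameType) {
                saver.saveOrUpdate(entity);
            }
        }
    }

    public static void main(String[] args) {
        // String and Integer stand in for Foo and Bar in this toy run.
        List<Object> mixed = Arrays.asList("foo1", 1, "foo2", 2);
        List<Object> saveOrder = new ArrayList<>();
        saveAll(mixed, saveOrder::add);
        System.out.println(saveOrder);
    }
}
```

The toy run prints [foo1, foo2, 1, 2]: consecutive saves share a type, so a batcher that executes on every SQL change would see only one change per type. Grouping in first-seen order says nothing about foreign key dependencies between the types, which the caller would still have to consider.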

I am a newbie to Hibernate and it may offer such a feature already. If so, I apologize.

Brad

p.s. i think the Hibernate product is outstanding.


 Post subject:
Posted: Fri Apr 16, 2004 9:07 am
Newbie

Joined: Fri Apr 16, 2004 9:06 am
Posts: 2
Wow this is fantastic - is there any way this could make it into the next release? Gavin? Anyone?


 Post subject:
Posted: Fri Apr 16, 2004 10:38 am
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Well ... add a feature request to JIRA and I'll *think* about it. Probably very easy to implement, now that Steve has done the new Event architecture.

I'm not too keen on it though.


 Post subject:
Posted: Fri Apr 16, 2004 11:41 am
Newbie

Joined: Thu Apr 08, 2004 10:24 am
Posts: 19
Location: Raleigh, NC
gavin wrote:
Well ... add a feature request to JIRA and I'll *think* about it. Probably very easy to implement, now that Steve has done the new Event architecture.

I'm not too keen on it though.


Gavin,

The only way I see to utilize batching when saving related entities is to:

1. Ensure the related entity is not mapped as a subclass. Subclasses kill batching even if you are only persisting the superclass.

2. Ensure that cascading across the association is set to none or delete. all and save-update also break batching.

3. In the application layer, manually persist entities class by class.

Using this technique has yielded an order-of-magnitude increase in performance.

Do you foresee any negative impact this optimization might have? Obviously, all properties must be known before the insert (e.g. can't use a database sequencer for a primary key).



brad


 Post subject:
Posted: Fri Apr 16, 2004 11:53 am
Newbie

Joined: Fri Apr 16, 2004 9:06 am
Posts: 2
gavin wrote:
Well ... add a feature request to JIRA and I'll *think* about it. Probably very easy to implement, now that Steve has done the new Event architecture.

I'm not too keen on it though.


ON JIRA?

I'll do it if bleupen doesn't first. Just wanted to add, though, that after seeing this post today I implemented it and saw similar results. I was able to cascade nearly 9000 entities in less than 8 seconds. Without bleupen's code it takes well over a minute (with minimal network latency).


 Post subject:
Posted: Fri Apr 16, 2004 12:05 pm
Newbie

Joined: Thu Apr 08, 2004 10:24 am
Posts: 19
Location: Raleigh, NC
gavin,

Thinking about it some more, I don't believe an API change is necessary. Currently, Hibernate stores the last SQL statement in the batchUpdateSQL variable and executes the batch whenever the SQL changes (which is why persisting related entities breaks the batch: it inserts entity A, then inserts related entity B). Would it be possible to maintain a hashtable of batches keyed on the entity type or on the INSERT statement itself? This way, saveOrUpdate() could manage multiple batches simultaneously during a single transaction.
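
As a rough sketch of the keyed-batch idea (not a patch against Hibernate's Batcher): KeyedBatcher and all of its methods are hypothetical names, and instead of talking to JDBC it only records which batches would have been executed and how large they were.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyedBatcher {
    private final int batchSize;
    // One open batch per distinct SQL string, kept in first-seen order.
    private final Map<String, List<Object[]>> pending = new LinkedHashMap<>();
    // Log of executed batches in the form "<sql> x<count>".
    private final List<String> executed = new ArrayList<>();

    KeyedBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Queues one statement; only the batch for this SQL is executed when it fills up. */
    void add(String sql, Object... params) {
        List<Object[]> batch = pending.computeIfAbsent(sql, k -> new ArrayList<>());
        batch.add(params);
        if (batch.size() >= batchSize) {
            execute(sql, batch);
        }
    }

    /** Executes every open batch, e.g. at flush or commit time. */
    void flushAll() {
        for (Map.Entry<String, List<Object[]>> entry : pending.entrySet()) {
            execute(entry.getKey(), entry.getValue());
        }
    }

    private void execute(String sql, List<Object[]> batch) {
        if (batch.isEmpty()) {
            return;
        }
        executed.add(sql + " x" + batch.size());   // a real batcher would call executeBatch() here
        batch.clear();
    }

    List<String> executedBatches() {
        return executed;
    }

    public static void main(String[] args) {
        KeyedBatcher batcher = new KeyedBatcher(100);
        for (int i = 0; i < 4; i++) {
            batcher.add("INSERT INTO FOO", i);   // interleaved, as cascading would produce
            batcher.add("INSERT INTO BAR", i);
        }
        batcher.flushAll();
        System.out.println(batcher.executedBatches());
    }
}
```

With interleaved input the toy run prints [INSERT INTO FOO x4, INSERT INTO BAR x4]. Note that the statements reach the database in a different order than they were queued, which matters if other code relies on that order.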

brad


 Post subject:
Posted: Fri Apr 16, 2004 10:52 pm
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
The problem with that solution (using Batcher) is that inserts would happen out of order with what the rest of Hibernate expects, and so create the risk of FK violations.
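
One hypothetical way such a violation could arise, sketched as a toy simulation: with one batch per table, a child batch can fill up and execute before the parent batch has been sent. Nothing here is Hibernate code or a scenario confirmed in this thread; Row, replay and the table names are made up, and the toy "database" only tracks which ids have been inserted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OutOfOrderInsertDemo {
    /** One queued INSERT; parentId is null for rows without a foreign key. */
    static final class Row {
        final String table;
        final String id;
        final String parentId;

        Row(String table, String id, String parentId) {
            this.table = table;
            this.id = id;
            this.parentId = parentId;
        }
    }

    /**
     * Replays rows through per-table batches and returns the ids of rows that were
     * executed before the row they reference (a would-be FK violation).
     */
    static List<String> replay(List<Row> rows, int batchSize) {
        Map<String, List<Row>> pending = new LinkedHashMap<>();
        Set<String> inserted = new HashSet<>();
        List<String> violations = new ArrayList<>();
        for (Row row : rows) {
            List<Row> batch = pending.computeIfAbsent(row.table, k -> new ArrayList<>());
            batch.add(row);
            if (batch.size() >= batchSize) {
                execute(batch, inserted, violations);   // a full batch goes out on its own
            }
        }
        for (List<Row> batch : pending.values()) {
            execute(batch, inserted, violations);       // final flush, in first-seen table order
        }
        return violations;
    }

    private static void execute(List<Row> batch, Set<String> inserted, List<String> violations) {
        for (Row row : batch) {
            if (row.parentId != null && !inserted.contains(row.parentId)) {
                violations.add(row.id);   // a real database would reject this insert
            }
            inserted.add(row.id);         // simplification: the toy keeps going after a violation
        }
        batch.clear();
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("PARENT", "p1", null));
        rows.add(new Row("CHILD", "c1", "p1"));
        rows.add(new Row("CHILD", "c2", "p1"));
        System.out.println(replay(rows, 2));    // child batch fills before the parent is sent
        System.out.println(replay(rows, 10));   // everything waits for the final flush
    }
}
```

In this toy run a batch size of 2 prints [c1, c2], while a batch size of 10 prints [], because the final flush happens to send the parent table first.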


 Post subject:
Posted: Mon Apr 19, 2004 10:16 am
Newbie

Joined: Thu Apr 08, 2004 10:24 am
Posts: 19
Location: Raleigh, NC
gavin wrote:
The problem with that solution (using Batcher) is that inserts would happen out of order with what the rest of Hibernate expects, and so create the risk of FK violations.


Intuitively I understand your point. However, I am having difficulty thinking of a specific example.

Do you feel that this would be a problem if Session were extended to support saveOrUpdate(Collection entities)?

b


 Post subject:
Posted: Mon Apr 19, 2004 10:31 am
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Changing the API by adding that method would solve the problem.


 Post subject:
Posted: Tue Oct 11, 2005 1:35 pm
Newbie

Joined: Tue Oct 11, 2005 11:26 am
Posts: 1
hello,

I tried bleupen's suggestions but I still was not able to improve performance persisting a collection.

Instead of a many-to-many relationship I have a simple many-to-one. Using the classes in bleupen's explanation, this is what I have:

Code:
<hibernate-mapping>
   <class name="Foo" table="Foo">
      <id name="id" column="id">
         <generator class="myUUIDGenerator"/>
      </id>
      ...
      <set name="barSet" inverse="true" cascade="all">
         <key column="id"/>
         <one-to-many class="Bar"/>
      </set>
   </class>

   <class name="Bar" table="Bar">
      <id name="id" column="id">
         <generator class="myUUIDGenerator"/>
      </id>
      ...
      <many-to-one
         name="foo" class="Foo"
         not-null="true">
         <column name="fooId"/>
      </many-to-one>

   </class>
</hibernate-mapping>


The rest is the same as in bleupen's example.

Following bleupen's advice, I changed cascade to delete and changed the create() method of MyService to:

Code:
public class MyService
{
   public void create(Collection objects)
   {
      // start hibernate transaction
      for (Iterator iter = objects.iterator(); iter.hasNext(); )
      {
         Foo foo = (Foo)iter.next();
         hibernateSession.saveOrUpdate(foo);

         Bar bar = foo.getBar();
         hibernateSession.saveOrUpdate(bar);
      }
      // commit hibernate transaction
   }
}


However, I got a net.sf.hibernate.TransientObjectException.

What should I be doing differently in order to avoid this exception?

Any hint about how I can force hibernate to use the batch update here?
© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.