All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 9 posts ] 
 Post subject: Inserting large amount of rows
Posted: Wed Sep 27, 2006 12:04 pm
Newbie

Joined: Wed Sep 27, 2006 11:16 am
Posts: 6
Location: Poland
Hi,
I'm using Hibernate 3.2.0.cr4 and I'm seeing serious performance problems when inserting a large number of rows into the database, roughly 30k in one session. The code I wrote for this task is as follows:


Code:

   try {
       Transaction tx = hibernateSession.beginTransaction();
       hibernateSession.setCacheMode(CacheMode.IGNORE);

       // processing data here ... skipped

       int newEntitiesCount = 0;
       while ( ... ) {
           /* here I create persistent object cl, it contains persistent collection of its children */
           ChangeLog cl = createChangeLog(...);
           if (cl != null) {
               hibernateSession.save(cl);
               // count the parent plus its cascaded children
               newEntitiesCount += (cl.getChangedFields().size() + 1);
           }
           if (newEntitiesCount >= BATCH_SIZE) {
               // push pending inserts to the database and empty the session cache
               hibernateSession.flush();
               hibernateSession.clear();
               newEntitiesCount = 0;
           }

           // irrelevant code goes next

       }

       // processing more data

       for (...) {
           ChangeLog cl = createChangeLog(...);
           hibernateSession.save(cl);
           hibernateSession.flush();
           hibernateSession.clear();
       }
       tx.commit();
   } catch (Exception ex) {
       // irrelevant code goes next
   } finally {
       hibernateSession.close();
   }


In the loop above I usually process about 600 ChangeLog objects, each of which contains about 40-50 children. In my mapping I set cascade="save-update" so that the save operation propagates down to the child objects.

The code above is painfully slow: it takes more than 2 hours to execute and insert all the newly created objects into the database. I checked the Hibernate logs and found that every insert follows the same pattern: first a bunch of selects, then a bunch of inserts. Looking at the execution times, I found that this "bunch of selects" is the real bottleneck, yet I could not figure out how to disable it. I suppose that Hibernate tries to refresh cached objects, so I explicitly set the cache mode to IGNORE, but with no measurable effect. I also clear the session cache every so often, still with no gain. Is there something I could do to improve performance?

_________________
--
Best Regards
Marcin Rzeźnicki


 Post subject: same problem
Posted: Thu Sep 28, 2006 9:41 am
Newbie

Joined: Fri Jun 16, 2006 3:41 pm
Posts: 18
I am also importing large amounts of data (8 MB worth for a single object hierarchy), and it takes around 3 seconds to parse from files and 3 minutes to write to the database. I haven't been able to solve it.


 Post subject: Re: same problem
Posted: Thu Sep 28, 2006 9:46 am
Newbie

Joined: Wed Sep 27, 2006 11:16 am
Posts: 6
Location: Poland
Budric wrote:
I am also importing large amounts of data (8 MB worth for a single object hierarchy), and it takes around 3 seconds to parse from files and 3 minutes to write to the database. I haven't been able to solve it.


I'd be glad if my imports took 3 minutes :-)
Anyway, I still don't have a workaround for eliminating the selects I mentioned in my previous post. Did you find anything useful on this subject?

_________________
--
Best Regards
Marcin Rzeźnicki


 Post subject:
Posted: Thu Sep 28, 2006 11:05 am
Newbie

Joined: Wed Sep 27, 2006 1:15 pm
Posts: 10
Can you provide the mapping file?


 Post subject:
Posted: Thu Sep 28, 2006 11:15 am
Newbie

Joined: Wed Sep 27, 2006 11:16 am
Posts: 6
Location: Poland
caesardark wrote:
Can you provide the mapping file?


Yes, of course. Here it is:

hibernate.cfg


Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
"-//Hibernate/Hibernate Configuration DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
   <session-factory>
      <property name="hibernate.bytecode.use_reflection_optimizer">
         true
      </property>
      <!-- credentials for db connection ... skipped :-) -->
      <property name="hibernate.dialect">
         org.hibernate.dialect.Oracle9Dialect
      </property>
      <!-- <property name="current_session_context_class">managed</property> -->
      <property name="hibernate.jdbc.batch_size">30</property>
      <property name="hibernate.c3p0.min_size">5</property>
      <property name="hibernate.c3p0.max_size">40</property>
      <property name="hibernate.c3p0.timeout">1800</property>
      <property name="hibernate.c3p0.max_statements">500</property>
      <property name="hibernate.max_fetch_depth">2</property>
      <property name="hibernate.default_batch_fetch_size">8</property>
      <property name="hibernate.cache.provider_class">
         org.hibernate.cache.EhCacheProvider
      </property>
      <mapping resource="cdbu/dal/CdStage.hbm.xml" />
      <mapping resource="cdbu/dal/CdNodes.hbm.xml" />
      <mapping resource="cdbu/dal/CdChangedFields.hbm.xml" />
      <mapping resource="cdbu/dal/CdChangeLogs.hbm.xml" />
      <mapping resource="cdbu/dal/CdLogs.hbm.xml" />
      <mapping resource="cdbu/dal/Users.hbm.xml" />
      <mapping resource="cdbu/dal/CdJobQueue.hbm.xml" />
   </session-factory>
</hibernate-configuration>



Parent mapping (ChangeLog in the example above)

Code:

<hibernate-mapping>
   <class name="cdbu.dal.ChangeLog" table="CD_CHANGE_LOGS"
      dynamic-insert="true" mutable="false">
      <cache usage="read-only" />
      <id name="clId" type="long">
         <column name="CL_ID" precision="19" scale="0" />
         <generator class="sequence">
            <param name="sequence">cd_change_logs_seq</param>
         </generator>
      </id>
      <many-to-one name="job" class="cdbu.dal.Job" fetch="join">
         <column name="CL_JOB_FK" length="50" not-null="true" />
      </many-to-one>
      <property name="treePath" type="string">
         <column name="CL_TREE_PATH" not-null="true" />
      </property>
      <property name="changeType" type="string">
         <column name="CL_CHANGE_TYPE" length="6" not-null="true" />
      </property>
      <property name="nodeIdentifier" type="string">
         <column name="CL_NODE_IDENTIFIER" length="12"
            not-null="true" />
      </property>
      <property name="changeDate" type="timestamp">
         <column name="CL_CHANGE_DATE" length="7" not-null="true" />
      </property>
      <set name="changedFields" inverse="true" fetch="join"
         lazy="extra" cascade="save-update" batch-size="30">
         <key>
            <column name="CF_CHANGE_LOGS_FK" precision="22"
               scale="0" not-null="true" />
         </key>
         <one-to-many class="cdbu.dal.ChangedField" />
      </set>
   </class>
</hibernate-mapping>



Child mapping (obtained through the getChangedFields() method in the example above)

Code:

<hibernate-mapping>
   <class name="cdbu.dal.ChangedField" table="CD_CHANGED_FIELDS"
      dynamic-insert="true" mutable="false">
      <cache usage="read-only" />
      <composite-id name="id" class="cdbu.dal.ChangedFieldPk">
         <key-property name="fieldName" type="string">
            <column name="CF_FIELD_NAME" length="100" />
         </key-property>
         <key-many-to-one name="changeLog" class="cdbu.dal.ChangeLog"
            lazy="proxy">
            <column name="CF_CHANGE_LOGS_FK" precision="22"
               scale="0" />
         </key-many-to-one>
         <key-property name="order" type="byte">
            <column name="CF_ORDER" precision="1" scale="0" />
         </key-property>
      </composite-id>
      <property name="oldValue" type="string">
         <column name="CF_OLD_VALUE" length="4000" />
      </property>
      <property name="newValue" type="string">
         <column name="CF_NEW_VALUE" length="4000" />
      </property>
      <property name="languageCode" type="string">
         <column name="CF_LANGUAGE_CODE" length="10" />
      </property>
   </class>
</hibernate-mapping>



Hope that's enough.

_________________
--
Best Regards
Marcin Rzeźnicki


 Post subject:
Posted: Thu Sep 28, 2006 11:56 am
Newbie

Joined: Fri Jun 16, 2006 3:41 pm
Posts: 18
It takes me 3 minutes per object, but I import 60 or so of them, so it becomes a few hours for me as well.

I went through the SQL output and noticed that I don't get any SELECT statements, only INSERT and UPDATE. UPDATE is used for one-to-many relationships mapped as Set but not for one-to-many relationships mapped as List. I'd like to get rid of those updates somehow and maybe gain some speed.

I don't use composite keys and I use MS SQL Server Express; I don't know if that makes a difference.

Obviously, turning off the SQL output reduces the time by around 40%, but I think you probably know that already.


 Post subject:
Posted: Thu Sep 28, 2006 12:17 pm
Newbie

Joined: Wed Sep 27, 2006 11:16 am
Posts: 6
Location: Poland
Budric wrote:
It takes me 3 minutes per object, but I import 60 or so of them, so it becomes a few hours for me as well.


Oh, I see. I thought it was 3 minutes per import.

Quote:
I went through the SQL output and noticed that I don't get any SELECT statements, only INSERT and UPDATE. UPDATE is used for one-to-many relationships mapped as Set but not for one-to-many relationships mapped as List. I'd like to get rid of those updates somehow and maybe gain some speed.


In my case I didn't notice any updates, although I also map the relationship as a set.

Quote:
I don't use composite keys and I use MS SQL Server Express; I don't know if that makes a difference.

Obviously, turning off the SQL output reduces the time by around 40%, but I think you probably know that already.


Yes, I know, but I think I should get rid of these selects first, if possible. They are really unnecessary in my case, because I create these objects and evict them from the cache afterwards. They are not passed around between sessions.

_________________
--
Best Regards
Marcin Rzeźnicki


 Post subject:
Posted: Thu Sep 28, 2006 5:22 pm
Beginner

Joined: Tue Sep 26, 2006 11:46 pm
Posts: 33
Budric wrote:
I went through the SQL output and noticed that I don't get any SELECT statements, only INSERT and UPDATE. UPDATE is used for one-to-many relationships mapped as Set but not for one-to-many relationships mapped as List. I'd like to get rid of those updates somehow and maybe gain some speed.


You can eliminate those updates by making the one-to-many mappings inverse and the association bidirectional, which will seriously speed up your inserts.

The example below is from the docs:
http://www.hibernate.org/hib_docs/v3/re ... irectional

Note the inverse="true" on the parent collection and the fact that the column is the same on both ends of the association.
Code:
<class name="Parent">
    <id name="id" column="parent_id"/>
    ....
    <set name="children" inverse="true">
        <key column="parent_id"/>
        <one-to-many class="Child"/>
    </set>
</class>

<class name="Child">
    <id name="id" column="child_id"/>
    ....
    <many-to-one name="parent"
        class="Parent"
        column="parent_id"
        not-null="true"/>
</class>


For this to work, you will need to make sure that you set the child's reference to its parent in the Java code before saving the object, but it will get rid of those updates.
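
As a rough illustration of that last point, here is a minimal plain-Java sketch of keeping both ends of the association in sync before saving. The Parent and Child classes are named after the mapping example above; the addChild helper is a hypothetical convenience method, not something Hibernate provides, and the Hibernate session calls are left out so the snippet stays self-contained.

```java
import java.util.HashSet;
import java.util.Set;

// Plain POJOs named after the mapping example; no Hibernate types are used here.
class Parent {
    private final Set<Child> children = new HashSet<Child>();

    Set<Child> getChildren() {
        return children;
    }

    // Hypothetical helper: keeps both ends of the association consistent.
    void addChild(Child child) {
        child.setParent(this);  // the many-to-one side, which maps the parent_id column
        children.add(child);    // the inverse="true" collection side
    }
}

class Child {
    private Parent parent;

    Parent getParent() {
        return parent;
    }

    void setParent(Parent parent) {
        this.parent = parent;
    }
}

class BidirectionalDemo {
    public static void main(String[] args) {
        Parent parent = new Parent();
        Child child = new Child();
        parent.addChild(child);
        // Both sides now agree, so the child carries its parent reference
        // by the time the objects would be handed to the session.
        System.out.println(child.getParent() == parent);
    }
}
```

With inverse="true", the foreign key is written from the child's many-to-one property, so a child whose parent reference was never set has nothing to put in parent_id. Routing every addition through one helper like this is one way to avoid that.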

It also sounds like you should be using a Stateless Session
http://www.hibernate.org/hib_docs/v3/re ... esssession


 Post subject:
Posted: Mon Oct 02, 2006 10:55 am
Newbie

Joined: Fri Jun 16, 2006 3:41 pm
Posts: 18
I didn't want to change all my Java code to accommodate bidirectional associations, but I read the batch inserts section in the chapter you linked to (and set hibernate.jdbc.batch_size to 5), and that improved performance. The call to save() is much, much faster; then, after 5 of these, the flush()/clear() calls update the database and take a while, but overall it's still better.
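
The flush-every-N pattern described here can be sketched in plain Java. This is only an illustration under assumptions: BatchTarget is a hypothetical stand-in for the three Session calls used in this thread (save, flush, clear), BatchWriter is an invented name, and the batch size of 5 is simply the value mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Session methods used in this thread.
interface BatchTarget {
    void save(Object entity);
    void flush();
    void clear();
}

class BatchWriter {
    // Saves every entity, flushing and clearing after each full batch.
    // Returns the number of flush/clear cycles that were performed.
    static int saveAll(List<?> entities, BatchTarget target, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        int cycles = 0;
        for (Object entity : entities) {
            target.save(entity);
            pending++;
            if (pending >= batchSize) {
                target.flush();  // send the queued inserts
                target.clear();  // drop saved objects from the first-level cache
                pending = 0;
                cycles++;
            }
        }
        if (pending > 0) {
            // A partial last batch still has to be written out.
            target.flush();
            target.clear();
            cycles++;
        }
        return cycles;
    }
}

class BatchWriterDemo {
    public static void main(String[] args) {
        List<String> rows = new ArrayList<String>();
        for (int i = 0; i < 12; i++) {
            rows.add("row-" + i);
        }
        BatchTarget printing = new BatchTarget() {
            public void save(Object entity) { System.out.println("save " + entity); }
            public void flush() { System.out.println("flush"); }
            public void clear() { System.out.println("clear"); }
        };
        System.out.println("cycles: " + BatchWriter.saveAll(rows, printing, 5));
    }
}
```

In a real session the commit would also flush whatever is left, so the explicit final flush mainly matters when the loop is not immediately followed by a commit.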


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.