 Post subject: OutOfMemoryError after 100k inserts
PostPosted: Sat Feb 11, 2006 5:29 am 
Newbie

Joined: Sat Feb 11, 2006 5:11 am
Posts: 3
Location: España
Hello, I am using Hibernate and Spring, and when I do more than 100k inserts, I get a java.lang.OutOfMemoryError.

Hibernate version: 2.1.7

My Code is:
Code:
for (int cont=0; cont<1000000; cont++){
   
   Song objSong3=new Song();               
   objSong3.setStrTitle("Song_"+cont);
         
   System.out.println("Song_"+cont);

   getHibernateTemplate().save(objSong3);
         
   if ( cont % 100 == 0 ) {
      getHibernateTemplate().flush();
      getHibernateTemplate().clear();   
   }
}


The other code that I'm testing is:
Code:
Session session = getSessionFactory().openSession();
for (int cont=0; cont<1000000; cont++){
   
   Song objSong3=new Song();               
   objSong3.setStrTitle("Song_"+cont);
            
   System.out.println("Song_"+cont);
   session.save(objSong3);
         
   if ( cont % 100 == 0 ) {
      session.flush();
      session.clear();   
   }
}


The only code that runs more or less well is:
Code:
Session session = getSessionFactory().openSession();
for (int cont=0; cont<1000000; cont++){
   
   Song objSong3=new Song();               
   objSong3.setStrTitle("Song_"+cont);
         
   System.out.println("Song_"+cont);

   session.save(objSong3);
            
   if ( cont % 100 == 0 ) {
      session.flush();
      session.clear();
      session.close();
      session = getSessionFactory().openSession();
   }
}



But I don't like this one very much, because I do not want to open and close sessions so many times. And I prefer this one: getHibernateTemplate().


Mapping documents:
hibernate-mappings.hbm.xml
Code:
<class name="com.works.model.Work" table="WORK" dynamic-update="false" dynamic-insert="false" mutable="true" polymorphism="implicit" batch-size="1" select-before-update="false" optimistic-lock="version">
    <id name="IIdWork" column="idWORK" type="java.lang.Integer" unsaved-value="null">
      <generator class="native"/>
    </id>
    <property name="strTitle" type="java.lang.String" update="true" insert="true" column="TITLE" not-null="true" unique="false"/>
    <property name="dtDate" type="java.util.Date" update="true" insert="true" column="DATE" not-null="false" unique="false"/>
    <property name="strKind" type="java.lang.String" update="true" insert="true" column="KIND" not-null="true" unique="false"/>
     <joined-subclass name="com.works.model.Song" dynamic-update="false" dynamic-insert="false">
      <key column="idWORK"/>
      <property name="strGenre" type="java.lang.String" update="true" insert="true" column="GENRE" not-null="false" unique="false"/>
      <property name="strTime" type="java.lang.String" update="true" insert="true" column="TIME" not-null="false" unique="false"/>
    </joined-subclass>
  </class>



applicationContext.xml
Code:
<bean id="songdao" class="com.works.dao.hibernate.SongDAOHibernate">
      <property name="sessionFactory"><ref bean="sessionFactory"/></property>
   </bean>

   <bean id="SongDAO" class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
      <property name="transactionManager"><ref bean="transactionManager"/></property>
      <property name="target"><ref local="songdao"/></property>
      <property name="transactionAttributes">
         <props>
            <prop key="*">PROPAGATION_REQUIRED</prop>
         </props>
      </property>
   </bean> 



Name and version of the database you are using:
MySQL - version 4.1

Thanks for all.


 Post subject:
PostPosted: Sat Feb 11, 2006 5:44 am 
Newbie

Joined: Fri Feb 10, 2006 5:48 am
Posts: 7
Can you try doing the same using a transaction, like the following code snippet:

Session session = getSessionFactory().openSession();
Transaction trans = session.beginTransaction();
for (int cont = 0; cont < 1000000; cont++) {

   Song objSong3 = new Song();
   objSong3.setStrTitle("Song_" + cont);

   System.out.println("Song_" + cont);
   session.save(objSong3);

}
trans.commit();
session.close();


 Post subject:
PostPosted: Sat Feb 11, 2006 6:09 am 
Newbie

Joined: Sat Feb 11, 2006 5:11 am
Posts: 3
Location: España
The code that you have posted is the other test that I have tried, but the result is the same.

Thank you.


 Post subject:
PostPosted: Sat Feb 11, 2006 3:18 pm 
Regular

Joined: Sun May 08, 2005 2:48 am
Posts: 118
Location: United Kingdom
I think this is the level 1 cache causing memory exhaustion, with the objects still tied to the open transaction because they are dirty...

The flush() and clear() may not even be needed.

Try:

Code:
Transaction tx = session.beginTransaction();
for (int cont = 0; cont < 1000000; cont++) {
   session.save(...);
   if ((cont % 100) == 0) {
      session.flush();
      tx.commit();
      session.clear();
      tx = session.beginTransaction();   // begin a fresh transaction for the next batch
   }
}
tx.commit();


 Post subject:
PostPosted: Sun Feb 12, 2006 5:46 am 
Newbie

Joined: Sat Feb 11, 2006 5:11 am
Posts: 3
Location: España
Thanks dlmiles.

I tried your code, and it worked fine. But one question: which option is better, working with the Session or with getHibernateTemplate(), and why?

My whole application uses getHibernateTemplate() because it is easier, and I would like it to use the best option. Could you please tell me which is the best?

Thanks for all.


 Post subject:
PostPosted: Sun Feb 12, 2006 3:40 pm 
Regular

Joined: Sun May 08, 2005 2:48 am
Posts: 118
Location: United Kingdom
JJavier wrote:
I tried your code, and it worked fine. But one question: which option is better, working with the Session or with getHibernateTemplate(), and why?

My whole application uses getHibernateTemplate() because it is easier, and I would like it to use the best option. Could you please tell me which is the best?


I don't understand; I'm no Spring expert, but isn't getHibernateTemplate() an access method provided by Spring to get access to your Session object from the context you are in?

So isn't that a bit like asking me: working with Car or with Porsche 911, and why? They are both essentially Cars.


The problem here may be that you need to understand how Spring controls the transaction state behind getHibernateTemplate(). Maybe you can:

Code:
Session session = getHibernateTemplate().getSession();
Transaction tx = session.getTransaction();  // Needs Hibernate 3.0.0 or above
...
for (int i = 0; i < 100000; i++) {
   ...
}
tx.commit();
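
Alternatively, if your Spring version has HibernateCallback, you might be able to keep the getHibernateTemplate() style and still get at the Session for flushing and clearing. This is only a sketch (it assumes Spring's HibernateCallback for Hibernate 2.x), not something I have tested:

Code:
// Sketch only, assuming Spring's org.springframework.orm.hibernate.HibernateCallback
// (the Hibernate 2.x flavour): the callback exposes the Session the template uses.
getHibernateTemplate().execute(new HibernateCallback() {
   public Object doInHibernate(Session session) throws HibernateException, SQLException {
      for (int cont = 0; cont < 1000000; cont++) {
         Song objSong3 = new Song();
         objSong3.setStrTitle("Song_" + cont);
         session.save(objSong3);

         if (cont % 100 == 0) {
            session.flush();   // push this batch of INSERTs to the database
            session.clear();   // evict the batch from the 1st level cache
         }
      }
      return null;
   }
});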


There may be some gotchas you need to check, for example the option:

hibernate.transaction.auto_close_session=false

I suspect Spring already does the above, as I think you would otherwise have seen an Exception when you tried my suggestion the first time.

If you are using Hibernate 3.1.2, I think there is a StatelessSession object that might be useful for this type of bulk insert. Maybe this link is of some help: http://www.hibernate.org/hib_docs/v3/re ... batch.html
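
A rough sketch of what that could look like (assuming Hibernate 3.1+ and a Song class like the one above; untested):

Code:
// Rough sketch, assuming Hibernate 3.1+: a StatelessSession has no 1st level
// cache, so nothing accumulates in memory between inserts.
StatelessSession ss = getSessionFactory().openStatelessSession();
Transaction tx = ss.beginTransaction();
for (int cont = 0; cont < 1000000; cont++) {
   Song objSong3 = new Song();
   objSong3.setStrTitle("Song_" + cont);
   ss.insert(objSong3);   // issues the INSERT immediately, no dirty checking
}
tx.commit();
ss.close();

Note that a StatelessSession bypasses cascades, interceptors and the event listeners, so it really only suits this kind of raw bulk work.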


If my reply is useful, "don't forget to rate" otherwise my karma gets upset. Peace man!


 Post subject:
PostPosted: Sun Feb 12, 2006 8:09 pm 
Beginner

Joined: Thu Nov 03, 2005 4:11 pm
Posts: 25
I am having a similar problem. I have several long-running processes that perform many queries but no or very few inserts/updates/deletes. If I profile the process it quickly slows down. Each iteration gets slower and slower. This process will take 40 minutes to run; more often than not I get an out of memory error. Now if I commit the transaction after each iteration it takes 2 minutes to run.

It seems like the 1st level cache gets jammed up fairly easily. Does anyone have any suggestions for handling this, or an explanation of why it occurs?


 Post subject:
PostPosted: Mon Feb 13, 2006 1:43 am 
Regular

Joined: Sun May 08, 2005 2:48 am
Posts: 118
Location: United Kingdom
bluesky wrote:
If I profile the process it quickly slows down. Each iteration gets slower and slower. This process will take 40 minutes to run; more often than not I get an out of memory error. Now if I commit the transaction after each iteration it takes 2 minutes to run.

It seems like the 1st level cache gets jammed up fairly easily. Does anyone have any suggestions for handling this, or an explanation of why it occurs?


I'm sure that sort of performance change would be of interest to the development team.

If your whole application works under a single huge transaction then that is a lot of data to keep track of; maybe your application doesn't need such a huge batch of SQL?

First, is your performance change due to OS swapping? Are you using a lot of RAM, with a footprint bigger than what your machine has installed?


 Post subject:
PostPosted: Mon Feb 13, 2006 12:50 pm 
Beginner

Joined: Thu Nov 03, 2005 4:11 pm
Posts: 25
dlmiles wrote:
I'm sure that sort of performance change would be of interest to the development team.

If your whole application works under a single huge transaction then that is a lot of data to keep track of; maybe your application doesn't need such a huge batch of SQL?

First, is your performance change due to OS swapping? Are you using a lot of RAM, with a footprint bigger than what your machine has installed?


There is an accounting side to my software and I have a couple of audit processes which will go through all the open accounts. These audit processes run manually, when the user wants to run them. It looks through many layers of objects for each account. Throughout this process it is composing a result set of problem data.

I played with this some more and found I get the same speed increase just by calling session.clear() and session.flush() after each account is audited. So the only problem with breaking it up into lots of new sessions is that I believe I am detaching all my objects.
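
In rough pseudo-form, what I am doing now is something like this (the names here are only illustrative, not my real code):

Code:
// Illustrative only: flush and clear after each audited account so the
// 1st level cache never holds more than one account's object graph.
for (Iterator it = accounts.iterator(); it.hasNext(); ) {
   Account account = (Account) it.next();
   auditAccount(account);   // mostly reads; composes the "problem data" results
   session.flush();         // write out anything that did change for this account
   session.clear();         // then drop this account's graph from the session
}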

I find it odd that this is necessary though. It doesn't really seem like that much data to me. Is there a way I could measure exactly how many objects are open in a session at any given time?


 Post subject:
PostPosted: Mon Feb 13, 2006 4:06 pm 
Regular

Joined: Sun May 08, 2005 2:48 am
Posts: 118
Location: United Kingdom
bluesky wrote:
It looks through many layers of objects for each account. Throughout this process it is composing a result set of problem data.

I played with this some more and found I get the same speed increase just by calling session.clear() and session.flush() after each account is audited. So the only problem with breaking it up into lots of new sessions is that I believe I am detaching all my objects.

I find it odd that this is necessary though. It doesn't really seem like that much data to me. Is there a way I could measure exactly how many objects are open in a session at any given time?


First, I don't think you are "detaching" your objects, because your Session is never closed on them and you never use those objects with a different Session instance.

Second, session.clear() will drop all 1st level cached objects that are not dirty. So anything you touched read-only can be dropped, since lazy loading can also go and get it again (from SQL). But proxied objects will still work (providing the Session stays open).

Third, maybe you want to consider how you get your 1,000,000 account record list. I don't think you documented exactly how you did it. Generally speaking, if you use Query.list() (or a similar HQL interface), all 1 million objects might be loaded from the SQL server into RAM before you get to work on your first object. This does not scale. However, if you use an integral ID number and blindly start at 1 and issue a Session.get(Account.class, Integer.valueOf(thisId)), then you may not be subject to this bottleneck. Maybe you can chunk your HQL.
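
Just to illustrate the chunking idea (the HQL and chunk size here are made up, not taken from your application):

Code:
// Illustration only: page through accounts in fixed-size chunks instead of
// loading the whole Query.list() result into RAM in one go.
int chunkSize = 100;
int offset = 0;
List accounts;
do {
   accounts = session.createQuery("from Account")   // assumed entity name
                     .setFirstResult(offset)
                     .setMaxResults(chunkSize)
                     .list();
   for (Iterator it = accounts.iterator(); it.hasNext(); ) {
      Account account = (Account) it.next();
      // ... inspect the account, record its id if it is a problem account ...
   }
   session.clear();       // drop this chunk from the 1st level cache
   offset += chunkSize;
} while (!accounts.isEmpty());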


It sounds like there is a big difference between the data set size for:

* The object count touched to work out whether we are interested in this account for adding to the result set.
* Object count necessary to remember all those accounts we did find interesting.


I originally thought you were modifying a large amount of data, and I suspect you are modifying something on the way, otherwise I think tx.commit() would be unnecessary and Session.flush() and Session.clear() would have done the trick.


Different people would have different opinions on whether a possibly 1-million-record result set is too big to work from RAM. Maybe if you have 4 GB of RAM and the only purpose of the application server is to service that one user, then you might not consider your method outrageous.

Someone else might take the tack that if we only want to know the problem account numbers (or some other such reference), then in effect every account is inspected to find out whether we are interested, and we simply record the accountId number to form our minimal result set (during a 1st pass).

As you can guess, this is a lighter-weight result set which you can (if you really needed to) offload into a file and do away with any real RAM overhead (for really huge batch jobs).


I generally find enterprise computing to be a trade-off between resources, and it sounds like you might want to stand back a moment and decide what your environment parameters are. For me this might mean that any one of 30 users might run this process at any time of the working day, and two people might run it at the same time, so I might put a 64 MB limit on that task and then work out my algorithm with that in the specification, since I know there are up to 90 users on the same box with maybe only 10 active at once. These metrics directly impact my choice of algorithm for that task.

I rarely find any "batch" job can be programmed in a simplistic / naive way, as the metrics don't scale well or allow fair use with other parts of the system that share those resources at the same time.

HTH YMMV


 Post subject:
PostPosted: Mon Feb 13, 2006 6:31 pm 
Beginner

Joined: Thu Nov 03, 2005 4:11 pm
Posts: 25
Thanks for your thoughts.


Your question of object count touched vs object count necessary is a good one. It seems like this might be an issue.

I do retrieve my account list from a very constrained query that essentially looks for accounts with open invoices. The test runs I was working on only used 300 accounts, which seems trivial to me. But each account is attached to a wide range of data that I must sift through to determine the audit procedure. Still, I would think that it isn't anywhere close to a million rows of data; I don't even think it is 100,000. I guess this is the place where it is questionable whether I am touching a lot of unnecessary data. I tried to use session.getStatistics() to measure this, but I keep getting an Unsupported exception.
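
For reference, the kind of call I was attempting looks roughly like this (assuming a plain Hibernate 3.x Session, where getStatistics() returns a SessionStatistics):

Code:
// Sketch, assuming a plain Hibernate 3.x org.hibernate.Session: SessionStatistics
// reports how many entities and collections the 1st level cache currently holds.
SessionStatistics stats = session.getStatistics();
System.out.println("entities in session:    " + stats.getEntityCount());
System.out.println("collections in session: " + stats.getCollectionCount());
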
Are there settings I might have set incorrectly that are causing more of the object graph to be retrieved than necessary? I think I have everything set to lazy loading, but this just seems like the place to look.

It doesn't seem like that much data to me. I am using Hibernate with Spring and was trying to fix this declaratively, to keep any yucky performance-related code out of my clean code base. So that is why I first tried closing transactions after each account was audited. I then went with the session.flush()/clear() method, which does do the trick.

Anyway, this seems to be an anomaly, because the application I am developing is pretty big and it is only on these two processes that I am experiencing the problem. I just don't know where to look to find the heart of the problem at this point.

