-->

All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 6 posts ] 
Author Message
 Post subject: walking a large result set
PostPosted: Mon Sep 22, 2003 4:37 pm 
Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29
I've got 3 million records to walk through; some I will discard, others not.

I was using find() with the fetch size property enabled, but found that Hibernate seems to load all records first, and only then do I get my list. (I wish fetch size could be set on a per-query basis, but good enough for now!)

So I looked at getting ScrollableResults from the query object.

But with this interface I don't see a method to get a hydrated object, and it looks like it won't put things in the cache, etc. Am I correct?

Is there an internal API I can get to so I can hydrate from these results, or is there another alternative?

It would be very useful to have an iterator implementation in find() that "wrapped" the loop in find(), so each next() would hydrate from the result set, with caching, etc.

Bottom line is that I want to iterate through these objects as I process them, and not take up all that memory. Possible? iterate() is out because I'd be doing one call for each row.

rick


 Post subject:
PostPosted: Mon Sep 22, 2003 6:02 pm 
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
Somewhere before 3 million, I would seriously consider whether an O/R tool is the correct tool for this job. O/R tools are not really intended for mass processing of this magnitude.

But if you choose to go forward with this approach, then you pretty much have the two options you specified: load everything up front using session.find()/query.list(), or load "on demand" using session.iterate()/query.iterate(). You might also try paging with query.setFirstResult()/query.setMaxResults() and query.list(). However, the underlying JDBC driver will heavily affect what performance gain, if any, results (it may even be a bigger performance hit in some cases).
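
As a rough illustration of the setFirstResult()/setMaxResults() idea, here is a minimal sketch of an iterator that pulls one page at a time, so only one page of objects is held at once. The names PageFetcher, PagedIterator and PagedWalkDemo are made up for this example and are not part of Hibernate; the in-memory fetcher only stands in for a real query, and the sketch assumes Java 8 or later. With Hibernate, the body of fetch() would presumably call query.setFirstResult(firstResult), query.setMaxResults(maxResults) and then query.list().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical callback: returns up to maxResults rows starting at firstResult. */
interface PageFetcher<T> {
    List<T> fetch(int firstResult, int maxResults);
}

/** Walks a large result one page at a time; only the current page is referenced. */
class PagedIterator<T> implements Iterator<T> {
    private final PageFetcher<T> fetcher;
    private final int pageSize;
    private int nextOffset = 0;
    private Iterator<T> current = Collections.emptyIterator();
    private boolean exhausted = false;

    PagedIterator(PageFetcher<T> fetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one is used up.
        while (!current.hasNext() && !exhausted) {
            List<T> page = fetcher.fetch(nextOffset, pageSize);
            nextOffset += page.size();
            current = page.iterator();
            if (page.size() < pageSize) {
                exhausted = true; // a short (or empty) page means there is nothing further
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}

class PagedWalkDemo {
    /** In-memory stand-in for the table; a real fetcher would run the paged query instead. */
    static <T> PageFetcher<T> fetcherOver(final List<T> rows) {
        return (firstResult, maxResults) -> {
            int from = Math.min(firstResult, rows.size());
            int to = Math.min(from + maxResults, rows.size());
            return new ArrayList<>(rows.subList(from, to));
        };
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(i);
        }
        Iterator<Integer> it = new PagedIterator<>(fetcherOver(rows), 4);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Paging like this only gives consistent results if the query has a stable ordering (for example an order by on the primary key), and each page is a separate round trip, which is where the driver-dependent cost mentioned above comes in.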


 Post subject:
PostPosted: Mon Sep 22, 2003 7:56 pm 
Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29
I pretty much thought I just had those options; I just wanted to be sure I didn't miss anything else.

I kind of disagree with the O/R comment; maybe it's just not Hibernate's goal. It will still come down to me using objects for my data handling.

Even if Hibernate could only be used to hydrate objects from our own rowsets, that would be very handy. If I did go straight JDBC, I'd be hand-loading the same objects I always have before, and that would be a shame!

I may still take the approach of not doing all this preloading; it's just how it's been done in the existing code.

And regardless of the data size, I think it would have been better to make the API return an iterator (or have a variant that did), since this more closely resembles the streaming reads that you often have to do with DB results.


 Post subject:
PostPosted: Mon Sep 22, 2003 8:22 pm 
Beginner

Joined: Sat Aug 30, 2003 1:36 am
Posts: 47
Location: Calgary, AB
This sounds like something I brought up a while ago. You can read all about it here:

http://sourceforge.net/forum/message.php?msg_id=1985251

See the message about using evict(). Remember that SourceForge's forums are really lame, so the thread actually starts at the bottom and goes up.
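
The linked thread is not quoted here, so the following is only a guess at the pattern it describes: evict each object from the session once it has been processed, so the session's cache does not grow with the number of rows walked. ToySession below is a made-up stand-in written for this example and is not Hibernate's Session; with Hibernate the call would presumably be session.evict(object).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a session's first-level cache; NOT Hibernate's Session class. */
class ToySession {
    private final Map<Integer, String> cache = new HashMap<>();

    /** Pretends to load row `id` and remembers it, the way a session tracks loaded objects. */
    String load(int id) {
        return cache.computeIfAbsent(id, k -> "record-" + k);
    }

    /** Forgets one object so the session no longer holds a reference to it. */
    void evict(String entity) {
        cache.values().remove(entity);
    }

    int cachedCount() {
        return cache.size();
    }
}

class EvictDemo {
    /** Walks ids [0, total) and returns the largest cache size seen along the way. */
    static int walk(ToySession session, int total, boolean evictAfterUse) {
        int peak = 0;
        for (int id = 0; id < total; id++) {
            String record = session.load(id);
            peak = Math.max(peak, session.cachedCount());
            // ... process or discard the record here ...
            if (evictAfterUse) {
                session.evict(record); // keeps the cache from growing row by row
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak with evict:    " + walk(new ToySession(), 1000, true));
        System.out.println("peak without evict: " + walk(new ToySession(), 1000, false));
    }
}
```

In the toy, the cache never holds more than one record when each one is evicted after use, while without eviction it ends up holding every record that was loaded.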


 Post subject:
PostPosted: Mon Sep 22, 2003 8:47 pm 
Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX
Your initial post implies that you are worried about performance and/or memory usage. If you know of an O/R mapping tool out there that is geared towards handling that much data serially, please let me know... I'd love to have a look.


 Post subject:
PostPosted: Mon Sep 22, 2003 9:36 pm 
Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29
I rolled my own, and in the past I've used ObjectStore, but that was a long time ago. More recently VBSF, but we wanted to get away from that for a bunch of other reasons (before I came onboard).

Hibernate isn't very far off! What I think is preventing it is the assumption that the data is a list, and therefore all in memory at once. A find() API that returned an iterator would allow object-based batch processing pretty easily.

I'm a bit biased toward this type of design; I prefer pipelined data because I've worked on processing object transforms in the past.

But again, I am sort of hoping to just avoid the preload; I just wanted to implement it this way first.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.