Hibernate Community • View topic - walking a large result set


All times are UTC - 5 hours [ DST ]

walking a large result set



rickla

Post subject: walking a large result set

Posted: Mon Sep 22, 2003 4:37 pm

Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29

I've got 3 million records to walk through, some of which I will discard and others not.

I was using find() with the fetch-size property enabled, but found that Hibernate seems to load all the records first, and only then do I get my list. (I wish the fetch size could be set on a per-query basis, but that's good enough for now!)

So I looked at getting ScrollableResults from the query object.

But with this interface, I don't see a method to get a hydrated object, and it looks like it won't put things in the cache, etc. Am I correct?

Is there an internal API I can get to so that I can hydrate objects from these results, or is there another alternative?

It would be very useful to have an iterator implementation in find() that "wrapped" the loop in find(), so that each next() would hydrate an object from the result set, with caching etc.

Bottom line: I want to iterate through these objects as I process them, without taking up all that memory. Is that possible? iterate() is out because I'd be doing one call for each row.

rick


steve


Posted: Mon Sep 22, 2003 6:02 pm

Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX

Somewhere before 3 million, I would seriously consider whether an O/R tool is the correct tool for this job. O/R tools are not really intended for mass processing of this magnitude.

But if you choose to go forward with this approach, then you pretty much have the two options you specified: load everything up front using session.find()/query.list(), or load "on demand" using session.iterate()/query.iterate(). You might also try paging with query.setFirstResult()/query.setMaxResults() and query.list(). However, the underlying JDBC driver will heavily affect what performance gain, if any, results (in some cases it may even be a bigger performance hit).
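A minimal sketch of the setFirstResult()/setMaxResults() paging idea, written in plain Java so it runs without Hibernate. `PageFetcher` and `PagedWalker` are hypothetical names introduced only for illustration; in real code the `fetch` call would presumably wrap the three Query calls named above. The sketch also assumes the query has a stable ordering, since otherwise rows can be skipped or repeated between pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for one paged query call; not a Hibernate type. */
interface PageFetcher<T> {
    // Assumption: with Hibernate this would call query.setFirstResult(offset),
    // query.setMaxResults(pageSize) and then query.list().
    List<T> fetch(int offset, int pageSize);
}

final class PagedWalker {
    private PagedWalker() {}

    /** Walks every row one page at a time and returns the number of rows handled. */
    static <T> long walk(PageFetcher<T> fetcher, int pageSize, Consumer<T> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        long handled = 0;
        int offset = 0;
        while (true) {
            List<T> page = fetcher.fetch(offset, pageSize);
            for (T row : page) {
                handler.accept(row);
                handled++;
            }
            if (page.size() < pageSize) {
                break; // a short (or empty) page means there is nothing left
            }
            offset += pageSize;
        }
        return handled;
    }
}

class PagedWalkerDemo {
    public static void main(String[] args) {
        // An in-memory list plays the role of the table.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            table.add(i);
        }
        PageFetcher<Integer> fetcher = (offset, size) -> {
            int from = Math.min(offset, table.size());
            int to = Math.min(offset + size, table.size());
            return new ArrayList<>(table.subList(from, to));
        };
        long handled = PagedWalker.walk(fetcher, 4, row -> { /* keep or discard here */ });
        System.out.println("rows walked: " + handled);
    }
}
```

Only one page is held at a time, so memory use is bounded by the page size, at the cost of one query per page. Whether that is faster than a single large list() depends on the database and driver, as noted above.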


rickla


Posted: Mon Sep 22, 2003 7:56 pm

Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29

I pretty much thought those were my only options; I just wanted to be sure I hadn't missed anything else.

I kind of disagree with the O/R comment; maybe it's just not Hibernate's goal. It will still come down to me using objects for my data handling.

Even if Hibernate could only be used to hydrate objects from our own rowsets, that would be very handy. If I went straight JDBC, I'd be hand-loading the same objects, as I've always done before, and that would be a shame!

I may still take the approach of not doing all this preloading; it's just how it's been done in the existing code.

And regardless of the data size, I think it would have been better to make the API return an iterator (or to have a variant that did), since this more closely resembles the streaming reads you often have to do with database results.
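To make the iterator idea concrete, here is a minimal sketch in plain Java. It is not how Hibernate's find() works; `RowCursor` and `HydratingIterator` are hypothetical names, with `RowCursor` shaped loosely like a forward-only JDBC result set (advance, then read). Each call to next() hydrates exactly one object from the current row, so only one object needs to be live at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** Hypothetical forward-only cursor: advance first, then read the current row. */
interface RowCursor<R> {
    boolean next();  // moves to the next row; false once the cursor is exhausted
    R current();     // raw data of the row the cursor is positioned on
}

/** Adapts a cursor to an Iterator that hydrates one object per next() call. */
final class HydratingIterator<R, T> implements Iterator<T> {
    private final RowCursor<R> cursor;
    private final Function<R, T> hydrate;
    private boolean advanced = false; // true if the cursor already sits on the pending row
    private boolean hasRow = false;   // result of the most recent cursor.next()

    HydratingIterator(RowCursor<R> cursor, Function<R, T> hydrate) {
        this.cursor = cursor;
        this.hydrate = hydrate;
    }

    @Override
    public boolean hasNext() {
        if (!advanced) {
            hasRow = cursor.next();
            advanced = true;
        }
        return hasRow;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        advanced = false; // the pending row is consumed by this call
        return hydrate.apply(cursor.current());
    }
}

class HydratingIteratorDemo {
    public static void main(String[] args) {
        final String[] rows = {"1", "2", "3"};
        // An array-backed cursor stands in for a database result set.
        RowCursor<String> cursor = new RowCursor<String>() {
            private int i = -1;
            public boolean next() { return ++i < rows.length; }
            public String current() { return rows[i]; }
        };
        Iterator<Integer> it = new HydratingIterator<>(cursor, Integer::valueOf);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The hydrate function is where an O/R layer would build the object and register it with its cache; that part is deliberately left out of the sketch.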


prophecy


Posted: Mon Sep 22, 2003 8:22 pm

Beginner

Joined: Sat Aug 30, 2003 1:36 am
Posts: 47
Location: Calgary, AB

This sounds like something I brought up a while ago; you can read all about it here:

http://sourceforge.net/forum/message.php?msg_id=1985251

See the message about using evict(). Remember that SourceForge's forums are really lame, so the thread actually starts at the bottom and goes up.
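The linked message is not reproduced here, so the following is only a guess at the general shape of the evict() approach: process one object, then evict it so the session stops holding a reference to it. The sketch is plain Java; `ToySession` and `EvictingWalker` are hypothetical, and only the `evict(Object)` method name mirrors a real Hibernate Session method.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical miniature of a session cache; it is not Hibernate's Session. */
final class ToySession {
    private final Set<Object> tracked = Collections.newSetFromMap(new IdentityHashMap<>());
    private int peak = 0;

    /** Simulates hydration: the session starts tracking the loaded object. */
    <T> T track(T entity) {
        tracked.add(entity);
        peak = Math.max(peak, tracked.size());
        return entity;
    }

    /** Stops tracking the object, as Session.evict(Object) does for a real session. */
    void evict(Object entity) {
        tracked.remove(entity);
    }

    int trackedCount() { return tracked.size(); }

    int peakTrackedCount() { return peak; }
}

final class EvictingWalker {
    private EvictingWalker() {}

    /** Handles each row, then evicts it so the session never accumulates objects. */
    static <T> void walk(ToySession session, Iterator<T> rows, Consumer<T> handler) {
        while (rows.hasNext()) {
            T entity = session.track(rows.next());
            try {
                handler.accept(entity);
            } finally {
                session.evict(entity); // release even if the handler throws
            }
        }
    }
}

class EvictingWalkerDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();
        List<String> rows = Arrays.asList("a", "b", "c");
        EvictingWalker.walk(session, rows.iterator(), row -> { /* process here */ });
        System.out.println("peak tracked objects: " + session.peakTrackedCount());
    }
}
```

The point of the pattern is that the session's footprint stays constant no matter how many rows are walked; it does nothing about the cost of fetching them.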


steve


Posted: Mon Sep 22, 2003 8:47 pm

Hibernate Team

Joined: Tue Aug 26, 2003 3:00 pm
Posts: 1816
Location: Austin, TX

Your initial post implies that you are worried about performance and/or memory usage. If you know of an O/R mapping tool out there that is geared towards handling that much data serially, please let me know... I'd love to have a look.


rickla


Posted: Mon Sep 22, 2003 9:36 pm

Beginner

Joined: Thu Sep 04, 2003 5:38 pm
Posts: 29

I rolled my own, and in the past I've used ObjectStore, but that was a long time ago. More recently we used VBSF, but we wanted to get away from that for a bunch of other reasons (before I came onboard).

Hibernate isn't very far off! What I think is preventing it is the assumption that the data is a list, and therefore all in memory at once. If the find() API returned an iterator, object-based batch processing would be pretty easy.

I'm a bit biased toward this type of design; I prefer pipelined data because I've worked on processing object transforms in the past.

But again, I'm sort of hoping to just avoid the preload altogether; I only wanted to implement it this way first.
