 Post subject: bulk load by primary keys
PostPosted: Tue May 03, 2005 2:41 pm 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
One of our DAOs takes an array of primary key integer IDs and returns an array of objects. The actual objects being loaded are updated very rarely, so we make heavy use of the second-level cache. In most cases a good portion, if not all, of the objects will already be in the cache. Currently, using HQL, we always query the database with the full set of primary keys, even if some or all of the corresponding objects exist in the cache.

Instead, we would rather Hibernate only query the database for the objects which do not exist in the cache. In the case where all objects already exist in the cache, Hibernate could bypass the database query altogether. This would be like a session.load method which takes an array of primary keys.

Is this behavior currently possible? If not, would it be something worth adding? Thanks.
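
For example, this is roughly what our DAO does today (the entity name is just a placeholder; session and ids are whatever the DAO has in scope):

Code:
// Current approach: a single HQL query with the full set of ids, even
// when most of the entities are already in the second-level cache.
List items = session.createQuery("from Item i where i.id in (:ids)")
                    .setParameterList("ids", ids)  // the full array of primary keys
                    .list();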

-karl


 
 Post subject:
PostPosted: Wed May 04, 2005 4:04 am 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
session.load
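
That is, something like this (entity name is a placeholder); load() typically hands back an uninitialized proxy and defers the SQL until the object's state is actually needed:

Code:
Item item = (Item) session.load(Item.class, id);  // no SQL issued yet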

_________________
Emmanuel


 
 Post subject:
PostPosted: Wed May 04, 2005 7:46 am 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
I think the behavior I am looking for is a session.load which takes an array or collection of primary keys as an argument. This way Hibernate could reduce the number of objects requested by looking in the second-level cache. Then, if any keys were not found in the cache, Hibernate could make one request to the database using an in clause.
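
Something like this, say (the overload does not exist today; the signature is just what I have in mind):

Code:
// Hypothetical -- not part of the current Session API.
// Hibernate would check the second-level cache for each id and issue
// one "in" query for whatever is missing.
List items = session.load(Item.class, new Serializable[] { id1, id2, id3 });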

Hope that makes sense.

Thanks.

-karl


 
 Post subject:
PostPosted: Wed May 04, 2005 1:01 pm 
Hibernate Team

Joined: Sun Sep 14, 2003 3:54 am
Posts: 7256
Location: Paris, France
Check for batch-size in the reference guide.
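
The relevant attribute in the mapping looks something like this (entity name and value are illustrative):

Code:
<!-- batch-size on the class mapping enables batch fetching by identifier -->
<class name="Item" table="ITEM" batch-size="25">
    <id name="id" column="ITEM_ID"/>
    <!-- properties omitted -->
</class>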

_________________
Emmanuel


 
 Post subject:
PostPosted: Wed May 04, 2005 1:55 pm 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
Batch fetching would definitely be an improvement, but I still don't think we would get optimal performance on a case-by-case basis. The number of entities we load can vary greatly in each instance, while the batch-size attribute is constant for the entire application.

The following is an example of a bulk load:

Suppose there is a request by primary keys for 500 entities. First we check the cache and find that 250 of the entities are already cached. Now we make a single database request for the remaining 250 entities with an 'in' clause.

The above scenario reduces the number of database requests to one and also greatly reduces the amount of data transferred from the database. Of course, this can only work if we are requesting objects by primary keys.

Hope that makes sense.

-karl


 
 Post subject:
PostPosted: Wed May 04, 2005 2:38 pm 
Hibernate Team

Joined: Thu Dec 18, 2003 9:55 am
Posts: 1977
Location: France
Quote:
First we check the cache and find 250 of the entities

What's wrong with requesting the cache first? You can do this manually and then remove the ids found in the cache.

Then a simple HQL query with setParameterList(updatedIdList) will work.
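
Roughly like this (Item, ids, session and sessionFactory are placeholders for whatever you have in scope; SessionFactory.getCache() only exists in later Hibernate versions, so on older versions you would have to track the cached ids yourself):

Code:
// 1. Drop the ids the second-level cache already has.
List<Long> missing = new ArrayList<Long>();
for (Long id : ids) {
    if (!sessionFactory.getCache().containsEntity(Item.class, id)) {
        missing.add(id);
    }
}
// 2. One "in" query for everything the cache does not have.
if (!missing.isEmpty()) {
    session.createQuery("from Item i where i.id in (:ids)")
           .setParameterList("ids", missing)
           .list();
}
// 3. Every id now resolves from the session or the second-level cache.
List<Item> result = new ArrayList<Item>();
for (Long id : ids) {
    result.add((Item) session.get(Item.class, id));
}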

_________________
Anthony,
Get value thanks to your skills: http://www.redhat.com/certification


 
 Post subject:
PostPosted: Wed May 04, 2005 2:52 pm 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
I agree, as of today we can check the cache manually, but I am not convinced this is the best approach. One of the great features of Hibernate is that caching happens behind the scenes, without us having to code against any caching API. This is a great example of separation of concerns, and it allows developers to optimize applications by modifying caching strategies instead of changing code.

A bulk load by primary keys seems like a performance optimization that everyone could benefit from. To me it seems like something which could be included as part of the Session API.

Thanks.

-karl


 
 Post subject:
PostPosted: Wed May 04, 2005 3:25 pm 
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Emmanuel is correct.

If you use load() and batch-size, you will get this behavior automatically.
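
Roughly this pattern (entity name is illustrative, with batch-size set on the Item class mapping):

Code:
// load() creates proxies without touching the database.
List<Item> items = new ArrayList<Item>();
for (Long id : ids) {
    items.add((Item) session.load(Item.class, id));
}
// Initializing one proxy makes Hibernate fetch up to batch-size
// uninitialized proxies in a single "in" query.
// (Hibernate.initialize is org.hibernate.Hibernate.initialize)
for (Item item : items) {
    Hibernate.initialize(item);
}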


 
 Post subject:
PostPosted: Wed May 04, 2005 3:27 pm 
Hibernate Team

Joined: Tue Aug 26, 2003 12:50 pm
Posts: 5130
Location: Melbourne, Australia
Hmmm. Well, perhaps not *quite* identical behavior but close. I can make it identical with a small change, if you *really* have a case where it would make a difference.

(The change would be to check for an EntityKey in the second-level cache before adding it to the batch.)


 
 Post subject:
PostPosted: Wed May 04, 2005 4:38 pm 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
I agree that, with a slight modification, batch-size could give the desired behavior in this particular instance. The problem, however, is that the batch-size setting is made on the object mapping at the session factory level. This does not permit the flexibility of using batch fetching in some instances and normal lazy fetching in others.

In our particular example, we have a find method on a DAO which takes an array of primary keys. In this case we would like to batch load all entities which do not yet exist in the cache. In other cases, however, this same entity is referenced in many-to-one and one-to-many relationships, and there we would not want to employ batch fetching, as it could cause the retrieval of unneeded objects from the database.

I think what we really need is a way to programmatically employ batch fetching when we want it. Perhaps an overloaded session.load method taking an array of keys would be a straightforward way to accomplish these batch loads.

Thanks for your help!

-karl


 
 Post subject:
PostPosted: Thu May 05, 2005 9:59 pm 
Newbie

Joined: Thu May 05, 2005 9:50 pm
Posts: 1
We too have the same requirement as mentioned by Karl: basically, the ability to pass in a list of primary keys and get back the persistent objects. In case some of the persistent objects already exist in the cache, Hibernate should only go to the database for the ones which do not exist in the cache.

Batch fetching may not always be ideal. In the case of relationships, it could result in the retrieval of unnecessary data.


Thanks
Ravi


 
 Post subject:
PostPosted: Thu May 26, 2005 9:42 am 
Beginner

Joined: Tue Aug 17, 2004 11:27 pm
Posts: 24
This topic has been quiet for the last few weeks, but I still think this is a feature worthy of our attention.

Batch-size does not give us what we need, since it is a global setting on the session factory and we cannot use it on a case-by-case basis.

Doing direct lookups in the cache will work, but caching is a responsibility we would rather leave to Hibernate.

To me, an overloaded session.get method taking a Class and an array of Serializable would be a straightforward, self-documenting method which would solve this issue.

Any more thoughts?

Thanks.

-karl


 
 Post subject:
PostPosted: Mon Dec 19, 2005 6:01 pm 
Beginner

Joined: Wed Feb 09, 2005 3:27 pm
Posts: 29
I agree with Karl and Ravi; this would be a good addition to the API.

Suggestion:

Session.load(String entityName, Serializable[] ids, LockMode lockMode)
Session.load(String[] entityNames, Serializable[] ids, LockMode lockMode)

Any chance this could be added?

Thanks,

-Kaare


 
 Post subject:
PostPosted: Fri Nov 28, 2008 6:54 am 
Newbie

Joined: Tue Nov 18, 2008 4:12 am
Posts: 8
Location: Singapore
I vote for this feature too!

But my requirements differ from Karl's slightly. I need the entities to be cached in the 1st-level cache, not the 2nd-level cache. I guess this can be done simply by disabling 2nd-level caching for that entity class.

I have a batch processing loop that performs several loads per iteration, so performing a bulk load beforehand would mean a few database hits before the loop starts, instead of a few database hits per iteration in a million-iteration loop. I'm presently writing my own Criteria query to do this, but having a load(entity, keys) method would make this much, much easier.
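
The Criteria version I'm writing looks roughly like this (Item and the id property name are placeholders; Restrictions is org.hibernate.criterion.Restrictions):

Code:
// One query for the whole batch of keys; the results land in the
// session (first-level) cache for the processing loop to use.
List items = session.createCriteria(Item.class)
                    .add(Restrictions.in("id", keys))
                    .list();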

Also, I'd rather have the lock mode as an optional argument. Some developers (like me) won't know the best lock mode to use, and would rather have Hibernate make the decision for them (me).

_________________
Life at i-flex turns me on!


 