All times are UTC - 5 hours [ DST ]



Forum locked This topic is locked, you cannot edit posts or make further replies.  [ 2 posts ] 
Author Message
 Post subject: Updating a single value across a large number of Documents
PostPosted: Thu Mar 29, 2012 12:37 pm 
Pro

Joined: Wed Nov 05, 2003 7:22 pm
Posts: 211
The scenario is this. Imagine a Facebook-like Wall where users can post thousands of messages, and each post is shown with the user's image. The image filename is stored in a field of the Post document. Now imagine the user uploading a new photo.

The standard way to make sure that photo field gets updated is to have a OneToMany relation between the User and the Post entity, with the corresponding Post entity having a ContainedIn on its User reference.
But do you really want to instantiate thousands of objects in order to update one field? And do you want to risk Hibernate's magic deciding at some point that it needs to instantiate this collection?

The MassIndexer and the corresponding filter functionality (HSEARCH-499), hopefully part of 4.1.0.Final, offer a good alternative for this scenario. You can just run the MassIndexer against the Post index, filtered on the user. However, that will still generate quite a lot of database traffic.

So I'm wondering whether a more fine-tuned, lower-level approach is possible: just updating the single field in one fell swoop across all of the Lucene Documents that match user.id:x. In Lucene I believe this is achieved by retrieving the document, deleting the field and adding it back with the new value.

Does this mesh with Hibernate Search?

Cheers,
Marc


 Post subject: Re: Updating a single value across a large number of Documents
PostPosted: Thu Mar 29, 2012 7:21 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi Marc,
well, that would be really, really nice. But Lucene doesn't allow you to update a single field: you need the original (pre-tokenized) values for all the fields, and you have to rebuild the whole Document instance for every matching document.
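
To make the point above concrete, here is a minimal plain-Java sketch of what "updating" one field amounts to. It is a toy in-memory model, not the Lucene or Hibernate Search API: the names Post, ToyIndex, toDocument and changeUserImage are hypothetical, and lowercasing merely stands in for analysis. In Lucene itself the corresponding operation is IndexWriter.updateDocument(Term, Document), which deletes the matching documents and adds a complete replacement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class WholeDocumentUpdateSketch {

    /** Source-of-truth row standing in for the Post entity (hypothetical fields). */
    static final class Post {
        final String id, userId, text, userImage;
        Post(String id, String userId, String text, String userImage) {
            this.id = id; this.userId = userId; this.text = text; this.userImage = userImage;
        }
    }

    /** Toy index keyed by post id. Not the Lucene API. */
    static final class ToyIndex {
        private final Map<String, Map<String, String>> docsById = new HashMap<>();

        /** There is no per-field update: the old document is dropped and a full one is added. */
        void updateDocument(String id, Map<String, String> fullDocument) {
            docsById.remove(id);                           // delete the old document
            docsById.put(id, new HashMap<>(fullDocument)); // add the complete replacement
        }

        Map<String, String> get(String id) { return docsById.get(id); }
    }

    /** Builds every field from the source values, not from what is already indexed. */
    static Map<String, String> toDocument(Post p) {
        Map<String, String> doc = new HashMap<>();
        doc.put("id", p.id);
        doc.put("user.id", p.userId);
        doc.put("text", p.text.toLowerCase(Locale.ROOT)); // stand-in for analysis; the original casing is lost
        doc.put("user.image", p.userImage);
        return doc;
    }

    /** Rebuilds every document of the given user in full; returns how many were rebuilt. */
    static int changeUserImage(List<Post> posts, ToyIndex index, String userId, String newImage) {
        int rebuilt = 0;
        for (int i = 0; i < posts.size(); i++) {
            Post p = posts.get(i);
            if (!p.userId.equals(userId)) continue;
            Post updated = new Post(p.id, p.userId, p.text, newImage);
            posts.set(i, updated);
            index.updateDocument(updated.id, toDocument(updated));
            rebuilt++;
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        List<Post> posts = new ArrayList<>();
        posts.add(new Post("1", "x", "Hello Wall", "old.png"));
        posts.add(new Post("2", "y", "Another post", "y.png"));
        ToyIndex index = new ToyIndex();
        for (Post p : posts) index.updateDocument(p.id, toDocument(p));

        int rebuilt = changeUserImage(posts, index, "x", "new.png");
        System.out.println(rebuilt + " document(s) rebuilt, image is now "
                + index.get("1").get("user.image"));
    }
}
```

Note that changeUserImage needs the full Post, including its text, even though only the image changed. That is the reason the entities have to be loaded at all.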

So a very fine-grained update is not possible, but you can still get some good efficiency:

Quote:
But do you really want to instantiate thousands of objects in order to update one field? And do you want to risk Hibernate's magic deciding at some point that it needs to instantiate this collection?

Enable a second level cache. You should really have all relations lazily loaded, if possible, and with a second level cache you'll not reload the same value more than once. You'll generate some temporary objects in memory, but they are very short lived and the GC knows how to optimize that. Unfortunately Lucene requires all the objects, so that's what we feed it.
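
As a rough illustration of why a cache helps here, the following plain-Java sketch shows a read-through cache in front of an expensive loader. It is not Hibernate's second-level cache, which is enabled through configuration (for example the hibernate.cache.use_second_level_cache property plus a cache provider) and handles invalidation and concurrency. The names CountingLoader and ReadThroughCache are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class ReadThroughCacheSketch {

    /** Stands in for a database load of the user's data; counts how often it is hit. */
    static final class CountingLoader implements Function<Long, String> {
        int loads = 0;

        @Override
        public String apply(Long userId) {
            loads++;
            return "user-" + userId + ".png";
        }
    }

    /** Minimal read-through cache: the loader runs only on a cache miss. */
    static final class ReadThroughCache<K, V> {
        private final Map<K, V> entries = new HashMap<>();
        private final Function<K, V> loader;

        ReadThroughCache(Function<K, V> loader) {
            this.loader = loader;
        }

        V get(K key) {
            return entries.computeIfAbsent(key, loader);
        }
    }

    public static void main(String[] args) {
        CountingLoader loader = new CountingLoader();
        ReadThroughCache<Long, String> cache = new ReadThroughCache<>(loader);

        // Rebuilding 5000 posts of the same user asks for the same user value each time.
        for (int post = 0; post < 5000; post++) {
            cache.get(42L);
        }
        System.out.println(loader.loads); // prints 1
    }
}
```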

Quote:
The MassIndexer and the corresponding filter functionality (HSEARCH-499), hopefully part of 4.1.0.Final

Could you explain that? I'm missing something..

The Lucene developers are working on a special kind of join which would let you split a Document in two (but not more than two), so that only one of the two parts would need to be updated. When that is ready, we'll think about how to expose it in a way that makes sense without the user having to know all the details of the Lucene index format.

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.