All times are UTC - 5 hours [ DST ]



 Post subject: Distribute Hibernate Search
PostPosted: Tue Mar 23, 2010 11:27 am 
Newbie

Joined: Wed Nov 18, 2009 7:00 pm
Posts: 12
As my dataset grows beyond 1 billion documents, I need to distribute the indexing and searching of my Lucene indexes. I currently use a sharding strategy, but all the processing happens within one JVM on one machine. Is there a way to spread the shards across different JVMs on different machines?


Thanks in advance.


 Post subject: Re: Distribute Hibernate Search
PostPosted: Wed Mar 24, 2010 4:19 am 
Pro

Joined: Wed Oct 03, 2007 2:31 pm
Posts: 205
I would definitely look at Infinispan (if I had a requirement for distributed search). I downloaded it a couple of weeks ago and started playing around with it, and it is very good. A Lucene directory provider ships with the distribution, so it is worth a look. I believe an integration between Hibernate Search and Infinispan is planned (Sanne Grinovero is the best person to comment on this).

Another option is the master/slave configuration. A single master is responsible for adding documents to the index, while the slave nodes hold a read-only copy of the index and use it to perform searches. When a slave node updates an object, it places a message on a queue, and the master processes that message to index the document. Periodically the master's index changes are pushed to the slave nodes. The online documentation and Hibernate Search in Action provide more in-depth details.
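
As a rough illustration of that split, here is a minimal sketch of the settings each kind of node might carry. The property names and provider classes are the ones I recall from the Hibernate Search 3.x reference documentation, so please verify them against your version. The class name, method names, paths and JNDI names are hypothetical, and the JMS listener that the master needs in order to consume the queue is not shown.

```java
import java.util.Properties;

// Sketch only: property names are assumed from the Hibernate Search 3.x
// reference documentation; check them against the version you run.
final class MasterSlaveSettings {

    // Master node: owns the writable index and periodically copies it
    // to a shared location (sourceBase) that the slaves read from.
    static Properties master(String sharedCopyDir, String localIndexDir) {
        Properties p = new Properties();
        p.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSMasterDirectoryProvider");
        p.setProperty("hibernate.search.default.sourceBase", sharedCopyDir);
        p.setProperty("hibernate.search.default.indexBase", localIndexDir);
        // Copy interval in seconds (value chosen arbitrarily for the sketch).
        p.setProperty("hibernate.search.default.refresh", "1800");
        return p;
    }

    // Slave node: searches a local read-only copy and sends index work
    // to the master through a JMS queue instead of writing locally.
    static Properties slave(String sharedCopyDir, String localCopyDir,
                            String jmsConnectionFactory, String jmsQueue) {
        Properties p = new Properties();
        p.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSSlaveDirectoryProvider");
        p.setProperty("hibernate.search.default.sourceBase", sharedCopyDir);
        p.setProperty("hibernate.search.default.indexBase", localCopyDir);
        p.setProperty("hibernate.search.default.refresh", "1800");
        p.setProperty("hibernate.search.worker.backend", "jms");
        p.setProperty("hibernate.search.worker.jms.connection_factory", jmsConnectionFactory);
        p.setProperty("hibernate.search.worker.jms.queue", jmsQueue);
        return p;
    }
}
```

For example, `MasterSlaveSettings.slave("/mnt/share/index", "/var/lucene/copy", "/ConnectionFactory", "queue/hibernatesearch")` would give the properties to merge into a slave's Hibernate configuration. Only the slaves set the JMS backend here, since the master writes to its index directly.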

Hope that helps.


 Post subject: Re: Distribute Hibernate Search
PostPosted: Wed Mar 24, 2010 6:04 pm 
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Hi amin-mc, thanks for the nice introduction.
True, we're going to create a new module for Hibernate Search to integrate the new distributed Directory; it's scheduled for beta3 of 3.2.0, which is the next iteration.

The Infinispan Directory is meant to cope with huge indexes that need easy and reliable distribution across several nodes, and it is especially suited to interactive updates, which is what Hibernate Search typically does. However, if you have many nodes wanting to write to the index at the same time, you will still have contention issues. The best design is to combine the master/slave setup with an Infinispan Directory, so I'd suggest you set up your architecture for that design and then switch the Directory later, when your dataset demands it. Combine it all with sharding, if possible, to better distribute the load.
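
To make the sharding part concrete, here is a minimal sketch of routing a document to a shard by hashing its id. This is plain Java and not the Hibernate Search sharding API; the class and method names are hypothetical, and it assumes the number of shards stays fixed once documents have been indexed.

```java
// Sketch only: a generic id-hash router, not a Hibernate Search class.
final class ShardRouter {

    // Maps a document id to a shard index in the range [0, shardCount).
    static int shardFor(String documentId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(documentId.hashCode(), shardCount);
    }
}
```

The same id always maps to the same shard, so updates and deletes for a document reach the index that holds it. Changing the shard count changes the mapping, which would mean reindexing.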

_________________
Sanne
http://in.relation.to/


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.