All times are UTC - 5 hours [ DST ]

 Post subject: Slave/Master JMS Configuration Questions
Posted: Thu Apr 07, 2011 3:21 pm
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Hi, let me start off by saying I know the slave/master route wasn't designed for some of what follows, but I wanted to know how far we could stretch it.

Here are the facts:
    Our index size is about 1.5 - 2GB
    Each Lucene document consists of a diverse set of data, about 200+ fields/terms
    The database that the index is mapped to is updated continuously by thousands of users
    Users want near-realtime search results that reflect the DB updates
    Most changes in the database occur outside the knowledge of search (we have a workaround for this already)
    The environment is clustered (WebSphere), with multiple slave nodes reading their local copies of the index and sending updates to the JMS queue for the master to work on

Now my questions:
1) Can we set the refresh period to 60 secs? (We have yet to run a load test, which will ultimately decide, but I was just curious.) This would help us achieve near-realtime search results.
2) I know it is preferred for each slave node to have its own local copy of the index, but how problematic (if at all) is it to have all slave nodes reading from a single NFS location?
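
For reference, question 1 could be sketched roughly as below. This is only an illustration, assuming the property names from the Hibernate Search 3.x reference documentation as I remember them (please verify them against your version). The class `SlaveNodeSettings`, the paths, and the JNDI names are hypothetical placeholders, not values from our setup.

```java
import java.util.Properties;

// Sketch only: gathers the settings a slave node might use.
// SlaveNodeSettings is a hypothetical helper, not part of Hibernate Search.
final class SlaveNodeSettings {

    static Properties build(String sourceBase, String indexBase, int refreshSeconds) {
        if (refreshSeconds <= 0) {
            throw new IllegalArgumentException("refreshSeconds must be positive");
        }
        Properties p = new Properties();
        // Slave directory provider: searches a local copy of the index.
        p.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSSlaveDirectoryProvider");
        // Where the master publishes its copy (placeholder path supplied by caller).
        p.setProperty("hibernate.search.default.sourceBase", sourceBase);
        // Local directory the slave actually searches.
        p.setProperty("hibernate.search.default.indexBase", indexBase);
        // Refresh period in seconds, e.g. 60 for the question above.
        p.setProperty("hibernate.search.default.refresh", Integer.toString(refreshSeconds));
        // Index changes are not applied locally but sent to the master over JMS.
        p.setProperty("hibernate.search.worker.backend", "jms");
        // Placeholder JNDI names; the real ones depend on the application server.
        p.setProperty("hibernate.search.worker.jms.connection_factory", "/ConnectionFactory");
        p.setProperty("hibernate.search.worker.jms.queue", "queue/hibernatesearch");
        return p;
    }

    public static void main(String[] args) {
        Properties p = build("/mnt/master/lucenedirs/mastercopy", "/var/lucenedirs/local", 60);
        p.list(System.out);
    }
}
```

With this kind of setup each slave copies the index from `sourceBase` into its own `indexBase` once per refresh period and searches only the local copy, so search results can lag the master by roughly that period plus the copy time.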

As an alternative way to achieve near-realtime search, we are looking into in-memory solutions. That is still in the works.

Thanks


 Post subject: Re: Slave/Master JMS Configuration Questions
Posted: Thu Apr 07, 2011 7:05 pm
Hibernate Team

Joined: Fri Oct 05, 2007 4:47 pm
Posts: 2536
Location: Third rock from the Sun
Quote:
Each Lucene document consists of a diverse set of data, about 200+ fields/terms

Fields or terms? They are quite different.

Quote:
The database that the index is mapped to is updated continuously by thousands of users

That's normal :)

Quote:
Most changes in the database occur outside the knowledge of search (we have a workaround for this already)

Can you explain this in more detail? What do you mean by "outside the knowledge"? In any case, I'd like to know about the workarounds. Please also see the changes in 3.4.0.CR2, which provide some major performance boosts, especially around dirty-checking optimizations and collection updates, so that only the minimal set of needed updates is sent to the index.

Quote:
As an alternative to achieve near realtime search, we are looking into in memory solutions.

You can get fully real-time results using the Infinispan backend. Since your index is so small, it should also give you a nice performance boost: it usually performs better than the filesystem-based solutions, especially for frequently updated indexes (that is, when near-real-time updates are being applied to the index).

Quote:
2) I know it is preferred for each slave node to have its own local copy of the index, but how problematic (if at all) is it to have all slave nodes reading from a single NFS location?

That's extremely problematic. I've come across a couple of people who asserted that they had a new, improved NFS version which could run Lucene without problems, only to hear them complaining a month later that all their indexes had suddenly been corrupted by some bad luck, and blogging like crazy that you should never try it because of the insane optimizations being done in Lucene. You've been warned :)

_________________
Sanne
http://in.relation.to/


 Post subject: Re: Slave/Master JMS Configuration Questions
Posted: Thu Apr 07, 2011 7:45 pm
Beginner

Joined: Fri Feb 18, 2011 7:30 pm
Posts: 41
Sorry for the confusion.

200+ fields.

We have 2 applications hitting the database. One is a new application, with Hibernate Search mappings & annotations, and the other is a legacy system. The new application is read-mostly, with only a few parts performing updates to the database. When those updates are triggered from the new application, indexing is taken care of for us, since Search handles that transparently. The legacy system, on the other hand, is where 90% of the database updates are triggered. It has no knowledge of Search, but it can push a message to the master node when updates occur so that the relevant index updates can be applied. I hope that is clear.
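
To make the legacy-side flow a bit more concrete, here is a minimal sketch of the idea. It is not our actual code and uses no Hibernate Search or JMS API: the class `LegacyChangeRelay` and its methods are hypothetical, and an in-process `BlockingQueue` stands in for the real JMS queue. The legacy system reports the IDs of changed rows, and the master drains them in batches, dropping duplicates so each entity is reindexed only once per batch.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration: an in-memory queue stands in for the JMS queue.
final class LegacyChangeRelay {

    // IDs of entities the legacy system has changed but the master has not reindexed yet.
    private final BlockingQueue<Long> pending = new LinkedBlockingQueue<>();

    // Called on behalf of the legacy system whenever it updates a row.
    void notifyChanged(long entityId) {
        pending.add(entityId);
    }

    // Called on the master: takes everything queued so far and removes
    // duplicate IDs while keeping first-seen order.
    List<Long> drainDistinct() {
        List<Long> batch = new ArrayList<>();
        pending.drainTo(batch);
        return new ArrayList<>(new LinkedHashSet<>(batch));
    }

    public static void main(String[] args) {
        LegacyChangeRelay relay = new LegacyChangeRelay();
        relay.notifyChanged(7L);
        relay.notifyChanged(3L);
        relay.notifyChanged(7L); // same entity updated twice
        // The master would load and reindex each ID in this list.
        System.out.println(relay.drainDistinct());
    }
}
```

On the real master, each ID in the drained batch would be loaded from the database and handed to the indexer; that part is left out here because it depends on the entity model.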

I was actually looking at your post on 3.4.0.CR2 today, and it is definitely on my plate to try. Any performance optimization is welcome.

As I said, an in-memory cache solution is on the table as another alternative, so I'll definitely be looking into Infinispan as well.

As for the NFS share location, I had a gut feeling, based on all I have read, that Lucene doesn't play well with it, but I had to hear it from the horse's mouth :)

Now I feel better.


© Copyright 2014, Red Hat Inc. All rights reserved. JBoss and Hibernate are registered trademarks and servicemarks of Red Hat, Inc.