Quote:
Each Lucene document consists of a diverse set of data, about 200+ fields/terms
Fields or terms? Those are quite different concepts in Lucene.
Quote:
The database that the index is mapped to is updated continuously by thousands of users
That's normal :)
Quote:
Most changes in the database occur outside the knowledge of search (we have a workaround for this already)
Can you detail this better? What do you mean by "outside the knowledge"? And in any case I'd like to hear about the workarounds. Please also have a look at the changes in 3.4.0.CR2, which provide some major performance boosts, especially around dirty-checking optimizations and collection updates now triggering only the minimal set of updates to be sent to the index.
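For changes made outside Hibernate's knowledge (direct SQL, another application writing to the same database), one common workaround is to periodically re-index the affected entities yourself via the FullTextSession API. A minimal sketch, assuming Hibernate Search 3.x; the `Product` entity and the `findEntitiesChangedSince` query are illustrative placeholders, not from the original post:

```java
// Sketch: manually re-indexing entities that were modified outside of
// Hibernate's knowledge. Assumes you can detect which rows changed
// (e.g. via a last-modified timestamp column).
FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();

// Hypothetical helper: query for entities changed since the last sync.
List<Product> changed = findEntitiesChangedSince(lastSyncTimestamp);
for (Product p : changed) {
    // Rebuilds the Lucene document for this entity instance.
    fullTextSession.index(p);
}

// Index changes are flushed and applied when the transaction commits.
tx.commit();
```

For a full rebuild rather than an incremental sync, the MassIndexer introduced in recent versions is usually the better tool.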
Quote:
Most changes in the database occur outside the knowledge of search (we have a workaround for this already)
You can get fully real-time results using the Infinispan backend. Since your index is so small it will also provide a nice performance boost: it's usually better than the filesystem-based solutions, especially for frequently updated indexes (near-real-time updates being applied to the index).
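Switching to the Infinispan backend is mostly a matter of configuration. A sketch of the relevant properties, assuming a Hibernate Search version that ships the Infinispan directory provider (the file name `my-infinispan.xml` is an illustrative placeholder; check the reference documentation for your exact version):

```
# Store the index in an Infinispan data grid instead of on disk.
# "default" applies to all indexes; use the index name to override per index.
hibernate.search.default.directory_provider = infinispan

# Optional: point to a custom Infinispan configuration resource
# if the defaults don't fit your cluster setup.
hibernate.search.infinispan.configuration_resource = my-infinispan.xml
```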
Quote:
2) I know it is preferred for each slave node to have their own local copy of the index, but how problematic (if at all) is it to have all slave nodes reading from a single NFS location?
That's extremely problematic. I've found just a couple of people who asserted they had a new, improved NFS version which could run Lucene without problems, only to hear them complaining a month later that all their indexes had suddenly been corrupted by some bad luck, and blogging like crazy that you should never try it because of the aggressive optimizations done in Lucene. You've been warned :)
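If you do want to keep using a shared filesystem, the supported pattern is the master/slave directory providers: the shared location is only used as a copy target, while each node works against a local index. A configuration sketch (paths and the 1800-second refresh interval are illustrative assumptions):

```
# Master node: owns the index locally and periodically copies it
# to the shared source directory.
hibernate.search.default.directory_provider = filesystem-master
hibernate.search.default.indexBase = /var/lucene/indexes
hibernate.search.default.sourceBase = /mnt/nfs/lucene-source
hibernate.search.default.refresh = 1800

# Slave nodes: search against a local working copy, refreshed
# periodically from the shared source directory.
hibernate.search.default.directory_provider = filesystem-slave
hibernate.search.default.indexBase = /var/lucene/local-indexes
hibernate.search.default.sourceBase = /mnt/nfs/lucene-source
hibernate.search.default.refresh = 1800
```

This way Lucene never opens its working index directly over NFS, which is what causes the corruption stories above.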