Hi,
welcome.
Quote:
First two points are fairly achieved as soon as I integrated Hibernate Search with my project. I am anticipating a problem with 3rd point. We were planning to use shared index file on NAS file system. After some reading I have found that for clustered environment we can run into index file locking issue for synchronous updates and to avoid it we can use:
That's right, I'd strongly advise against NAS or any other networked filesystem approach for a shared index.
Quote:
1. JMS (Master/Slave)
2. JGroups (Master/Slave)
3. Database to store indexes.
Beware of a possible confusion: JMS and JGroups are alternatives for the Master/Slave connection, but with either of them you still have to share the indexes across nodes in some way. So any of these combinations works.
For the Master/Slave:
1. JMS
2. JGroups
3. Something custom
For the index Storage:
a. Using the provided Directory Providers which copy the index over the network (the rsync-like approach found in FSMasterDirectoryProvider and FSSlaveDirectoryProvider)
b. Using the Infinispan Directory provider (read more here)
c. Others (create your own copy-over index strategy - rsync is known to work well with Lucene)
You'll need either 1+a, or 1+b, or 2+a or 2+b, or roll your own solution: all involved services have hooks to plug in custom implementations.
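As a rough illustration of the "1+a" combination (JMS for the Master/Slave connection, filesystem copies for the index storage), the settings could be sketched as below. The property names are the ones I remember from the reference documentation of the 3.x line; the class name, JNDI names and paths are placeholders I made up, so please check everything against the version you deploy.

```java
import java.util.Properties;

/** Sketch only: Hibernate Search settings for the "JMS + filesystem copy" combination. */
final class SearchClusterSettings {

    /** Slave node: index changes go to the master over JMS, searches run on a local copy. */
    static Properties slave(String sharedCopyDir, String localIndexDir) {
        Properties p = new Properties();
        // 1. Master/Slave connection: forward index work to the master through a JMS queue.
        p.setProperty("hibernate.search.worker.backend", "jms");
        // Placeholder JNDI names (assumptions), adjust to your application server.
        p.setProperty("hibernate.search.worker.jms.connection_factory", "/ConnectionFactory");
        p.setProperty("hibernate.search.worker.jms.queue", "queue/hibernatesearch");
        // a. Index storage: periodically pull the master's published copy into a local directory.
        p.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSSlaveDirectoryProvider");
        p.setProperty("hibernate.search.default.sourceBase", sharedCopyDir);
        p.setProperty("hibernate.search.default.indexBase", localIndexDir);
        return p;
    }

    /** Master node: the only writer; it publishes copies of its index for the slaves. */
    static Properties master(String sharedCopyDir, String localIndexDir) {
        Properties p = new Properties();
        // No JMS backend here: the master applies the changes to its local index itself.
        p.setProperty("hibernate.search.default.directory_provider",
                "org.hibernate.search.store.FSMasterDirectoryProvider");
        p.setProperty("hibernate.search.default.sourceBase", sharedCopyDir);
        p.setProperty("hibernate.search.default.indexBase", localIndexDir);
        return p;
    }
}
```

The master additionally needs a message listener consuming the queue, which is not shown here. For "2+a" only the backend lines would change; the storage part stays the same.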
Quote:
However, we are not sure if our production environment will have WebSphere or Jboss.
I don't think using WebSphere or JBoss should prevent you from choosing any of these options; they should all work fine on either server. I've no experience with WebSphere & Search, but I'd expect both JGroups and JMS to work fine on it. If not, I'd love to hear more about it; suggestions and patches are always welcome. We want to support all application servers, it's just hard to test all of them without help.
Quote:
Also I think db index storage will make the search pretty slow.
Agreed; in fact we don't support this option unless you enable the Infinispan Directory and configure it to write through to a database. That's fine, as it will aggressively cache as much as possible, and can even write asynchronously while exposing changes in real time to other nodes. I guess this is the best solution for your requirements.
Quote:
The first 2 options also make the search index update to be asynchronous which is not desired.
While people usually use JMS in an async way, as a delay of some milliseconds is acceptable, you're not forced to do that: it should work fine in sync too, just slower.
Quote:
I think the wait time goes to 60 minutes.
No, the period at which the indexes are updated via the FSMasterDirectoryProvider (and slave) can be configured to be much shorter. Still, real-time guarantees are not possible, as copying index files over network file systems is involved: depending on your environment you should allow for a couple of minutes.
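To make the tuning concrete: as far as I remember, the copy period is the "refresh" setting of the directory provider, expressed in seconds, and its default of 3600 is where the 60 minutes come from. A small sketch, where the helper class and method names are made up for illustration:

```java
import java.time.Duration;
import java.util.Properties;

/** Sketch only: overrides the index copy period of the master/slave directory providers. */
final class IndexCopyPeriod {

    // Property name as I recall it from the reference documentation; the value is in seconds.
    static final String REFRESH_KEY = "hibernate.search.default.refresh";

    /** Returns a copy of the given settings with the copy period set to the given duration. */
    static Properties withRefresh(Properties base, Duration period) {
        Properties p = new Properties();
        p.putAll(base);
        p.setProperty(REFRESH_KEY, Long.toString(period.getSeconds()));
        return p;
    }
}
```

For example, withRefresh(settings, Duration.ofMinutes(2)) sets the value to "120". Don't go much lower than the time a full copy needs in your environment.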
Quote:
1. Is there any way other than Master/Slave configuration to get away with index file locking?
No, that's an intrinsic limitation of Lucene: only one node can write to the index at any given time. (It's also one of its strong points, as it's one of the reasons for the great search performance: it's a design tradeoff.)
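To picture why Master/Slave sidesteps the locking problem, here is a toy model in plain Java. It involves neither Lucene nor Hibernate Search, and all names are made up for illustration: slaves never touch the index, they only queue their changes, and a single master applies them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Toy model only: many nodes submit changes, exactly one writer applies them. */
final class SingleWriterModel {

    // Stands in for the JMS queue or the JGroups channel.
    private final Queue<String> pendingWork = new ConcurrentLinkedQueue<>();
    // Stands in for the index; only applyPending() ever modifies it.
    private final List<String> index = new ArrayList<>();

    /** Called on any slave: the change is queued, the index is not touched. */
    void submitFromSlave(String document) {
        pendingWork.add(document);
    }

    /** Called on the master only: drains the queue and returns how many changes were applied. */
    synchronized int applyPending() {
        int applied = 0;
        String document;
        while ((document = pendingWork.poll()) != null) {
            index.add(document);
            applied++;
        }
        return applied;
    }

    /** What a search would see right now. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(index);
    }
}
```

Until the master has run applyPending(), searches do not see the queued changes; that is the delay you trade for never fighting over the write lock.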
Quote:
2. Does master/slave configuration become a pain in long run?
As far as I know, it's not a pain in the long run, as many users and customers are happy with it. I admit it's a pain to set up initially, and it takes some time to test all the aspects involved. I hope to make it easier in the future with a stronger integration of JGroups & Infinispan; your feedback on these matters is very welcome.